jsonlkit.com
JSONL (JSON Lines) utilities, in the browser

JSONL Deduplicator

updated 1 May 2026

100% client-side. No upload.

Strip duplicate records out of a JSONL file — by full line, by canonical object (key order independent), or by a specific key path like id or user.email.

Three ways to match

Most "duplicates" in a JSONL file are not actually byte-identical lines. The same record exported twice may have keys in a different order, or one source might add a timestamp the other doesn't have. Pick the matching strategy that fits the cleanup you actually need.

Full line

Compares lines as raw strings. The fastest option, but it will treat {"a":1,"b":2} and {"b":2,"a":1} as different. Use this when you trust the source to emit records consistently — typically logs from a single producer.
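A minimal sketch of full-line matching, assuming keep-first behavior and that only the trailing line ending is ignored. The function name `dedupe_full_line` is hypothetical, and this is not the tool's actual implementation.

```python
def dedupe_full_line(lines):
    """Keep the first occurrence of each line, compared as a raw string."""
    seen = set()
    kept = []
    for line in lines:
        # Assumption: the trailing newline is not part of the comparison.
        line = line.rstrip("\r\n")
        if line not in seen:
            seen.add(line)
            kept.append(line)
    return kept


lines = ['{"a":1,"b":2}', '{"b":2,"a":1}', '{"a":1,"b":2}']
result = dedupe_full_line(lines)
print(result)
```

Only the third line is dropped. The second line carries the same data as the first, but its bytes differ, so raw string comparison keeps it.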

Canonical object

Parses each line as JSON, sorts keys recursively, and compares the canonical form. Two records with the same data are treated as equal regardless of how the writer ordered keys. Slower than line compare, but it catches the "I joined two exports" class of duplicate.
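One way to build such a canonical form, sketched with Python's standard `json` module: `json.dumps(..., sort_keys=True)` sorts keys at every nesting level and leaves array order alone. The helper names `canonical` and `dedupe_canonical` are hypothetical, and the tool itself may canonicalize differently.

```python
import json


def canonical(line):
    """Parse one JSON line and re-serialize it with keys sorted at every level."""
    return json.dumps(json.loads(line), sort_keys=True, separators=(",", ":"))


def dedupe_canonical(lines):
    """Keep the first line for each distinct canonical form."""
    seen = set()
    kept = []
    for line in lines:
        sig = canonical(line)
        if sig not in seen:
            seen.add(sig)
            kept.append(line)  # emit the original line, not the canonical form
    return kept


print(canonical('{"b":{"d":1,"c":2},"a":[2,1]}'))
print(dedupe_canonical(['{"a":1,"b":2}', '{"b":2, "a":1}']))
```

Because each line is parsed and re-serialized, whitespace differences between writers are also ignored in this sketch.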

Key path

Compares only the value at a specific path. Use id for top-level keys, or dotted paths like user.email or meta.request_id for nested fields. Records where the path is missing are passed through untouched (treated as not participating in dedup) so you don't accidentally collapse them all into one row.

Keep first vs. keep last

Keep first walks the file top-to-bottom and discards any record whose signature has already been seen. Use this when older records are the source of truth.

Keep last retains the most recent occurrence of each signature. Use this for upserts — when a later record represents an update to an earlier one. Output order follows the position of the kept record.
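The difference between the two modes can be sketched as a single pass that records, per signature, the index of the occurrence to keep. The `dedupe` function and its `signature` callback are hypothetical names, and `records` is assumed to be an in-memory list.

```python
def dedupe(records, signature, keep="first"):
    """Return records with one occurrence kept per signature, in input order."""
    if keep not in ("first", "last"):
        raise ValueError("keep must be 'first' or 'last'")
    chosen = {}  # signature -> index of the occurrence to keep
    for index, record in enumerate(records):
        sig = signature(record)
        # keep="first": never overwrite; keep="last": always overwrite.
        if keep == "last" or sig not in chosen:
            chosen[sig] = index
    winners = set(chosen.values())
    return [r for i, r in enumerate(records) if i in winners]


records = [
    {"id": 1, "v": "old"},
    {"id": 2, "v": "only"},
    {"id": 1, "v": "new"},
]
first = dedupe(records, lambda r: r["id"], keep="first")
last = dedupe(records, lambda r: r["id"], keep="last")
print(first)
print(last)
```

With keep last, the surviving `id` 1 record appears after `id` 2, because output order follows the position of the kept record.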

Tips & common pitfalls

Example

Input: the record with id 1 appears twice, with its keys in a different order:

{"id":1,"name":"alice","ts":1000}
{"id":2,"name":"bob","ts":1001}
{"name":"alice","ts":1000,"id":1}
{"id":3,"name":"carol","ts":1002}

Match by Canonical object, keep first → 3 records. Match by Full line → all 4 kept (key order differs). Match by Key path id → 3 records.
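The counts above can be checked with a short script. This is only an illustration of the three strategies as keep-first signature functions, with hypothetical names; it is not the tool's code.

```python
import json

INPUT = [
    '{"id":1,"name":"alice","ts":1000}',
    '{"id":2,"name":"bob","ts":1001}',
    '{"name":"alice","ts":1000,"id":1}',
    '{"id":3,"name":"carol","ts":1002}',
]


def keep_first(lines, signature):
    """Keep the first line for each distinct signature."""
    seen, kept = set(), []
    for line in lines:
        sig = signature(line)
        if sig not in seen:
            seen.add(sig)
            kept.append(line)
    return kept


full_line = keep_first(INPUT, lambda line: line)
canonical = keep_first(INPUT, lambda line: json.dumps(json.loads(line), sort_keys=True))
by_id = keep_first(INPUT, lambda line: json.loads(line)["id"])

print(len(full_line), len(canonical), len(by_id))  # prints: 4 3 3
```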

Frequently asked questions

How big a file can it handle?

Limited only by browser memory. Tens of millions of short lines work in modern browsers; if your file runs to gigabytes, deduplicate with a CLI instead: sort -u for byte-identical lines (note that it also reorders the output), or jq piped through awk for key-based matching.
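For files too large for the browser, a streaming script is another option alongside the CLI tools mentioned above. The sketch below is an assumption-laden illustration: it keeps the first occurrence of each byte-identical line, preserves input order, and stores a 16-byte BLAKE2b digest per distinct line instead of the line itself to reduce memory. The function name `dedupe_stream` is hypothetical, and relying on digests accepts a negligible but nonzero chance of hash collision.

```python
import hashlib
import io


def dedupe_stream(src, dst):
    """Copy lines from src to dst, dropping byte-identical repeats (keep first)."""
    seen = set()
    kept = 0
    for line in src:  # binary file objects yield one line at a time
        body = line.rstrip(b"\r\n")
        # Assumption: a 128-bit digest stands in for the full line.
        digest = hashlib.blake2b(body, digest_size=16).digest()
        if digest not in seen:
            seen.add(digest)
            dst.write(body + b"\n")
            kept += 1
    return kept


# In-memory demo; with real files, open them in "rb" / "wb" mode instead.
src = io.BytesIO(b'{"a":1}\n{"b":2}\n{"a":1}')
dst = io.BytesIO()
kept = dedupe_stream(src, dst)
print(kept, dst.getvalue())
```

Memory use still grows with the number of distinct lines, so this raises the practical ceiling rather than removing it.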

Does ordering matter for canonical compare?

No — that's the whole point of canonical mode. Object keys are sorted lexically before comparison; arrays preserve their order (because order in arrays is semantically meaningful in JSON).

Can I dedupe on multiple keys?

Not directly. As a workaround, convert with JSONL → CSV, projecting just the keys you care about, and deduplicate the resulting CSV in a spreadsheet. Alternatively, strip the irrelevant fields first and then use canonical mode.
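If scripting is an option, the "strip irrelevant fields, then compare canonically" idea can be sketched as follows. The function name `dedupe_on_keys` is hypothetical; the sketch assumes top-level keys only and treats a missing key the same as an explicit null.

```python
import json


def dedupe_on_keys(lines, keys):
    """Keep the first line for each distinct combination of the given keys."""
    seen = set()
    kept = []
    for line in lines:
        record = json.loads(line)
        # Project onto the keys of interest; missing keys become None (null).
        projection = {key: record.get(key) for key in keys}
        sig = json.dumps(projection, sort_keys=True)
        if sig not in seen:
            seen.add(sig)
            kept.append(line)
    return kept


lines = [
    '{"tenant":"a","id":1,"ts":10}',
    '{"tenant":"b","id":1,"ts":11}',
    '{"id":1,"tenant":"a","ts":12}',
]
result = dedupe_on_keys(lines, ["tenant", "id"])
print(result)
```

The third line repeats the `tenant` and `id` pair of the first, so it is dropped even though its `ts` and key order differ.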

Is my data sent to a server?

Never. Everything runs in your browser. See the privacy policy for details.
