jsonlkit.com
JSONL (JSON Lines) utilities, in the browser

JSONL Specification

Formal rules · edge cases · parser differences · updated 21 May 2026

JSONL is not standardized by an RFC or ISO, but it has a de-facto specification with strong consensus across implementations. This page documents the rules in formal terms, calls out the spots where parsers diverge, and explains what conformant producers and consumers should do.

The five formal rules

  1. The file is a sequence of records, separated by line breaks.
  2. Each record is exactly one valid JSON value (RFC 8259), with no embedded unescaped newlines.
  3. The line break is \n (U+000A LINE FEED). Implementations should also accept \r\n (CRLF) on input.
  4. The file's text encoding is UTF-8 without a byte-order mark (BOM).
  5. A trailing newline after the last record is recommended but not required.
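A producer that follows the five rules can be sketched in a few lines of Python. The function name dumps_jsonl is hypothetical, and the choice to reject NaN and Infinity is an assumption made here because RFC 8259 does not define them.

```python
import json
from typing import Iterable


def dumps_jsonl(records: Iterable[object]) -> bytes:
    """Serialize records as JSONL bytes: UTF-8, no BOM, LF-terminated lines."""
    lines = []
    for record in records:
        # Without indent=, json.dumps emits no raw newline (rule 2).
        # allow_nan=False rejects NaN/Infinity, which RFC 8259 does not define.
        lines.append(json.dumps(record, ensure_ascii=False, allow_nan=False))
    # Every record, including the last, ends with LF (rules 3 and 5).
    # str.encode("utf-8") never writes a BOM (rule 4).
    return "".join(line + "\n" for line in lines).encode("utf-8")
```

An empty iterable yields zero bytes, which is still a valid JSONL file.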

Grammar

In ABNF, with rules borrowed from RFC 8259 (JSON):

jsonl-file   = *(record LF) [record [LF]]
record       = json-value      ; one RFC 8259 value, encoded on a single line
LF           = %x0A            ; the line feed character

json-value   = false / null / true / object / array / number / string
;             (full definition from RFC 8259 §3)

The grammar deliberately allows zero records (an empty file is valid JSONL) and allows the file to end without a trailing newline.
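As a rough illustration, the grammar maps onto a strict parser like the one below. This is a sketch that follows the ABNF literally (LF only, no blank lines); parse_jsonl is a hypothetical name, and real consumers are usually more tolerant.

```python
import json


def parse_jsonl(text: str) -> list:
    """Parse JSONL text strictly: records separated by LF, nothing else."""
    if text == "":
        return []  # zero records: an empty file is valid
    # One trailing LF terminates the last record; it does not start a new one.
    if text.endswith("\n"):
        text = text[:-1]
    # Any remaining empty line is not a record, so json.loads raises on it.
    return [json.loads(line) for line in text.split("\n")]
```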

What "valid JSON per line" actually means

Each line, in isolation, must parse with JSON.parse (JavaScript), json.loads (Python), or any other RFC 8259-conforming parser. That means:

Encoding

JSONL is UTF-8. The historical JSON spec allowed UTF-16 and UTF-32 too, but every modern producer and consumer uses UTF-8. Tools that emit BOM (EF BB BF at the start of the file) cause more problems than they solve — many parsers treat the BOM as part of the first record's first character. Do not emit a BOM. Consumers should be tolerant: strip a leading BOM silently if present.
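A tolerant consumer can drop the BOM before decoding. The helper below is a minimal sketch (strip_bom is a hypothetical name) that works on raw bytes.

```python
import codecs


def strip_bom(data: bytes) -> bytes:
    """Drop a leading UTF-8 BOM (EF BB BF) if present; otherwise return data unchanged."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data
```

When reading from a file in Python, opening it with encoding="utf-8-sig" has the same effect: the codec skips a leading BOM if there is one and otherwise behaves like plain UTF-8.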

Non-ASCII characters are valid two ways: as raw UTF-8 (for example "é") or as a \uXXXX escape (for example "\u00e9").

Both are equivalent. Most producers emit raw UTF-8 for readability and only escape control characters and quotes; ASCII-only output is achievable by escaping every non-ASCII codepoint.
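In Python, the ensure_ascii flag of json.dumps switches between the two forms. The record below is made-up sample data.

```python
import json

record = {"name": "café"}

raw = json.dumps(record, ensure_ascii=False)     # keeps é as a raw character
escaped = json.dumps(record, ensure_ascii=True)  # writes é as \u00e9

# Both encodings decode to the same value.
assert json.loads(raw) == json.loads(escaped) == record
```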

Line endings

Sequence | Spec status | What conformant parsers do
\n (LF, 0x0A) | Canonical | Always accept
\r\n (CRLF, 0x0D 0x0A) | Tolerated on input | Accept, normalize to \n on output
\r alone (CR, 0x0D) | Discouraged | Some parsers split on it (Python's splitlines), most don't. Do not produce.
U+2028 / U+2029 (line/paragraph separators) | Not a record boundary | Do not split on them. Inside a JSON string they are legal unescaped (RFC 8259 §7 requires escaping only quotes, backslashes, and U+0000 to U+001F), but escaping them as \u2028 / \u2029 is safest.

Producers should always emit \n. Consumers should accept \n and \r\n. The auto-fixer normalizes line endings to LF.
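One way to apply these rules when splitting text into records is sketched below. split_records is a hypothetical helper; it splits on LF only and drops the CR of a CRLF pair, which avoids the extra boundaries that str.splitlines would introduce.

```python
def split_records(text: str) -> list[str]:
    """Split JSONL text on LF only, tolerating CRLF on input."""
    lines = text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # the piece after a trailing LF is not a record
    # A CR directly before the LF belongs to the line ending, not the record.
    return [line[:-1] if line.endswith("\r") else line for line in lines]
```

Unlike str.splitlines, this leaves a lone \r, U+2028, and U+2029 inside the record where they were found.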

Whitespace and blank lines

The spec is silent on blank lines, but in practice:

To stay maximally compatible: produce no blank lines, and consume tolerantly (skip whitespace-only lines).
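On the consumer side, tolerance for blank lines is a one-line check. A minimal sketch, with iter_records as a hypothetical name:

```python
import json
from typing import Iterable, Iterator


def iter_records(lines: Iterable[str]) -> Iterator[object]:
    """Yield one parsed value per line, skipping empty and whitespace-only lines."""
    for line in lines:
        if not line.strip():
            continue  # blank line: tolerated on input, never produced
        yield json.loads(line)
```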

Record content rules

Any RFC 8259 value is allowed per line. In practice:

The maximum size of a single record is bounded by your parser. Most modern parsers handle multi-megabyte records, but a very large nested object is unwieldy as a single line. Consider splitting it into multiple smaller records keyed by parent ID instead.
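Splitting by parent ID might look like the sketch below. The record shape (an order with an "items" array) and the field names "type" and "order_id" are invented for illustration; any stable parent key works.

```python
import json


def split_order(order: dict) -> list[str]:
    """Turn one nested order into a parent line plus one line per child item."""
    # The parent record keeps every field except the large nested array.
    parent = {k: v for k, v in order.items() if k != "items"}
    lines = [json.dumps({"type": "order", **parent})]
    for item in order["items"]:
        # Each child carries the parent's ID so consumers can re-join them.
        lines.append(json.dumps({"type": "item", "order_id": order["id"], **item}))
    return lines
```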

MIME type

There is no IANA-registered MIME type for JSONL. The de-facto choices, in order of recognition:

  1. application/x-ndjson — most widely recognized by HTTP clients (curl, httpie, Postman), libraries, and CDNs. Recommended for HTTP.
  2. application/jsonl — emerging convention used by some newer APIs.
  3. application/json-lines — used by a handful of services.
  4. application/json: do not use this for JSONL. Clients will try to parse the whole body as a single document and fail.
  5. text/plain — works in a pinch (downloads as a text file) but loses type info.

For HTTP streaming (chunked transfer-encoded responses), application/x-ndjson is the de-facto standard.
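As an illustration, a bare WSGI application (PEP 3333) can serve records with that content type, one line at a time. RECORDS and ndjson_app are hypothetical names; whether the response is actually sent chunked depends on the WSGI server in front of it.

```python
import json

# Hypothetical sample data standing in for a real record source.
RECORDS = [{"id": 1}, {"id": 2}]


def ndjson_app(environ, start_response):
    """WSGI app that yields one LF-terminated JSON record per body chunk."""
    start_response("200 OK", [("Content-Type", "application/x-ndjson")])
    # Returning a generator lets the server send records as they are produced.
    return (json.dumps(record).encode("utf-8") + b"\n" for record in RECORDS)
```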

Streaming and chunked transfers

One of JSONL's main advantages is streamability. For HTTP:

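For file I/O, parsers should read line by line, parse, and emit (or call back) per record, rather than slurping the whole file. Python's for line in open(file), Node's readline module, Go's bufio.Scanner, and Rust's BufRead::lines are the canonical patterns. Note that bufio.Scanner rejects lines longer than its default 64 KiB token limit unless a larger buffer is set with Scanner.Buffer.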
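The Python pattern, sketched as a generator. read_jsonl is a hypothetical name, and the demo writes a throwaway temporary file so the example is self-contained.

```python
import json
import os
import tempfile
from typing import Iterator


def read_jsonl(path: str) -> Iterator[object]:
    """Yield one parsed record per line without loading the whole file."""
    # newline="\n" splits on LF only; a CR left over from CRLF is JSON whitespace.
    with open(path, encoding="utf-8", newline="\n") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)


# Demo on a temporary file with mixed CRLF and LF endings.
with tempfile.NamedTemporaryFile("wb", suffix=".jsonl", delete=False) as tmp:
    tmp.write(b'{"a": 1}\r\n{"b": 2}\n')
records = list(read_jsonl(tmp.name))
os.remove(tmp.name)
```

Because read_jsonl is a generator, memory use is bounded by the longest single line rather than by the file size.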

Compression conventions

Gzip's framing supports concatenation, so you can cat a.jsonl.gz b.jsonl.gz > combined.jsonl.gz and consumers will see one continuous stream — the same trick works for plain JSONL via cat.
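The concatenation property can be checked in memory with Python's gzip module, which decompresses multi-member data. The two one-line payloads are made-up sample data.

```python
import gzip
import json

part_a = gzip.compress(b'{"id": 1}\n')
part_b = gzip.compress(b'{"id": 2}\n')

# Byte-level concatenation, the in-memory equivalent of `cat a.jsonl.gz b.jsonl.gz`.
combined = part_a + part_b

# gzip.decompress reads every member and returns one continuous stream.
text = gzip.decompress(combined).decode("utf-8")
records = [json.loads(line) for line in text.split("\n") if line]
```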

What the spec deliberately doesn't say

Conformance checklist for producers

Conformance checklist for consumers

Run any file through the validator to check it against these rules, or the auto-fixer to make it conformant.

FAQ

Is there an official RFC for JSONL?

No. There is no IETF RFC or ISO standard. The closest thing to a canonical specification is jsonlines.org, which codifies the conventions. The format relies on RFC 8259 (JSON) for the per-line rules.

Can a record span multiple lines?

No. Records are delimited by line breaks. Any embedded newline inside a string value must be escaped as \n. Multi-line records would break every parser.
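Standard serializers handle the escaping already. A short Python check, using a made-up record:

```python
import json

line = json.dumps({"note": "first\nsecond"})

# The serialized line contains backslash + n (two characters), not a raw LF.
assert "\n" not in line
# Parsing restores the original multi-line string.
assert json.loads(line)["note"] == "first\nsecond"
```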

How do consumers handle a malformed line in the middle of a file?

The robust pattern is to log the line number and error, then continue with the next line. Failing the whole file is brittle for large datasets. Strict ETL pipelines may choose to fail fast. Both approaches are valid; the producer-consumer contract should make the choice explicit.
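Both policies fit in one loader with a flag. In this sketch, load_tolerant and its strict parameter are hypothetical, and errors are collected in a list rather than sent to a logger so the caller decides what to do with them.

```python
import json
from typing import Iterable


def load_tolerant(lines: Iterable[str], strict: bool = False):
    """Parse lines; return (records, errors) or raise on the first bad line if strict."""
    records, errors = [], []
    for lineno, line in enumerate(lines, start=1):
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError as exc:
            if strict:
                # Fail fast: surface the line number to the caller.
                raise ValueError(f"line {lineno}: {exc}") from exc
            # Tolerant: remember the problem and keep going.
            errors.append((lineno, str(exc)))
    return records, errors
```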

Does the format support comments?

No. JSON forbids comments and JSONL inherits that. If you need annotations, put them in a dedicated field like "_comment" per record. The auto-fixer can strip // and /* */ from files that incorrectly include them.

Why not just use a JSON array?

Because a JSON array is awkward to stream. The [ must come first, every record except the last must be followed by a comma, and the ] must come at the end. None of that is friendly to appending or to consumers that want to process records as they arrive. JSONL solves both problems with one rule change: records are separated by newlines, with no enclosing brackets or commas.
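Appending shows the difference most clearly: with JSONL it is a plain file append, with no closing bracket to rewrite. A minimal sketch, where append_record is a hypothetical helper and the demo uses a temporary directory:

```python
import json
import os
import tempfile


def append_record(path: str, record: object) -> None:
    """Append one record as a single LF-terminated line."""
    # newline="\n" keeps the line ending as LF on every platform.
    with open(path, "a", encoding="utf-8", newline="\n") as f:
        f.write(json.dumps(record) + "\n")


# Demo: two independent appends produce a valid two-record file.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "events.jsonl")
    append_record(path, {"id": 1})
    append_record(path, {"id": 2})
    with open(path, "rb") as f:
        content = f.read()
```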

— S.