jsonlkit.com
JSONL (JSON Lines) utilities, in the browser

OpenAI Fine-Tune JSONL Validator

updated 28 May 2026 · catches invalid_training_file before upload · separate pages for Anthropic, Gemini, Llama, Mistral

OpenAI fine-tune JSONL validator. Paste a file and see every line that will fail OpenAI's upload checks: missing messages, bad roles, broken tool_calls, content-type mismatches, examples over 16,385 tokens, the legacy prompt/completion shape, and the same seven validation errors the official data-prep cookbook looks for. Runs in your browser, up to 1 GB, nothing uploaded.

Your training data never leaves this tab. The file only goes to OpenAI when you upload it for a job; this pre-flight check is fully local, which is useful when the dataset contains PII you don't want to send more than once.

⌨ Prefer the terminal? jsonlkit validate --openai training.jsonl — same checks, in a pipe.

Validate your file against the two formats OpenAI's training API actually accepts: the modern chat shape ({"messages":[…]}) used by gpt-4.1, gpt-4o, and gpt-4o-mini, and the legacy prompt / completion shape that still runs on babbage-002 and davinci-002. Every line is parsed, every message is checked for role, content, and structure, and the result is mapped to the exact error code OpenAI returns at upload — so you can fix the source data before the job fails.

For other providers we have dedicated pages: Anthropic (Claude), Google Gemini, Llama 3 / ShareGPT / Alpaca, Mistral.

What this tool does

It runs every check from OpenAI's official chat_finetuning_data_prep cookbook against your file, locally, before you upload — plus the upload-time errors the cookbook misses (token-limit overshoot, BOM-corrupted UTF-8, JSON-array wrapping). Each problem is mapped to a specific line number and the exact OpenAI error string you'd otherwise see in the dashboard.

The problem it solves: you don't want to find out your training file is broken after the upload-and-queue cycle, or after you've paid for a botched run. An invalid_training_file error costs the upload time (minutes for big files), plus the queue wait, plus the round-trip to find which of 10,000 lines is broken. This page returns the same verdict in seconds, against the same rules, without sending your data anywhere.

How validation works

The pipeline runs three passes per click of Validate.

1. Parse every line as JSON

Each non-empty line is parsed as a standalone JSON value. Parse failures are reported with the line number, the parser's column hint, and a suggested fix (Python-repr quotes, trailing commas, smart quotes, BOM). Blank lines are flagged separately rather than silently ignored — they're a common source of file contains 0 valid examples.
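A minimal Python sketch of this pass, assuming the file has already been decoded to a string. `parse_lines` is a hypothetical helper written for illustration, not the validator's actual source, and it omits the suggested-fix hints.

```python
import json

def parse_lines(text):
    """Return (records, errors); records and errors both carry 1-based line numbers."""
    lines = text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # a single trailing newline is not a blank line
    records, errors = [], []
    for lineno, line in enumerate(lines, start=1):
        if not line.strip():
            # Blank lines are reported, not silently skipped
            errors.append((lineno, "blank line"))
            continue
        try:
            records.append((lineno, json.loads(line)))
        except json.JSONDecodeError as exc:
            # exc.colno is the parser's 1-based column hint
            errors.append((lineno, f"invalid JSON at column {exc.colno}: {exc.msg}"))
    return records, errors
```

Splitting on "\n" only (rather than `str.splitlines`) keeps Unicode line separators inside JSON strings from being treated as record boundaries.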

2. Check shape against the active format

For OpenAI chat: the line must be an object with a messages array of at least 2 entries; every message must have a valid role (system, developer, user, assistant, tool, function); every message must have content (or, for assistant turns, tool_calls / function_call); content arrays must contain typed parts; tool messages need tool_call_id; function messages need a string name; at least one assistant turn must exist somewhere in the example. For legacy prompt/completion: both fields must be non-empty strings.
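Some of these structural rules (minimum length, tool_call_id, function name, typed content parts) can be sketched as below. `check_structure` and its message wording are assumptions for illustration; the page's real checks may be stricter.

```python
def check_structure(example):
    """Return human-readable problems for one parsed chat example."""
    messages = example.get("messages") if isinstance(example, dict) else None
    if not isinstance(messages, list) or len(messages) < 2:
        return ["messages must be an array of at least 2 entries"]
    problems = []
    for i, msg in enumerate(messages):
        if not isinstance(msg, dict):
            problems.append(f"messages[{i}]: not an object")
            continue
        role = msg.get("role")
        content = msg.get("content")
        if role == "tool" and not msg.get("tool_call_id"):
            problems.append(f"messages[{i}]: tool message missing 'tool_call_id'")
        if role == "function" and not isinstance(msg.get("name"), str):
            problems.append(f"messages[{i}]: function message needs a string 'name'")
        if isinstance(content, list):
            # Array content must be made of typed parts, e.g. {"type": "text", ...}
            for j, part in enumerate(content):
                if not (isinstance(part, dict) and isinstance(part.get("type"), str)):
                    problems.append(f"messages[{i}].content[{j}]: part has no 'type'")
    return problems
```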

3. Aggregate, count, estimate tokens

All errors across all lines are collected into one list with a line number per error — no early exit, so a single bad row doesn't hide the next thousand. Valid lines feed an approximate token count (chars ÷ 4 — fast, not exact; use the token counter when you need a real tiktoken number). The summary block reports total examples, valid examples, average messages per example, total characters, and the rough token estimate.
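The summary arithmetic is simple enough to sketch. This version assumes characters are counted over string content only; how the page counts array content or tool calls is not stated, so treat `summarize` as a hypothetical approximation.

```python
def summarize(examples):
    """examples: parsed chat objects that already passed validation."""
    total_chars = 0
    total_messages = 0
    for ex in examples:
        for msg in ex["messages"]:
            total_messages += 1
            content = msg.get("content")
            if isinstance(content, str):
                total_chars += len(content)
    return {
        "valid_examples": len(examples),
        "avg_messages": total_messages / len(examples) if examples else 0.0,
        "total_chars": total_chars,
        "approx_tokens": total_chars // 4,  # chars ÷ 4 heuristic, not a tokenizer
    }
```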

The OpenAI fine-tune JSONL format, in 30 seconds

Each line is one independent JSON object. Each object has a messages array. Each message has a role and a content (with a few exceptions for tool calls). Every example must end on an assistant turn — that's the target the model learns to produce.

{"messages":[
  {"role":"system","content":"You are a terse assistant."},
  {"role":"user","content":"2+2?"},
  {"role":"assistant","content":"4"}
]}

Allowed roles in 2026: system, developer, user, assistant, tool. The legacy function role still parses but the modern equivalent is tool.

Limits cheatsheet (2026)

| Item | Value |
| --- | --- |
| Minimum training examples | 10 (≈ 50 to see signal, ≥ 100 for production) |
| Maximum tokens per example | 16,385 (gpt-4o family); anything over is truncated or rejected |
| Maximum training file size | 512 MB per upload (most accounts) |
| File format | UTF-8 JSONL, one record per line, purpose=fine-tune |
| Last message of every example | must be assistant |
| Models with SFT (2026) | gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, gpt-4o, gpt-4o-mini |
| Models with RFT (reinforcement) | o4-mini |
| Legacy prompt/completion | babbage-002, davinci-002 only · new training disabled 28 Oct 2024 · existing FTs still callable |

What this validator checks

Every check from OpenAI's official chat-finetuning data-prep cookbook, plus a handful of upload-time errors that the cookbook misses. Each problem maps to an error code you can grep for:

| Cookbook error code | What it means | Fix |
| --- | --- | --- |
| data_type | Line isn't a JSON object | Each line must be a single {…}, not an array, not a string |
| missing_messages_list | No messages key | Wrap the conversation in {"messages":[…]} |
| message_missing_key | A message lacks role or content | Add both; content may be null only when tool_calls is present on an assistant turn |
| message_unrecognized_key | Keys outside the allowed set (role, content, name, function_call, tool_calls, tool_call_id, weight, refusal) | Remove extras like solution, final_answer, metadata |
| unrecognized_role | Role is not system / developer / user / assistant / tool / function | Fix typos like asistant, usre |
| missing_content | Empty or missing content on a non-tool-call assistant turn | Either fill content or make it a tool_calls turn with content:null |
| example_missing_assistant_message | No assistant turn anywhere in the example | Add one; that's the target the model trains on |
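As a rough illustration, the seven codes above can be produced by a loop shaped like the cookbook's format check. The key and role sets below are taken from the table on this page; `cookbook_errors` itself is a hypothetical per-example helper, not the cookbook script or this validator's source.

```python
ALLOWED_KEYS = ("role", "content", "name", "function_call",
                "tool_calls", "tool_call_id", "weight", "refusal")
ALLOWED_ROLES = ("system", "developer", "user", "assistant", "tool", "function")

def cookbook_errors(example):
    """Return the list of error codes raised by one parsed line."""
    if not isinstance(example, dict):
        return ["data_type"]
    messages = example.get("messages")
    if not isinstance(messages, list) or not messages:
        return ["missing_messages_list"]
    errors = []
    for msg in messages:
        if not isinstance(msg, dict):
            errors.append("data_type")
            continue
        if "role" not in msg or "content" not in msg:
            errors.append("message_missing_key")
        if any(key not in ALLOWED_KEYS for key in msg):
            errors.append("message_unrecognized_key")
        if msg.get("role") not in ALLOWED_ROLES:
            errors.append("unrecognized_role")
        # Empty or null content is tolerated only on a tool-calling turn
        calls_tool = bool(msg.get("tool_calls") or msg.get("function_call"))
        if not msg.get("content") and not calls_tool:
            errors.append("missing_content")
    if not any(isinstance(m, dict) and m.get("role") == "assistant" for m in messages):
        errors.append("example_missing_assistant_message")
    return errors
```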

How to use it

  1. Drop a .jsonl file into the dashed box, or paste records directly.
  2. Pick the format. Default is OpenAI chat. Switch to legacy prompt/completion only for old babbage-002/davinci-002 datasets.
  3. Click Validate. Each line is checked against the seven cookbook rules plus token-limit and structural checks.
  4. Inspect the Error List — every issue has a line number, an error code, and a one-line explanation.
  5. Use "Download valid examples only" to rebuild a clean file with the broken lines stripped out. This helps when 5–10 rows are unsalvageable and the rest are fine.

Examples for every supported shape

Minimal chat example

{"messages":[{"role":"system","content":"You are a terse assistant."},{"role":"user","content":"2+2?"},{"role":"assistant","content":"4"}]}

Multi-turn with the weight field

Set "weight": 0 on an assistant turn to keep it in the conversation context but exclude it from the training loss. Useful when you want to show prior assistant turns for context without teaching the model to imitate them.

{"messages":[
  {"role":"system","content":"Marv is a sarcastic chatbot."},
  {"role":"user","content":"What's the capital of France?"},
  {"role":"assistant","content":"Paris, as if everyone doesn't know already.","weight":0},
  {"role":"user","content":"Who wrote Romeo and Juliet?"},
  {"role":"assistant","content":"Shakespeare. Original, I know.","weight":1}
]}
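To see which turns an example actually trains on, a small sketch like this can help. `trained_turns` is a hypothetical helper; it assumes a missing weight means 1, as described above.

```python
def trained_turns(example):
    """Indices of assistant messages that contribute to the training loss."""
    return [
        i for i, msg in enumerate(example["messages"])
        if msg.get("role") == "assistant" and msg.get("weight", 1) != 0
    ]
```

For the Marv conversation above it returns only the index of the final assistant message.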

The developer role (replaces system on o-series and GPT-4.1)

For o1, o3, o4-mini, and the GPT-4.1 family, OpenAI recommends the developer role in place of system. Both are still accepted; developer is the future-proof choice.

{"messages":[
  {"role":"developer","content":"Reply in JSON only."},
  {"role":"user","content":"Color of grass?"},
  {"role":"assistant","content":"{\"color\":\"green\"}"}
]}

Tool calling — full round-trip

Fine-tuning tool use means teaching the model when to emit a tool_calls array (with content:null), then how to compose the final answer after a tool message with the function result arrives. Every assistant turn that calls a tool must have a matching tool turn next, sharing the same tool_call_id. The top-level tools array describes available functions; parallel_tool_calls defaults to true.

{"messages":[
  {"role":"user","content":"What's the weather in SF?"},
  {"role":"assistant","content":null,"tool_calls":[
    {"id":"call_1","type":"function","function":{"name":"get_weather","arguments":"{\"city\":\"San Francisco\"}"}}
  ]},
  {"role":"tool","tool_call_id":"call_1","content":"{\"temp\":62,\"unit\":\"F\"}"},
  {"role":"assistant","content":"It's 62 F in San Francisco."}
],
"tools":[{"type":"function","function":{
  "name":"get_weather","description":"Get current weather",
  "parameters":{"type":"object","properties":{"city":{"type":"string"}},"required":["city"]}
}}],
"parallel_tool_calls":false}
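The pairing rule can be checked with a short pass over the messages. This sketch is looser than the rule described above: it accepts a tool message anywhere after the call, not only in the next turn. `unmatched_tool_calls` is a hypothetical name and assumes every message is a JSON object.

```python
def unmatched_tool_calls(messages):
    """Report tool calls without a response and tool messages without a call."""
    pending = set()
    problems = []
    for i, msg in enumerate(messages):
        if msg.get("role") == "assistant":
            for call in msg.get("tool_calls") or []:
                pending.add(call.get("id"))
        elif msg.get("role") == "tool":
            call_id = msg.get("tool_call_id")
            if call_id in pending:
                pending.discard(call_id)
            else:
                problems.append(
                    f"messages[{i}]: tool_call_id {call_id!r} matches no earlier tool call"
                )
    # Anything still pending was requested but never answered
    problems.extend(
        f"tool call {cid!r} has no tool response" for cid in sorted(pending, key=str)
    )
    return problems
```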

Vision fine-tuning

Multimodal SFT works on gpt-4o family. Each content becomes an array of typed parts; image_url can point at a public URL or a base64 data URI.

{"messages":[
  {"role":"user","content":[
    {"type":"text","text":"What's in this image?"},
    {"type":"image_url","image_url":{"url":"https://example.com/cat.jpg"}}
  ]},
  {"role":"assistant","content":"A tabby cat."}
]}
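If you'd rather embed the image than host it, the base64 data URI form can be built as below. `image_part` and the default MIME type are assumptions for illustration; pass the type that matches your file.

```python
import base64

def image_part(image_bytes, mime_type="image/jpeg"):
    """Build an image_url content part that carries the image inline."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }
```

Inline images count toward file size, so a dataset built this way grows quickly.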

Legacy prompt / completion (babbage-002, davinci-002)

Only used for the two remaining base models. New training runs were disabled on 28 Oct 2024 for everyone except customers with existing fine-tunes. Strings must be non-empty, and OpenAI's old convention is that the completion starts with a single leading space and ends with a stop sequence.

{"prompt":"Translate to French: hello ->","completion":" bonjour\n"}

Exact OpenAI error strings — what you see when upload fails

These are the messages the OpenAI dashboard, the Python SDK, and the CLI actually return. Search them verbatim if you've already hit one; otherwise read down to see what this validator catches in advance.

| Error message | What's actually wrong | Fix |
| --- | --- | --- |
| The job failed due to an invalid training file. Unexpected file format, expected either prompt/completion pairs or chat messages. | Different lines have different shapes, or the file is a single JSON array [ {…}, {…} ] instead of one object per line. | Use real JSONL (newline-separated objects). Pick one schema and stick to it. |
| Invalid file format for Fine-Tuning API. Must be .jsonl | Wrong extension or wrong purpose on upload. | Rename to .jsonl; upload with purpose="fine-tune". |
| Line N, message M, key 'content.str': Input should be a valid string | content is an object/number/null where a string is expected. | Stringify the value, or use the parts-array form for vision. |
| Example N contains invalid tokens | UTF-8 / BOM / surrogate pair issue, common when Python writes with ensure_ascii=True and bad escapes. | Re-encode as UTF-8 without BOM. Use json.dumps(..., ensure_ascii=False). |
| At least one message must be from the assistant | Example ends on a user turn, so there is no target for the model to learn. | Append an assistant message. |
| Example N exceeds the maximum token limit of 16,385 tokens | Single example too long after tokenization. | Split the conversation, drop earlier history, or trim the system prompt. |
| File contains 0 valid examples | Lines look like Python repr ('role' single quotes), or have trailing commas, or the whole file is a JSON array. | Use json.dumps per line, never str(dict). |
| Training file must contain at least 10 examples | Fewer than 10 lines after deduplication. | Add more rows. |

Recipes by intent

Pre-flight a chat fine-tune before upload

Set the format to OpenAI chat and drop the file. If the status bar says "Valid OpenAI chat file. N / N examples OK", upload with confidence. If anything is red, fix the listed lines first; every problem here would also fail upload.

Recover an already-rejected upload

Start from your local copy of the file you uploaded (don't try to fetch it back from OpenAI; they don't expose it). Paste it in. The line numbers in the error list map 1:1 to the file you uploaded. Cross-reference with the OpenAI error string table above to identify the exact failure mode.

Strip unsalvageable rows and re-upload

Click Download valid examples only. This re-runs validation, keeps lines that produce zero errors, and writes a fresh openai-fine-tune-clean.jsonl. Useful when 5–10 rows of 10,000 are unfixable and you'd rather drop them than block the training run.

Migrate from prompt/completion to chat

Validate twice: first as legacy prompt/completion to confirm the old file is sound, then convert (most teams do this in code), then validate the result as OpenAI chat. Watch for missing assistant turns — a common migration mistake is using completion verbatim as the assistant content without wrapping it in a message.
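The conversion step is usually a few lines of code. This sketch assumes the completion follows the old leading-space and trailing-newline convention and strips both; `to_chat` and its `system_prompt` parameter are hypothetical, and any prompt separator such as " ->" is left untouched.

```python
def to_chat(record, system_prompt=None):
    """Turn one legacy prompt/completion record into the chat shape."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": record["prompt"]})
    # Wrap the completion in an assistant message; strip legacy whitespace markers
    messages.append({"role": "assistant", "content": record["completion"].strip()})
    return {"messages": messages}
```

If your stop sequence was something other than whitespace, remove it explicitly before wrapping.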

Estimate cost before training

The summary shows an approximate total token count (chars ÷ 4). Multiply by your epochs and the per-token training price for the target model — that's a usable order-of-magnitude estimate. For exact numbers, run the token counter.
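The arithmetic, as a sketch. The price argument is a placeholder you supply from OpenAI's current pricing page; no real price is assumed here, and `estimate_training_cost` is a hypothetical helper.

```python
def estimate_training_cost(approx_tokens, epochs, usd_per_million_tokens):
    """Order-of-magnitude training cost in USD."""
    return approx_tokens * epochs * usd_per_million_tokens / 1_000_000
```

For example, 2,000,000 estimated tokens over 3 epochs at a hypothetical 25 USD per million training tokens comes to 150 USD.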

Errors and how to fix them

"invalid JSON" on a line

The line itself isn't valid JSON. Common causes: trailing commas, single-quote keys from Python's str(dict), unescaped newlines inside strings, stray NaN / Infinity, or unescaped backslashes. The JSONL auto-fixer repairs most of these mechanically. If you generated the file in Python, use json.dumps(record, ensure_ascii=False) per line, never str(record).
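The difference between the two Python calls is easy to see side by side. The record and the `write_jsonl` helper below are illustrative only.

```python
import json

record = {"messages": [
    {"role": "user", "content": "café?"},
    {"role": "assistant", "content": "oui"},
]}

bad_line = str(record)                               # Python repr: single quotes, not JSON
good_line = json.dumps(record, ensure_ascii=False)   # valid JSON, non-ASCII kept readable

def write_jsonl(path, records):
    """Write one JSON object per line as UTF-8 without a BOM."""
    with open(path, "w", encoding="utf-8", newline="\n") as fh:
        for rec in records:
            fh.write(json.dumps(rec, ensure_ascii=False) + "\n")
```

`json.dumps` also escapes newlines inside strings, so each record stays on a single line.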

"missing 'messages' key" or "expected JSON object, got array"

Your file is probably a single JSON array ([ {…}, {…} ]) rather than line-delimited objects. Use JSONL ↔ JSON to flip the array into JSONL, then re-validate.
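If you prefer to do the flip in code, a minimal sketch looks like this. It assumes the whole array fits in memory; `array_to_jsonl` is a hypothetical helper.

```python
import json

def array_to_jsonl(text):
    """Convert a JSON array of records into JSONL text, one record per line."""
    data = json.loads(text)
    if not isinstance(data, list):
        raise ValueError("expected a top-level JSON array")
    return "".join(json.dumps(item, ensure_ascii=False) + "\n" for item in data)
```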

"messages[i].role 'X' is invalid"

A role typo (asistant, usre, assistat) or an unsupported role. Allowed values: system, developer, user, assistant, tool, function. Anything else fails upload silently — this validator catches it before the round-trip.
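When a dataset has many role typos, a fuzzy match against the allowed list can propose the fix. This is a sketch using Python's standard difflib; `suggest_role` and the 0.6 cutoff are assumptions, not something this validator is documented to do.

```python
import difflib

ALLOWED_ROLES = ["system", "developer", "user", "assistant", "tool", "function"]

def suggest_role(role):
    """Return the closest allowed role for a misspelled one, or None."""
    matches = difflib.get_close_matches(role, ALLOWED_ROLES, n=1, cutoff=0.6)
    return matches[0] if matches else None
```

Review the suggestions before applying them in bulk; a role that matches nothing returns None and needs a manual decision.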

"no 'assistant' message found"

The example has no assistant turn. The assistant turn is the only thing the model is trained to predict, so an example without one carries no training signal and fails upload. Add an assistant message — usually as the final turn.

"messages[i].content is null" or "is an empty string"

null content is only allowed on an assistant turn that has tool_calls or function_call. Empty strings are never allowed for non-assistant roles. Either fill the content or restructure the turn.

"tool message missing 'tool_call_id'"

Every tool-role message must reference the assistant tool_calls[].id that asked for it. Pair them up; the IDs are how OpenAI matches request to response inside the turn graph.

Local validation passes but OpenAI still rejects

Three usual suspects. One: a single example exceeds 16,385 tokens after real tokenization (the page's chars ÷ 4 estimate can undercount the real total); run the token counter on suspicious lines. Two: the file has a UTF-8 BOM or stray non-printables; re-save as UTF-8 without BOM. Three: wrong upload purpose; it must be fine-tune, not assistants or batch.
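The BOM case is quick to check on raw bytes. A sketch, with `strip_utf8_bom` as a hypothetical helper:

```python
import codecs

def strip_utf8_bom(raw):
    """Return (clean_bytes, had_bom) for the raw bytes of a file."""
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):], True
    return raw, False
```

Read the file with `open(path, "rb")`, pass the bytes through, and write the cleaned bytes back before uploading.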

Browser is lagging on a large file

Use the file-drop zone instead of paste — drag-and-drop reads from disk without round-tripping through the textarea. This tool is happiest under 200 MB. For multi-GB files, prefer the CLI: jsonlkit validate --openai training.jsonl.

FAQ

Is my data uploaded?

Never. There's no backend — the validator runs entirely in your browser. Disconnect from the internet after the page loads and it still works. Safe for files with PII or proprietary prompts.

How many examples do I need to fine-tune?

OpenAI's minimum is 10. You'll see meaningful signal at ~50, and most production fine-tunes run on 100–10,000 examples. Quality and consistency matter more than quantity — adding noisy examples often hurts more than it helps.

What does the weight field do?

weight: 0 on an assistant turn keeps it in the context but excludes it from the loss. weight: 1 is the default. Useful when you want prior assistant turns for grounding without teaching the model to imitate them.

What's the developer role?

It replaces system on the o-series (o1, o3, o4-mini) and the GPT-4.1 family. Both are still accepted; developer is the recommended choice going forward and this validator treats them identically.

How does this differ from running the cookbook script locally?

It runs the same checks in the browser instead of needing a Python install. The cookbook script is authoritative — when in doubt, run both; the verdict should match line-for-line. The advantage here is the interactive error list with line numbers and the "download valid examples only" filter.

Does the token count match tiktoken?

No — the page uses a chars/4 heuristic so it stays fast and dependency-free. It's accurate enough to flag examples that are obviously over 16,385, but for budget calculations or close-to-cap rows, use the token counter page, which runs a real tokenizer.

Related tools

See also: if your file is broken in unrelated ways, JSONL auto-fixer repairs trailing commas, smart quotes, BOMs, and Python-repr quotes; Formatter pretty-prints or minifies each record; JSONL → CSV flattens for spreadsheet review.