Claude (Anthropic) Fine-Tune JSONL Validator
Checks fine-tune JSONL against the shape Bedrock's CreateModelCustomizationJob accepts for Claude 3 Haiku: a top-level system string, a messages array starting with user, and strict user/assistant alternation. Bedrock also rejects extra keys, which this check does not inspect. Runs in your browser and catches the common structural upload errors before you spend on a job.
Your training data never leaves this tab. Bedrock pulls your file from S3 when a job starts; this pre-flight check is fully local — useful for datasets with PII you don't want to debug twice.
Validates Anthropic Claude fine-tune JSONL against what AWS Bedrock actually accepts at upload: optional top-level system string (not a message), messages array starting with user, strict user/assistant alternation, plain-string content. Errors map to the exact strings Bedrock returns. 100% in-browser.
What this tool does
It parses your fine-tune file one line at a time and checks each example against Anthropic's system + messages shape — the form AWS Bedrock accepts for Claude 3 Haiku customization. For every line it confirms the JSON parses, that messages is a non-empty array, that any top-level system is a string, that roles are user/assistant only and alternate, that the first turn is user, that at least one assistant turn exists, and that every content is a non-empty string (or an array of typed blocks). Failures are listed by line number; a summary tallies examples, valid count, average turns, and an approximate token total. It runs entirely in your browser.
The need it addresses: "I have a Claude fine-tune file and I need to know it will upload to Bedrock before I spend money on a job." Bedrock's pre-flight is minimal, and most errors only surface after CreateModelCustomizationJob has been accepted and started, so catching a malformed line here saves a failed run and a debugging round-trip on data you may not want to re-handle.
When you'd reach for it
- Pre-flight before a Bedrock customization job. Paste or drop the file, click Validate, fix what's flagged before it costs you a started job.
- Catch the OpenAI-shape copy-paste mistake. A {"role":"system",…} entry inside messages is the single most common error; it's flagged immediately because Anthropic only allows user/assistant there.
- Confirm strict alternation. Two same-role turns in a row, or an assistant-first conversation, both fail the same way Bedrock fails them.
- Salvage a partly-broken file. Download valid examples only writes a clean file with the failing lines stripped.
- Sanity-check size and shape at a glance. The summary's example count and average-turns numbers tell you whether the dataset is what you expected.
- Work with sensitive data. Everything is local, so PII-laden training sets never leave the tab.
Where can I actually fine-tune Claude in 2026?
The single most important thing to know before you start: Anthropic's own API does not offer public fine-tuning. If you visit Anthropic's docs looking for a "fine-tune" endpoint, you'll be routed to a contact form for enterprise customers.
The only managed fine-tuning path for Claude in 2026 is Amazon Bedrock, and as of May 2026 the only Claude model with generally-available fine-tuning is Claude 3 Haiku, in the us-west-2 (Oregon) region.
- Not fine-tunable on Bedrock: Claude 3.5 Haiku, Claude Haiku 4.5, Claude Sonnet 3.5 / 4 / 4.x, Claude Opus 4 / 4.x.
- Not available through Anthropic's API directly, only through Bedrock's CreateModelCustomizationJob.
- After training you have to buy Provisioned Throughput to invoke the fine-tuned model; there's no on-demand inference for custom Claude models. Budget for it.
If you're trying to fine-tune Sonnet or Opus, the answer in 2026 is: you can't, on any managed platform. If you're trying to fine-tune via Anthropic directly, the answer is also: you can't.
The exact JSONL shape Bedrock accepts
Single-turn
{"system":"You are a helpful assistant.","messages":[
{"role":"user","content":"what is AWS"},
{"role":"assistant","content":"It is Amazon Web Services."}
]}
Multi-turn (verbatim from AWS docs)
{"system":"system message","messages":[
{"role":"user","content":"Hello there."},
{"role":"assistant","content":"Hi, how can I help you?"},
{"role":"user","content":"what are LLMs?"},
{"role":"assistant","content":"LLM means large language model."}
]}
Hard rules Bedrock enforces
- System is a top-level string, optional. Never {"role":"system", ...} inside messages.
- Minimum 2 messages. First message must be user; last must be assistant.
- Strict alternation. user → assistant → user → assistant… Two same-role turns in a row will fail.
- No extraneous keys. Anything outside system and messages at the top level, or outside role and content on a message, breaks upload.
- Content is a plain string for Bedrock training. Typed-block arrays ({"type":"text","text":"..."}) work in the runtime Anthropic API but are not the documented training shape.
- One JSON object per line. No blank lines, no trailing newline weirdness, no array wrapper.
What's NOT supported in Bedrock Claude training data
- Tool use. tool_use / tool_result blocks are not documented as supported in Claude 3 Haiku Bedrock fine-tuning. Training data must be plain text turns. (Tool use at inference on a fine-tuned model still works the same as the base model.)
- Vision / images. AWS docs explicitly call Claude 3 Haiku fine-tuning text-only. The "plans to introduce vision capabilities in the future" note is still in the docs as of May 2026.
- Typed content blocks. Bedrock examples use content as a plain string. The typed-block array form belongs to the inference API, not the training format.
How validation works
Parse, then check the shape
Each line is parsed as standalone JSON. A blank line is flagged as a problem, not skipped. A parse failure reports the line number and the parser message with a suggested fix. A line that parses is then checked against the Anthropic shape, and every structural problem on that line is listed separately so you see all of them at once rather than one-at-a-time.
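The parse step described above can be sketched in a few lines of Python. This is an illustration only, not the tool's actual code: the function name parse_jsonl and the message wording are hypothetical, and treating one trailing newline at the end of the file as acceptable (rather than as a blank line) is an assumption.

```python
import json

def parse_jsonl(text):
    """Return (records, problems).

    records  -> list of (line_no, parsed_object)
    problems -> list of (line_no, message)
    """
    lines = text.split("\n")
    # Assumption: a single trailing newline is not reported as a blank line.
    if lines and lines[-1] == "":
        lines.pop()

    records, problems = [], []
    for line_no, line in enumerate(lines, start=1):
        if not line.strip():
            # Blank lines are flagged, not silently skipped.
            problems.append((line_no, "blank line"))
            continue
        try:
            records.append((line_no, json.loads(line)))
        except json.JSONDecodeError as exc:
            problems.append((line_no, f"invalid JSON: {exc.msg} at column {exc.colno}"))
    return records, problems
```

Keeping the 1-based line number next to each parsed object lets a later structural pass report failures by line, as the tool's error list does.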
What the structural check enforces
The example must be an object with a messages array of at least one entry. A top-level system, if present, must be a string — a {"text":…} object is rejected. Inside messages, each role must be the string user or assistant (so a system role here is invalid), the first message must be user, consecutive same-role turns are rejected, and at least one assistant turn must exist. Each content must be a non-empty string, or an array of typed blocks where each block has a string type (text blocks need a text string, image blocks need a source object).
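As a rough sketch of the rules in the paragraph above, the following Python collects every structural problem for one parsed example instead of stopping at the first. The function names and message strings are hypothetical, and rejecting an empty content array is an assumption; the real validator may word or order things differently.

```python
def _block_ok(block):
    # Each typed block needs a string "type"; text and image blocks have extra requirements.
    if not isinstance(block, dict) or not isinstance(block.get("type"), str):
        return False
    if block["type"] == "text":
        return isinstance(block.get("text"), str)
    if block["type"] == "image":
        return isinstance(block.get("source"), dict)
    return True

def _content_ok(content):
    if isinstance(content, str):
        return content != ""
    # Assumption: an empty array of blocks counts as empty content.
    if isinstance(content, list) and content:
        return all(_block_ok(b) for b in content)
    return False

def check_example(example):
    """Return a list of problem strings; an empty list means the example passes."""
    if not isinstance(example, dict):
        return ["example must be a JSON object"]
    problems = []
    if "system" in example and not isinstance(example["system"], str):
        problems.append("'system' must be a string")
    messages = example.get("messages")
    if not isinstance(messages, list) or not messages:
        problems.append("'messages' must be a non-empty array")
        return problems

    prev_role, saw_assistant = None, False
    for i, msg in enumerate(messages):
        if not isinstance(msg, dict):
            problems.append(f"messages[{i}] must be an object")
            prev_role = None
            continue
        role = msg.get("role")
        if role not in ("user", "assistant"):
            problems.append(f"messages[{i}]: role {role!r} is invalid")
        else:
            if i == 0 and role != "user":
                problems.append("first message must be 'user'")
            if role == prev_role:
                problems.append(f"messages[{i}]: consecutive {role!r} messages")
            saw_assistant = saw_assistant or role == "assistant"
        prev_role = role
        if not _content_ok(msg.get("content")):
            problems.append(f"messages[{i}]: content must be a non-empty string or typed blocks")
    if not saw_assistant:
        problems.append("at least one 'assistant' turn is required")
    return problems
```

Returning a list rather than raising on the first failure mirrors the behaviour described earlier, where every structural problem on a line is listed at once.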
What the local check does NOT enforce
This matters, so it's spelled out. The validator does not flag extraneous keys: an extra name key on a message passes here even though Bedrock rejects it. It does not require the last message to be assistant (only that the first is user, roles alternate, and at least one assistant turn exists), and it does not enforce Bedrock's 2-message minimum, 32-record minimum, or per-record token cap. Typed-block content arrays pass the check even though Bedrock's documented training shape is plain strings. Treat a clean result as "syntactically Bedrock-shaped", not "guaranteed to pass every Bedrock server-side rule".
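If you want to cover those gaps yourself before uploading, a small supplementary check along these lines may help. It is a sketch under assumptions: bedrock_extra_checks and too_few_records are hypothetical helpers, they expect an example that already passed the structural check (a dict whose messages is a list of dicts), and they only mirror the rules listed on this page, not Bedrock's server-side implementation. Token limits are not covered.

```python
TOP_LEVEL_KEYS = {"system", "messages"}
MESSAGE_KEYS = {"role", "content"}
MIN_TRAINING_RECORDS = 32  # Bedrock minimum quoted in the limits table on this page

def bedrock_extra_checks(example):
    """Flag issues the local validator lets through. Returns a list of strings."""
    problems = []
    extra_top = sorted(set(example) - TOP_LEVEL_KEYS)
    if extra_top:
        problems.append(f"extraneous top-level keys {extra_top}")

    messages = example.get("messages", [])
    if len(messages) < 2:
        problems.append("fewer than 2 messages")
    for i, msg in enumerate(messages):
        extra = sorted(set(msg) - MESSAGE_KEYS)
        if extra:
            problems.append(f"messages[{i}]: extraneous keys {extra}")
        if not isinstance(msg.get("content"), str):
            problems.append(f"messages[{i}]: content is not a plain string")
    if messages and messages[-1].get("role") != "assistant":
        problems.append("last message must be 'assistant'")
    return problems

def too_few_records(record_count):
    """File-level check for the 32-record training minimum."""
    return record_count < MIN_TRAINING_RECORDS
```

Running this after the structural check keeps the two concerns separate: one pass for shape, one for the stricter Bedrock-only rules.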
Summary and clean download
After a run the summary lists total examples, valid examples, average messages per example, an approximate total token count (characters ÷ 4), and total characters. The token figure is a rough estimate for sizing, not Claude's real tokenizer — use the token counter for an exact per-record count against the 32k cap. Download valid examples only rebuilds anthropic-fine-tune-clean.jsonl with the failing lines removed.
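For a sense of how rough the characters ÷ 4 figure is, here is a minimal sketch of the summary arithmetic. Which characters are counted (here: the system string plus every string content) and the rounding (integer division) are assumptions, and summarize is a hypothetical name; it expects examples that already passed validation.

```python
def summarize(examples):
    """Compute summary figures for a list of already-validated examples."""
    total_chars = 0
    total_messages = 0
    for example in examples:
        total_chars += len(example.get("system", ""))
        for msg in example["messages"]:
            total_messages += 1
            if isinstance(msg["content"], str):
                total_chars += len(msg["content"])
    count = len(examples)
    return {
        "examples": count,
        "avg_messages": total_messages / count if count else 0.0,
        "chars": total_chars,
        # Rough sizing estimate only; not Claude's tokenizer.
        "approx_tokens": total_chars // 4,
    }
```

A 16-character example comes out as about 4 tokens under this rule, which is why the estimate is only suitable for sizing and not for checking the 32k per-record cap.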
What this validator checks
- Each line is valid JSON, one object per line; blank lines are flagged.
- Top-level messages is present and is an array with at least one entry.
- Optional system is a string if present (not an object, not a message).
- No message inside messages has a role other than user/assistant (so role:"system" is flagged).
- The first message role is user.
- Roles strictly alternate user/assistant: no two same-role turns in a row.
- At least one assistant turn is present.
- Every content is a non-empty string, or an array of typed blocks (text/image). Typed-block arrays pass; Bedrock's documented training shape is plain strings, so prefer strings for Bedrock even though the check accepts arrays.
Example
A valid example
{"system":"You are a helpful assistant.","messages":[
{"role":"user","content":"What is the capital of France?"},
{"role":"assistant","content":"Paris."}
]}
Optional top-level system, a user turn, an assistant turn. Passes.
Mistakes this validator catches
{"messages":[
{"role":"system","content":"..."},
{"role":"user","content":"Hi"}
]}
// Flagged: role 'system' is invalid — Anthropic puts system at the top level.
{"messages":[
{"role":"user","content":"Hi"},
{"role":"user","content":"Two user turns in a row"}
]}
// Flagged: consecutive 'user' messages — roles must alternate.
{"messages":[
{"role":"assistant","content":"I start the conversation."}
]}
// Flagged: first message must be 'user'.
{"system":{"text":"..."},"messages":[...]}
// Flagged: 'system' must be a string, not an object.
A mistake Bedrock catches but this validator does not
{"messages":[
{"role":"user","content":"Hi","name":"bob"},
{"role":"assistant","content":"Hello."}
]}
// Passes locally, but Bedrock rejects the extraneous 'name' key.
The local check doesn't inspect for extra keys — strip anything outside role and content before uploading.
Recipes by intent
Convert an OpenAI dataset to the Bedrock shape
Move every {"role":"system",…} message to a top-level system string, drop any keys other than role and content, and make sure the conversation starts with user and alternates. Re-validate here, then upload.
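The conversion can be sketched as follows. This assumes an OpenAI chat-format record with string contents; openai_to_bedrock is a hypothetical helper, and joining multiple system messages with a blank line is a choice made for the example, not something Bedrock specifies. It does not repair alternation, so re-validate the output.

```python
def openai_to_bedrock(record):
    """Convert one OpenAI chat-format record to the system + messages shape."""
    system_parts = []
    messages = []
    for msg in record.get("messages", []):
        role, content = msg.get("role"), msg.get("content")
        if not isinstance(content, str):
            raise ValueError(f"non-string content for role {role!r}")
        if role == "system":
            # System prompts move out of messages to the top level.
            system_parts.append(content)
        elif role in ("user", "assistant"):
            # Keep only role and content; keys such as name are dropped.
            messages.append({"role": role, "content": content})
        else:
            raise ValueError(f"unsupported role: {role!r}")

    converted = {}
    if system_parts:
        # Assumption: several system messages are joined with a blank line.
        converted["system"] = "\n\n".join(system_parts)
    converted["messages"] = messages
    return converted
```

Raising on unexpected roles or non-string content is deliberate here: records with tool calls or other roles need a manual decision rather than a silent drop.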
Pre-flight a file you're about to push to S3
Validate, fix every flagged line, then Download valid examples only if you want a guaranteed-clean copy. Upload that to S3 and point CreateModelCustomizationJob at it with an IAM role that can read the bucket.
Check you're under the per-record token cap
The summary's token figure is a chars÷4 estimate only. For the real number against Bedrock's 32k-token-per-record limit, run the file through the token counter with the Claude preset.
Split a clean file for training and validation
Once the file validates, use the train / val / test splitter to carve out a validation set (Bedrock caps validation at 1 GB / 1,000 records).
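If you would rather split locally, a minimal sketch looks like this. The 10% validation fraction, the fixed seed, and the name split_train_val are illustrative assumptions; only the 1,000-record validation cap comes from the limits on this page.

```python
import random

def split_train_val(lines, val_fraction=0.1, max_val=1000, seed=0):
    """Shuffle JSONL lines reproducibly and return (train, validation)."""
    shuffled = list(lines)
    random.Random(seed).shuffle(shuffled)
    # Cap the validation set at Bedrock's 1,000-record limit.
    n_val = min(max_val, int(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]
```

Check that the training part still has at least 32 records after the split, since that minimum applies to the training file.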
Limits and performance
Bedrock's own limits for Claude 3 Haiku customization:
| Item | Value |
|---|---|
| Minimum training records | 32 |
| Maximum training records | 10,000 (adjustable via Service Quotas) |
| Maximum validation records | 1,000 |
| Maximum tokens per record | 32,000 |
| Training file size | 10 GB |
| Validation file size | 1 GB |
| Region | US West (Oregon) — us-west-2 |
| Inference after training | Provisioned Throughput only |
This page's own limits are your browser's memory: it handles whatever fits in RAM, typically up to ~1 GB on a laptop. Past ~50–100 MB the error list and textarea can lag; for multi-gigabyte files validate on the CLI (jsonlkit validate --anthropic). The error list renders the first 200 problems and notes how many more were found.
Errors and how to fix them
Real Bedrock error strings you'll see if a bad file slips past validation:
| Error from Bedrock | Cause | Fix |
|---|---|---|
| Unable to parse Amazon S3 file: {file}.jsonl. Data files must conform to JSONL format. | File is not strict JSONL: array wrapper, pretty-printed records, blank lines, or trailing-newline-only file. | One JSON object per line, no blank lines, no trailing-only newline, no array. |
| Input size exceeded in file {file}.jsonl for record starting with… | A record exceeds Claude 3 Haiku's 32k-token-per-record quota. | Split or trim the conversation. Use the token counter with the Claude preset. |
| Maximum input token count 4097 exceeds limit of 4096 / Max sum of input and output token length … | Per-sample input/output tokens over limit. | Trim system prompt or shorten turns; run token counter to confirm. |
| Automated tests flagged this fine-tuning job as including materials that are potentially inconsistent with Anthropic's third-party license terms. Please contact support. | Anthropic content-policy filter tripped. | Remove flagged content. Re-submit. If you believe this is wrong, contact AWS support. |
| Could not validate GetObject permissions to access Amazon S3 bucket: {bucket} at key {train.jsonl} | IAM role attached to the customization job lacks s3:GetObject / s3:ListBucket. | Attach a policy that allows reads on the bucket and key. |
| Amazon S3 perms missing (PutObject): … at key output/.write_access_check_file.tmp | IAM role lacks s3:PutObject on the output bucket. | Allow writes on the output bucket. |
| messages: roles must alternate between "user" and "assistant" | Two same-role turns in a row, or assistant-first ordering. | Reorder or merge the offending turns; this validator catches it locally first. |
| Encountered an unexpected error when processing the request, please try again | Transient. | Retry. If it persists, open an AWS support case. |
The validator passes but Bedrock says "Unable to parse Amazon S3 file."
Almost always one of: an array wrapper ([ …, … ]) around your records, pretty-printed records (each record spans multiple lines), or a trailing-only newline. Re-export as strict JSONL — one object per line, no blank lines, exactly one newline between records. The formatter's Minify mode produces this.
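If the formatter isn't handy, the same re-export can be sketched with Python's standard json module. The helper name to_strict_jsonl is hypothetical; it accepts either an array wrapper or a run of pretty-printed objects, and returning the text without a final newline is an assumption you can adjust.

```python
import json

def to_strict_jsonl(text):
    """Re-export an array wrapper or pretty-printed records as one object per line."""
    decoder = json.JSONDecoder()
    records = []
    pos = 0
    while True:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        value, pos = decoder.raw_decode(text, pos)
        if isinstance(value, list):
            records.extend(value)   # unwrap [ ..., ... ]
        else:
            records.append(value)   # one pretty-printed record
    # Compact separators keep each record on a single line.
    return "\n".join(
        json.dumps(record, ensure_ascii=False, separators=(",", ":"))
        for record in records
    )
```

Because each record is re-serialized, any newline inside a string value is written as an escaped \n, so a record can never span more than one line in the output.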
The validator passes but Bedrock rejects an extra key.
Expected — the local check doesn't inspect for extraneous keys. Strip everything outside system / messages at the top level and role / content on each message before uploading.
I have an assistant-first conversation (model introduces itself).
Bedrock requires alternation starting with user, so prepend a synthetic user turn. The validator flags the assistant-first case for exactly this reason.
I pasted an OpenAI file and everything fails.
OpenAI puts the system prompt inside messages as {"role":"system",…}; Anthropic puts it at the top level as a string. Move it, drop the legacy prompt/completion shape if present, and re-validate.
It warns nothing about my typed-block content, but I'm targeting Bedrock.
The check accepts typed-block arrays because they're valid in Anthropic's runtime API. Bedrock's documented training shape is plain strings, so flatten typed blocks to a string before uploading even though the validator stays quiet.
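Flattening can be sketched like this. It assumes only text blocks are present (anything else, such as an image block, raises, since Bedrock's Claude 3 Haiku training is text-only), and concatenating the block texts without a separator is an assumption; flatten_content and flatten_example are hypothetical names.

```python
def flatten_content(content):
    """Turn a typed-block content array into a plain string; strings pass through."""
    if isinstance(content, str):
        return content
    parts = []
    for block in content:
        if block.get("type") != "text":
            raise ValueError(f"cannot flatten block of type {block.get('type')!r}")
        parts.append(block["text"])
    # Assumption: text blocks are joined with no separator.
    return "".join(parts)

def flatten_example(example):
    """Return a copy of the example with every message content flattened."""
    flattened = dict(example)
    flattened["messages"] = [
        {"role": msg["role"], "content": flatten_content(msg["content"])}
        for msg in example["messages"]
    ]
    return flattened
```

flatten_example also rebuilds each message with only role and content, so stray message-level keys do not survive the conversion.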
Is my data uploaded to your servers?
Never. The validator runs entirely in your browser. See the privacy policy.
FAQ
What's the JSONL format for Claude fine-tuning?
An optional top-level system string, plus a messages array of alternating {"role":"user","content":"…"} and {"role":"assistant","content":"…"} objects. First message must be user; Bedrock also wants the last to be assistant. One JSON object per line.
Can I fine-tune Claude Sonnet, Opus, or any Haiku newer than 3?
Not as of May 2026. Only Claude 3 Haiku has generally-available fine-tuning on Bedrock, in us-west-2. Sonnet/Opus/3.5 Haiku/Haiku 4.5 are inference-only.
Does Claude fine-tuning support tool use or images?
No. Training data is text-only conversations. Tool use and vision still work at inference time on the fine-tuned model the same way they do on the base model.
Do I need Provisioned Throughput to use the fine-tuned model?
Yes. Custom Claude models on Bedrock can only be invoked via Provisioned Throughput — there's no on-demand inference. Plan for the cost.
Does this validator guarantee my Bedrock job will succeed?
No — it confirms the file is syntactically Bedrock-shaped. It doesn't check extraneous keys, the 2-message / 32-record minimums, or exact token counts, and it can't see your IAM permissions or content-policy flags. It catches the common structural errors before you spend on a job; Bedrock enforces the rest server-side.