JSONL ↔ Parquet
100% client-side. Files never leave your browser; the DuckDB engine itself loads from a CDN on first use (~10 MB, then cached).
Convert JSONL to Apache Parquet, or read a Parquet file back to JSONL, in your browser. Useful for handing data to a warehouse (BigQuery, Snowflake, Databricks) or shrinking a JSONL export by 5–10× before storage. Powered by DuckDB-WASM.
Why Parquet
JSONL is great for streaming, append-only writes, and human inspection. Parquet is the format every data warehouse actually wants to ingest — columnar storage, baked-in type schema, and 5–10× smaller on disk for the same data. Going from JSONL to Parquet is the most common "prepare data for analytics" step in a modern stack.
This page does the conversion entirely in your browser using DuckDB-WASM, which exposes DuckDB's read_json_auto and Parquet writer in JS. The schema is inferred from the first records and applied to the whole file; numbers stay numbers, booleans stay booleans, and nested objects become struct columns.
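For the curious, the heart of the conversion is only a few calls. A minimal sketch, assuming the @duckdb/duckdb-wasm npm package; jsonlText stands in for the dropped file's contents, and none of this is this page's actual source:

```ts
import * as duckdb from "@duckdb/duckdb-wasm";

// Pick the WASM bundle that fits this browser. Workers can't be created
// from a cross-origin CDN URL directly, hence the small Blob shim.
const bundle = await duckdb.selectBundle(duckdb.getJsDelivrBundles());
const workerUrl = URL.createObjectURL(
  new Blob([`importScripts("${bundle.mainWorker!}");`], { type: "text/javascript" })
);
const db = new duckdb.AsyncDuckDB(new duckdb.ConsoleLogger(), new Worker(workerUrl));
await db.instantiate(bundle.mainModule, bundle.pthreadWorker);

// Expose the dropped file's text under a virtual filename, let read_json_auto
// infer the schema, and write Parquet to the in-memory filesystem.
await db.registerFileText("input.jsonl", jsonlText); // jsonlText: assumed input string
const conn = await db.connect();
await conn.query(`
  COPY (SELECT * FROM read_json_auto('input.jsonl'))
  TO 'output.parquet' (FORMAT PARQUET, COMPRESSION SNAPPY);
`);

// Pull the finished Parquet bytes back out of the virtual filesystem.
const parquetBytes = await db.copyFileToBuffer("output.parquet");
```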
How it works
- Drop or paste a JSONL file in the top section and click Convert to Parquet.
- DuckDB-WASM loads on first use (~10 MB, cached after that) and parses the JSONL with auto-detected types.
- It writes a Parquet file to an in-memory virtual filesystem, then offers it as a download.
- To go the other way, drop a .parquet file in the bottom section and the rows are read back into JSONL (sketched after this list).
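The reverse direction is symmetric. A sketch under the same assumptions as the snippet above (db and conn carried over; file is a hypothetical File object from the drop event). DuckDB's JSON COPY emits one object per line, which is exactly JSONL:

```ts
// Parquet → JSONL: register the dropped bytes, then COPY back out as JSON.
await db.registerFileBuffer("input.parquet", new Uint8Array(await file.arrayBuffer()));
await conn.query(`
  COPY (SELECT * FROM 'input.parquet') TO 'output.jsonl' (FORMAT JSON);
`);
const jsonlBytes = await db.copyFileToBuffer("output.jsonl");

// Hand the result to the user as a plain browser download; no server involved.
const url = URL.createObjectURL(new Blob([jsonlBytes], { type: "application/x-ndjson" }));
const a = Object.assign(document.createElement("a"), { href: url, download: "output.jsonl" });
a.click();
URL.revokeObjectURL(url);
```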
Tips & common pitfalls
- The first request is slow. Loading the WASM engine takes a few seconds; subsequent conversions on the same page are near-instant.
- Type inference reads the whole file. Unlike streaming JSONL parsers, DuckDB needs to settle on a single schema. If your JSONL has wildly inconsistent types per field, the converter may widen to VARCHAR or fail.
- Snappy is the warehouse default. BigQuery, Athena, and Snowflake all read Snappy-compressed Parquet without ceremony. Choose Zstd if storage cost matters more than read latency; it is a one-word change in the COPY options, as sketched after this list.
- Nested objects become structs. Parquet supports nested types natively, so {"user":{"id":1}} becomes a STRUCT column. Some warehouses flatten these on load.
- Big files work, just not huge ones. A few hundred MB of JSONL converts fine in a modern browser. Multi-GB files want a server-side duckdb CLI; this is the in-browser path for everything below that.
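To make the compression point concrete, a hedged sketch reusing the conn from the earlier snippet:

```ts
// Same conversion, trading a little read speed for a smaller file:
// the codec is just one option on the COPY statement.
await conn.query(`
  COPY (SELECT * FROM read_json_auto('input.jsonl'))
  TO 'output.parquet' (FORMAT PARQUET, COMPRESSION ZSTD);
`);
```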
Round trip
The JSONL → Parquet → JSONL round trip is almost lossless. Things that survive: field names, numeric types (with widening from int → bigint as needed), strings, booleans, null, nested objects, and arrays. Things that may shift: insertion order of object keys is not preserved by Parquet; NaN/Infinity numbers don't have a Parquet equivalent and get nulled.
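To see the round trip end to end, a sketch (same assumed db and conn as above) that pushes one nested record through both directions:

```ts
// One nested record through JSONL → Parquet → JSONL.
await db.registerFileText("rt.jsonl", `{"user":{"id":1,"name":"ada"},"tags":["a","b"]}`);
await conn.query(`
  COPY (SELECT * FROM read_json_auto('rt.jsonl')) TO 'rt.parquet' (FORMAT PARQUET);
`);
await conn.query(`
  COPY (SELECT * FROM 'rt.parquet') TO 'rt_back.jsonl' (FORMAT JSON);
`);
const back = new TextDecoder().decode(await db.copyFileToBuffer("rt_back.jsonl"));
// back ≈ {"user":{"id":1,"name":"ada"},"tags":["a","b"]}
// Values, types, and nesting survive; key order within objects is up to the engine.
```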
Frequently asked questions
Does this require a server?
No. The conversion is 100% client-side via DuckDB-WASM. The only network request is the one-time fetch of the WASM bundle from jsDelivr; the file you drop never leaves your browser.
Will it work offline?
After the first successful load, yes — the browser caches the WASM bundle. The very first visit on a clean browser needs internet to fetch the engine.
What happens if the JSONL is malformed?
DuckDB will fail the parse and the status bar will report the error. Run the file through the JSONL Validator first to find the bad lines.
Is my data sent to a server?
The data isn't. The DuckDB engine itself is fetched from a CDN on first use, but parsing and writing happen inside your browser tab. See the privacy policy.