JSONL Anonymizer
Redact PII (emails, phone numbers, IPs, credit cards, SSNs) in a JSONL (JSON Lines) file before sharing it or using it for AI training. Mask, hash, or remove. Handles files up to 1 GB, entirely in your browser.
Before you start
This tool scrubs personally identifiable information from a JSONL file before you share it, post it as a sample, or feed it into AI training. It's a helper, not a compliance officer — combine it with a human review.
How to use it
- Drop a file or paste JSONL.
- Tick which detectors to run: Email, Phone, IP, Credit card, SSN.
- Pick a Strategy:
  - Mask: keep the first and last character and replace the middle (good for spotting duplicate emails without revealing identity).
  - Hash: replace the value with an 8-hex-digit non-cryptographic hash (deterministic, so the same email always maps to the same token).
  - Remove field: when used together with the always-redact key list, the listed key is deleted entirely.
- Optionally list always-redact keys (e.g. password,user.ssn,auth_token). Dot notation is supported.
- Click Anonymize, then Copy or Download.
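To make the Mask and Hash strategies concrete, here is a minimal Python sketch. It is not the tool's actual code: the function names are hypothetical, the handling of very short values is an assumption, and FNV-1a is used only as a stand-in for "an 8-hex non-crypto hash", since the page does not say which hash the tool uses.

```python
def mask(value: str) -> str:
    """Keep the first and last character and replace the middle with ***."""
    if len(value) <= 2:
        # Assumption: values this short are starred out completely.
        return "*" * len(value)
    return f"{value[0]}***{value[-1]}"


def fnv1a_hex(value: str) -> str:
    """32-bit FNV-1a rendered as 8 hex digits (stand-in, not cryptographic)."""
    h = 0x811C9DC5  # FNV-1a 32-bit offset basis
    for byte in value.encode("utf-8"):
        h ^= byte
        h = (h * 0x01000193) & 0xFFFFFFFF  # multiply by the FNV prime, wrap to 32 bits
    return f"{h:08x}"


print(mask("+1 (555) 123-4567"))  # +***7
print(mask("10.0.0.5"))           # 1***5
print(fnv1a_hex("10.0.0.5") == fnv1a_hex("10.0.0.5"))  # True: same input, same token
```

Mask keeps just enough shape for a human to sanity-check a row, while the hash gives a stable token that can be grouped or counted.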
What gets detected
- Email: standard RFC-like pattern.
- Phone: sequences of digits with optional +, spaces, parens, hyphens; 10+ digits.
- IP: IPv4 dotted-quad and IPv6 colon-separated.
- Credit card: 13–19 digits with optional separators (covers Visa, Mastercard, and Amex shapes; no Luhn check).
- SSN: US Social Security XXX-XX-XXXX pattern.
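The detectors above could be approximated with regular expressions along these lines. This is a sketch under assumptions: the patterns, their ordering, and the names DETECTORS and redact are illustrative rather than the tool's real implementation, and the IPv6 pattern only covers the full eight-group form.

```python
import re

# Hypothetical patterns, tried in this order so that the more specific
# shapes (SSN, card) are masked before the looser phone pattern runs.
DETECTORS = [
    ("email", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
    ("ssn", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("credit_card", re.compile(r"\b\d(?:[ -]?\d){12,18}\b")),  # 13-19 digits, no Luhn check
    ("ipv4", re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")),
    ("ipv6", re.compile(
        r"(?<![0-9A-Fa-f:])(?:[0-9A-Fa-f]{1,4}:){7}[0-9A-Fa-f]{1,4}(?![0-9A-Fa-f:])"
    )),
    ("phone", re.compile(r"\+?\(?\d(?:[\s().-]*\d){9,}")),  # 10 or more digits
]


def mask(value: str) -> str:
    """Keep the first and last character and replace the middle with ***."""
    return "*" * len(value) if len(value) <= 2 else f"{value[0]}***{value[-1]}"


def redact(text: str, enabled: set) -> str:
    """Mask every match of each enabled detector inside a string value."""
    for name, pattern in DETECTORS:
        if name in enabled:
            text = pattern.sub(lambda m: mask(m.group()), text)
    return text


ALL = {name for name, _ in DETECTORS}
print(redact("Backup IP 10.0.0.5", ALL))  # Backup IP 1***5
print(redact("+1 (555) 123-4567", ALL))   # +***7
```

Patterns this loose will produce both misses and false positives (a date followed by a number can look like a phone number), which is why the detectors are best treated as a first pass.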
Example
Input:
{"user":"[email protected]","phone":"+1 (555) 123-4567","note":"Backup IP 10.0.0.5"}
With mask strategy:
{"user":"a***m","phone":"+***7","note":"Backup IP 1***5"}
Tips & common pitfalls
- Use Hash for analytics. If you need to count unique users without seeing them, hash is the strategy you want.
- Use Mask for QA. Mask preserves enough shape that humans can verify the row is what they expected.
- Detectors are best-effort. Unusual phone formats (international, "555-CALL-NOW") may slip through. Combine with explicit keys when you know what's sensitive.
- Run twice. On the first pass, scrub PII via detectors; on the second pass, remove fields you know are sensitive even if they look innocent (internal_user_id, device_fingerprint).
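The second pass described in the last tip, removing known-sensitive keys by name, might look roughly like this in Python. The helper names are hypothetical, and treating a dot in a key as a path separator (so user.ssn means the ssn key inside user) is an assumption about how the tool resolves dot notation.

```python
import json


def remove_key(record: dict, dotted: str) -> None:
    """Delete a key addressed by dot notation, e.g. 'user.ssn'. No-op if absent."""
    *parents, leaf = dotted.split(".")
    node = record
    for part in parents:
        node = node.get(part) if isinstance(node, dict) else None
        if node is None:
            return  # path does not exist in this record
    if isinstance(node, dict):
        node.pop(leaf, None)


def remove_fields(lines, always_redact):
    """Yield each JSONL line with the always-redact keys removed."""
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        for key in always_redact:
            remove_key(record, key)
        yield json.dumps(record)


sample = ['{"internal_user_id": 42, "user": {"ssn": "123-45-6789", "name": "x"}, "note": "ok"}']
for out in remove_fields(sample, ["internal_user_id", "user.ssn", "device_fingerprint"]):
    print(out)
```

Keys that are missing from a record are ignored, so one key list can be applied to a file whose rows do not all share the same shape.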
Frequently asked questions
Is this safe for HIPAA / GDPR?
No tool is safe on its own. This is a quick scrub; for regulated data, talk to your DPO and use a server-side, audited tool.
Will the same email always hash the same way?
Yes. The hash is deterministic within a single page load, so you can use it to count unique users without revealing them. (It's a tiny non-cryptographic hash, so it won't hold up against a determined attacker, but it's fine for "I want to count distinct users in a shared log".)
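The trade-off in this answer can be illustrated with a short Python sketch. CRC-32 from the standard library stands in for the tool's hash, which is an assumption, and the email addresses are made up. Counting distinct tokens works, but anyone with a list of candidate emails can hash them and look for matches.

```python
import zlib


def token(value: str) -> str:
    """Stand-in 8-hex non-crypto hash (CRC-32); the tool's real hash may differ."""
    return f"{zlib.crc32(value.encode('utf-8')):08x}"


# What a shared log might contain after hashing the user field.
shared_log = [token(e) for e in ["alice@example.com", "bob@example.com", "alice@example.com"]]

# Legitimate use: count distinct users without seeing the addresses.
distinct_users = len(set(shared_log))
print(distinct_users)

# Why it is not attacker-safe: hash a list of guesses and compare.
candidates = ["carol@example.com", "alice@example.com"]
lookup = {token(c): c for c in candidates}
recovered = {lookup[t] for t in shared_log if t in lookup}
print(recovered)
```

The second half is a plain dictionary attack, which succeeds against any unsalted, fast hash whenever the input space is guessable.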
Are my files uploaded?
No. Everything runs in your browser.