Generate compliance-aware sample data in seconds
We built DataGen Pro to remove the busywork of crafting dummy data by hand. Each schema is modeled on a specific compliance requirement, so your QA and PoC flows can rely on realistic records. During the beta, expect rate limits and occasional breaking changes.
Compliance disclaimer
DataGen Pro provides regulation-inspired schemas, but no generated dataset is guaranteed to meet legal requirements. Validate with your compliance or legal team before production use.
Choose a schema
Pick from compliance-inspired templates crafted for QA and staging workloads
FHIR R4 Clinical Bundle
Synthetic FHIR R4 bundles spanning Patient, Observation, Encounter, and AuditEvent resources for interoperability testing.
PCI-DSS Advanced Card Log
Extended PCI-DSS logging with 3-D Secure outcomes, dispute lifecycle markers, and tokenization flags.
ISO 20022 Payments
Cross-border pacs.* message stubs with compliance metadata for treasury and settlement testing.
Card Transaction (PCI-DSS)
Card authorization log with PCI-DSS v4.0-style fields.
Payroll (Basic)
Payroll statements referencing JIS Q 15001 and My Number guidelines.
eKYC Attributes
Identity verification profiles inspired by Japan's AML/KYC guidelines.
Mandatory Medical Checkup (JP LOD)
Sample health exam dataset based on Japan's Industrial Safety and Health Act.
Specific Health Checkup (JP Tokutei)
Extended metabolic screening dataset aligned with the Act on Assurance of Medical Care for Elderly People.
Why teams pick DataGen Pro
Compliance-aware dummy data in minutes, not days
Compliance-minded schemas
Blueprints shaped by occupational health, PCI-DSS, and identity regulations.
High throughput
Generate tens of thousands of rows in seconds to unblock QA runs.
Deep customization
Tune age spans, gender mix, abnormal ratios, approval rates, and more.
API-first
Drop into CI/CD and staging pipelines with a single POST endpoint.
Stateless delivery
Responses are built in memory and streamed back; no generated records are stored server-side.
Free & open-source
Released under the MIT License so you can adapt or self-host without friction.
API reference
Integrate via REST with streaming CSV/JSON responses
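Because responses are streamed, a client can process rows as they arrive instead of buffering the whole file. The sketch below is one possible way to do that with the Python standard library. It assumes the CSV output is UTF-8 and starts with a header row, which this page does not confirm; the helper names iter_csv_rows and stream_rows are hypothetical.

```python
import csv
import json
import urllib.request
from typing import Dict, Iterable, Iterator

# Endpoint taken from the examples on this page.
API_URL = "https://datagen-pro.vercel.app/api/generate"


def iter_csv_rows(byte_lines: Iterable[bytes]) -> Iterator[Dict[str, str]]:
    """Decode a stream of CSV lines and yield one dict per data row."""
    # Assumption: UTF-8 encoding and a header row on the first line.
    text_lines = (line.decode("utf-8") for line in byte_lines)
    yield from csv.DictReader(text_lines)


def stream_rows(payload: dict) -> Iterator[Dict[str, str]]:
    """POST the payload and yield rows lazily as the response is read."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        # The HTTP response object can be iterated line by line.
        yield from iter_csv_rows(response)
```

Keeping the parsing step separate from the network call means it can be exercised in tests with canned bytes, without hitting the live endpoint.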
Request example
{
  "schema": "health-lod",
  "num": 2000,
  "output": "csv",
  "options": {
    "genderRatio": { "male": 0.5, "female": 0.5 },
    "age": { "min": 25, "max": 60 },
    "departments": ["Sales", "Engineering"]
  }
}

Tip: "options" accepts schema-specific knobs such as abnormal ratios or fraud scoring.
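When requests are assembled in code, a small builder can catch obvious mistakes before anything is sent. The following is a minimal sketch that mirrors the request body above. The function name build_payload is hypothetical, and the checks are client-side assumptions only: the server's actual validation rules are not documented on this page.

```python
from typing import Any, Dict, Optional


def build_payload(
    schema: str,
    num: int,
    output: str = "csv",
    options: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble a request body in the shape of the example above."""
    if num <= 0:
        raise ValueError("num must be positive")
    # "csv" and "json" are the two output values used on this page.
    if output not in ("csv", "json"):
        raise ValueError("output must be 'csv' or 'json'")

    options = dict(options or {})

    # Assumption: gender ratios are fractions that should sum to 1.
    ratio = options.get("genderRatio")
    if ratio is not None and abs(sum(ratio.values()) - 1.0) > 1e-9:
        raise ValueError("genderRatio values should sum to 1")

    # Assumption: the age range must be ordered.
    age = options.get("age")
    if age is not None and age["min"] > age["max"]:
        raise ValueError("age.min must not exceed age.max")

    return {"schema": schema, "num": num, "output": output, "options": options}
```

For example, build_payload("health-lod", 2000, options={"age": {"min": 25, "max": 60}}) returns a dict that can be passed straight to json.dumps.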
Response
cURL

curl -s -X POST https://datagen-pro.vercel.app/api/generate \
  -H 'Content-Type: application/json' \
  -d '{
    "schema": "health-lod",
    "num": 1000,
    "output": "csv",
    "options": {
      "genderRatio": { "male": 0.5, "female": 0.5 },
      "age": { "min": 25, "max": 60 },
      "departments": ["Sales", "Engineering"]
    }
  }' \
  --output health-lod-dummy-1000.csv

JavaScript

// Request 1,000 records as JSON (top-level await requires an ES module).
const response = await fetch('https://datagen-pro.vercel.app/api/generate', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    schema: 'health-lod',
    num: 1000,
    output: 'json',
    options: {
      followUpRatio: 0.1,
      abnormalRatio: 0.12
    }
  })
});

// Fail loudly on HTTP errors such as rate limiting.
if (!response.ok) throw new Error(`Request failed: ${response.status}`);
const data = await response.json();

Python

import requests

# Request 1,000 records as CSV.
response = requests.post('https://datagen-pro.vercel.app/api/generate',
    json={
        'schema': 'health-lod',
        'num': 1000,
        'output': 'csv',
        'options': {
            'followUpRatio': 0.1
        }
    }
)
# Raise on HTTP errors before writing anything to disk.
response.raise_for_status()

with open('health-lod-dummy-1000.csv', 'wb') as f:
    f.write(response.content)

Pricing
Free during beta. Open-source forever.