How to generate a random CSV
The fastest way: open the generator at the top of this page, set the row count, define your columns, and click Generate. The output is a real CSV — copy it to clipboard or download generated-<n>-rows.csv. Runs entirely in your browser; nothing is uploaded.
- Set the row count: anywhere from 1 to 50,000 rows.
- Add columns: click Add column, name it (e.g. email, signup_date, score), and pick a type from the dropdown:
  - UUID: crypto.randomUUID() (RFC 4122 v4)
  - First name / Last name / Full name: drawn from a pool of ~30 names each
  - Email: firstname.lastname@domain, lowercased, with fictional domains (example.com, mail.test, …)
  - Username: firstnameXXX###
  - Company / City / Country / Street address / Phone: sample data
  - Date / Datetime: uniform random within the range you set
  - Integer / Float: uniform random with min/max (and decimals for floats)
  - Boolean: 50/50 true/false
  - Choice from list: comma-separated values, e.g. active, inactive, pending
  - Lorem (sentence): a short Lorem-ipsum sentence
- Click Generate to roll a fresh dataset (the preview shows the first 10 rows).
- Download .csv or Copy CSV to clipboard.
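To make the column types concrete, here is a rough Python approximation of a few of them. It is not the generator's source code: the COLUMNS table, the random_date helper, and the date range are hypothetical names and values chosen for illustration.

```python
import random
import uuid
from datetime import date, timedelta


def random_date(start: date, end: date) -> str:
    """Uniform random date in [start, end], formatted as ISO 8601."""
    span_days = (end - start).days
    return (start + timedelta(days=random.randint(0, span_days))).isoformat()


# Hypothetical column spec: (column name, zero-argument generator function).
COLUMNS = [
    ("id", lambda: str(uuid.uuid4())),                      # UUID v4
    ("score", lambda: random.randint(0, 100)),              # Integer with min/max
    ("ratio", lambda: round(random.uniform(0.0, 1.0), 2)),  # Float with 2 decimals
    ("active", lambda: random.choice(["true", "false"])),   # Boolean, 50/50
    ("status", lambda: random.choice(["active", "inactive", "pending"])),  # Choice from list
    ("signup_date", lambda: random_date(date(2023, 1, 1), date(2024, 12, 31))),
]


def generate_rows(n: int) -> list[list]:
    """Return a header row followed by n independently generated data rows."""
    header = [name for name, _ in COLUMNS]
    return [header] + [[make() for _, make in COLUMNS] for _ in range(n)]
```

Each cell is drawn independently of every other cell, which is why duplicates and unrelated columns are expected in the output.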
The default columns when you first land here — id, first_name, last_name, email, country, signup_date — are a sensible starting point for a fake users table. Edit, remove, or add more.
Common test data scenarios
Load testing an import endpoint
You’re shipping a “Bulk import users” feature and need to know how it behaves at 10K rows. Generate a CSV with the exact columns your importer expects (email, name, role, department), set the row count to 10,000, download, and feed it to your /api/import endpoint. Vary the row count to find where p95 latency starts climbing.
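A minimal sketch of the measurement loop, using only the Python standard library. It assumes the endpoint accepts the raw CSV as the POST body, which may not match your importer (many expect multipart uploads). The time_import and p95 names are illustrative.

```python
import statistics
import time
import urllib.request


def p95(samples: list[float]) -> float:
    """95th percentile of a list of latencies (needs at least 2 samples)."""
    return statistics.quantiles(samples, n=100)[94]


def time_import(url: str, csv_path: str, runs: int = 20) -> float:
    """POST the CSV `runs` times and return the p95 latency in seconds."""
    with open(csv_path, "rb") as f:
        body = f.read()
    latencies = []
    for _ in range(runs):
        request = urllib.request.Request(
            url,
            data=body,
            method="POST",
            headers={"Content-Type": "text/csv"},  # assumption: raw CSV body
        )
        start = time.perf_counter()
        with urllib.request.urlopen(request) as response:
            response.read()  # wait for the full response before stopping the clock
        latencies.append(time.perf_counter() - start)
    return p95(latencies)
```

Call time_import once per file size (1K, 10K, 50K rows) and compare the results to see where latency starts to climb.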
Dev seed data
Your local Postgres is empty and the dashboard looks broken without rows. Generate 500 fake users + 2,000 fake orders, then pipe through the CSV to SQL converter to get INSERT statements you can paste into psql.
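If you would rather script that step, the conversion can be approximated in a few lines of Python. This is a sketch, not the converter used on this site. It treats every cell as a string literal, maps empty cells to NULL (an assumption), and expects the table name to be trusted input.

```python
import csv
import io


def sql_literal(value: str) -> str:
    """Quote a CSV cell as a SQL string literal; empty cells become NULL."""
    if value == "":
        return "NULL"
    return "'" + value.replace("'", "''") + "'"


def csv_to_inserts(csv_text: str, table: str) -> list[str]:
    """Turn CSV text (with a header row) into one INSERT statement per row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    columns = ", ".join(f'"{name}"' for name in header)
    statements = []
    for row in reader:
        values = ", ".join(sql_literal(cell) for cell in row)
        statements.append(f"INSERT INTO {table} ({columns}) VALUES ({values});")
    return statements
```

Doubling single quotes is what keeps a name like O'Brien from terminating the literal early.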
Dashboard demos and screencasts
Recording a Loom of your analytics tool? Real customer data is a privacy nightmare and looks unprofessional when names get blurred. Generate 1,000 rows of plausible-looking signups, dates spread over the last 12 months, countries weighted however you like — your charts fill out and nothing identifying ever appears.
Training datasets and ML smoke tests
You don’t need real labels to verify your data pipeline runs end-to-end. A CSV with feature_1 (float, 0–1), feature_2 (integer, 0–100), label (choice: A, B, C) is enough to confirm the loader, batcher, and trainer all work before you swap in the real dataset.
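The same smoke-test file can be produced and consumed in a script. The sketch below uses only the standard library. The make_smoke_csv and load_batches names are hypothetical, and load_batches stands in for whatever loader your pipeline really uses.

```python
import csv
import io
import random


def make_smoke_csv(n_rows: int, seed: int = 0) -> str:
    """Build CSV text with feature_1 (float 0-1), feature_2 (int 0-100), label (A/B/C)."""
    rng = random.Random(seed)  # local RNG so the fixture is repeatable
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["feature_1", "feature_2", "label"])
    for _ in range(n_rows):
        writer.writerow([
            round(rng.random(), 4),
            rng.randint(0, 100),
            rng.choice(["A", "B", "C"]),
        ])
    return buf.getvalue()


def load_batches(csv_text: str, batch_size: int):
    """Yield lists of row dicts, batch_size rows at a time (last batch may be short)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]
```

The labels are random, so any accuracy number from this data is meaningless. The only question it answers is whether the pipeline runs.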
QA fixtures and bug repros
A bug is reproducible “with about 50 rows where 3 of them have empty emails.” Generate 50 rows, hand-edit 3 cells to be empty, save as repro.csv, attach to the ticket. Faster than crafting a fixture by hand.
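If the repro needs regenerating often, the hand-edit can be scripted. This sketch blanks a fixed number of cells in one column. The blank_cells function and its seed parameter are illustrative, and it assumes the file has a header row.

```python
import csv
import io
import random


def blank_cells(csv_text: str, column: str, count: int, seed: int = 0) -> str:
    """Return CSV text with `count` randomly chosen cells in `column` emptied."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    rng = random.Random(seed)  # fixed seed: the same rows are blanked every run
    for row in rng.sample(rows, count):
        row[column] = ""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

For the example above, blank_cells(text, "email", 3) on a 50-row file yields the repro fixture.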
How to generate CSV data in Python (Faker)
For larger datasets, repeatable seeds, locale support, and custom providers, use Faker:
import csv
from faker import Faker

fake = Faker()
Faker.seed(42)  # reproducible output

# encoding="utf-8" keeps non-ASCII names intact regardless of platform defaults
with open("users.csv", "w", newline="", encoding="utf-8") as f:
    w = csv.writer(f)
    w.writerow(["id", "first_name", "last_name", "email", "country", "signup_date"])
    for _ in range(10_000):
        first = fake.first_name()
        last = fake.last_name()
        w.writerow([
            fake.uuid4(),
            first,
            last,
            f"{first}.{last}@{fake.free_email_domain()}".lower(),
            fake.country(),
            fake.date_between(start_date="-3y", end_date="today"),
        ])
Pin the seed to get the same dataset every run — useful for committed fixtures. For locale-specific data (fr_FR, ja_JP, …), pass it to Faker(locale="fr_FR"). For relational data, generate parents first, then sample from the parent IDs:
user_ids = [fake.uuid4() for _ in range(1000)]
# ... write users.csv ...
with open("orders.csv", "w", newline="", encoding="utf-8") as f:
    w = csv.writer(f)
    w.writerow(["order_id", "user_id", "amount", "created_at"])
    for _ in range(5000):
        w.writerow([
            fake.uuid4(),
            fake.random_element(user_ids),  # foreign key
            fake.pyfloat(min_value=5, max_value=500, right_digits=2),
            fake.date_time_between(start_date="-1y"),
        ])
How to generate CSV data in JS (faker.js)
The JS port — @faker-js/faker — works in Node and modern browsers:
import { faker } from "@faker-js/faker";
import { writeFileSync } from "node:fs";
faker.seed(42);
const rows = [["id", "first_name", "last_name", "email", "country", "signup_date"]];
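for (let i = 0; i < 10_000; i++) {
  const first = faker.person.firstName();
  const last = faker.person.lastName();
  rows.push([
    faker.string.uuid(),
    first,
    last,
    `${first}.${last}@${faker.internet.domainName()}`.toLowerCase(),
    faker.location.country(),
    faker.date.past({ years: 3 }).toISOString().slice(0, 10),
  ]);
}
// Quote cells that contain commas, quotes, or line breaks (some country names include commas).
const escapeCell = (v) => (/[",\r\n]/.test(v) ? `"${v.replace(/"/g, '""')}"` : v);
writeFileSync("users.csv", rows.map((r) => r.map(escapeCell).join(",")).join("\n") + "\n");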
For browsers, swap the Node fs write for a Blob + URL.createObjectURL download — same pattern this page’s generator uses.
How to generate CSV data in SQL
If you already have a database open, you can generate fake data directly. PostgreSQL example using generate_series:
COPY (
  SELECT
    gen_random_uuid() AS id,
    -- 1 + floor(...) keeps the subscript in 1..4; ceil() would give 0 (a NULL element) if random() returned exactly 0
    (ARRAY['Ada','Linus','Grace','Alan'])[1 + floor(random() * 4)::int] AS first_name,
    (ARRAY['US','PT','DE','FR'])[1 + floor(random() * 4)::int] AS country,
    (CURRENT_DATE - (random() * 1000)::int) AS signup_date,
    (random() * 1000)::numeric(10,2) AS amount
  FROM generate_series(1, 10000)
) TO '/tmp/users.csv' WITH CSV HEADER;
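Quick, but limited: no realistic names without a lookup table. The browser generator above and Faker handle that out of the box. Note that COPY ... TO writes the file on the database server and needs the matching server-side privileges; from psql, \copy writes it on the client instead.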
Why use realistic test data
Bad test data is a silent source of bias and false confidence. Common anti-patterns:
- Sequential garbage: test1, test2, test3 for names. Sorts alphabetically, has uniform length, and your UI’s wrapping behavior never gets exercised.
- All-the-same: every email is test@test.com. Your unique constraint never fires; your dedupe logic appears to work but doesn’t.
- No edge cases: every name is 5 ASCII letters. The first time a real Müller or O'Brien shows up in production, your CSV exporter or SQL escaper breaks.
- Identical timestamps: every row is created at the same instant, so your time-series chart looks like a single bar.
Realistic test data — varied lengths, mixed scripts (when you ask for them), proper unique values, and spread-out dates — surfaces these bugs before users do. The generator above gives you that variety by default; Faker gives you even more, including locale-specific edge cases (fa_IR, zh_CN, names with diacritics).
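To see why edge-case values matter for a CSV exporter, compare a naive comma join with a proper CSV writer. The names below are made-up examples, and round_trip is a hypothetical helper for this sketch.

```python
import csv
import io

# Made-up values that commonly break hand-rolled exporters.
EDGE_ROWS = [
    ["Müller", "O'Brien"],
    ["Smith, Jr.", 'Anna "Ace" Lee'],
    ["山田太郎", ""],
]


def round_trip(rows: list[list[str]]) -> list[list[str]]:
    """Write rows with csv.writer, read them back, and return the parsed rows."""
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    return list(csv.reader(io.StringIO(buf.getvalue())))


# A naive join does no quoting, so the comma inside "Smith, Jr." splits the cell.
naive_line = ",".join(EDGE_ROWS[1])
naive_cells = naive_line.split(",")
```

With the csv module, round_trip(EDGE_ROWS) returns the rows unchanged, while naive_cells ends up with three cells for a two-cell row.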
Common gotchas
Uniqueness isn’t guaranteed
The generator samples with replacement. With 10,000 rows and only 30 first names, you’ll see lots of duplicates. That is by design (it mimics real data), but it will violate a UNIQUE constraint if the column you load it into requires distinct values. UUIDs are effectively unique (collision probability ~0). For unique emails, post-process the file in a script and append a per-row counter to the local part, so firstname.lastname@domain becomes firstname.lastname+i@domain for row i.
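The counter suffix can be sketched as a small post-processing step. The make_unique function is an illustrative name, and the addresses in the usage note are made up.

```python
def make_unique(emails: list[str]) -> list[str]:
    """Append +i to the local part of the i-th address so every address is distinct."""
    unique = []
    for i, email in enumerate(emails):
        local, _, domain = email.partition("@")
        unique.append(f"{local}+{i}@{domain}")
    return unique
```

For example, three copies of ada.smith@example.com become ada.smith+0@example.com, ada.smith+1@example.com, and ada.smith+2@example.com.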
Referential integrity isn’t generated
Each column is independent. If you need orders.user_id to point at users.id, generate users.csv first, then in a small script build orders.csv by sampling user IDs. The generator can’t do this in-browser without a more elaborate UI — Faker (above) handles it cleanly.
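A standard-library sketch of that small script: read the IDs out of a downloaded users.csv and sample from them while writing the child table. It assumes the parent file has an id column (as in the default columns). The build_orders function and the amount range are illustrative.

```python
import csv
import io
import random
import uuid


def build_orders(users_csv: str, n_orders: int, seed: int = 0) -> str:
    """Return orders CSV text whose user_id values all exist in users_csv."""
    rng = random.Random(seed)
    user_ids = [row["id"] for row in csv.DictReader(io.StringIO(users_csv))]
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["order_id", "user_id", "amount"])
    for _ in range(n_orders):
        writer.writerow([
            str(uuid.UUID(int=rng.getrandbits(128), version=4)),  # seeded v4-style UUID
            rng.choice(user_ids),                                  # foreign key
            f"{rng.uniform(5, 500):.2f}",
        ])
    return out.getvalue()
```

Because every user_id is drawn from the parent file, a foreign-key constraint on orders.user_id holds as long as users.csv is loaded first.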
Locale and character sets
The built-in name pool is Latin-script. If you’re testing CJK rendering, RTL layout, or emoji handling, switch to Faker with locale="ja_JP" / "ar_SA" / "zh_CN", or hand-edit a downloaded CSV.
GDPR and privacy — never use real PII
It’s tempting to dump a slice of your production users table into a CSV for debugging. Don’t. Even when “anonymized” by removing email, real names + dates + cities are often re-identifiable. Use generated synthetic data instead — that’s the entire point of this tool. The output here contains no real person’s information by construction.
Numbers in CSV are still strings
The CSV format has no types — every cell is text. The generator emits 42 and 3.14 without quotes, but when re-imported, your loader must cast them. The CSV to SQL converter on this site auto-detects numeric columns when you import the generated file.
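A minimal sketch of the casting step in Python. cast_cell is a best-effort heuristic written for this example, not the auto-detection the converter uses.

```python
import csv
import io


def cast_cell(text: str):
    """Best-effort cast: try int, then float, otherwise keep the string."""
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            pass
    return text


def load_typed(csv_text: str) -> list[dict]:
    """Parse CSV text (with a header row) and cast each cell individually."""
    return [
        {key: cast_cell(value) for key, value in row.items()}
        for row in csv.DictReader(io.StringIO(csv_text))
    ]
```

Per-cell casting is convenient but lossy for values that only look numeric: a ZIP code such as 00501 or a phone number would lose its leading zeros, so cast by column when the schema is known.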
Privacy: nothing is uploaded
Generation runs entirely in your browser using a small in-house data generator and PapaParse for CSV serialization. UUIDs come from crypto.randomUUID() (also browser-side). No file, column definition, or generated row ever reaches a server — verify in DevTools → Network and you’ll see nothing beyond the initial page load.
That makes the generator safe to use on locked-down corporate machines, on planes, and for fixtures that will live in private repos: the data is synthetic, but the generation process is also private.
After generating, you can view the CSV, convert it to Excel for stakeholders, or pipe it into SQL for seed data — all client-side, all on this site.
Frequently asked questions
- What can I use the generated CSV for?
Anything that needs realistic-looking tabular data: seeding a dev database, load-testing an importer, building dashboard demos, recording screencasts, training datasets, QA fixtures, or filling out a sample report. The data is fake but plausible — names, emails, dates, countries — so screenshots and demos look real.
- How do I customize the columns?
Click 'Add column', give it a name, pick a type (UUID, first name, email, integer with min/max, date range, choice from a list, etc.), and the preview updates instantly. Remove a column with the × button. The generator offers 18 column types out of the box.
- What's the maximum number of rows?
50,000 rows. The generator runs in your browser on a single thread, so very large outputs (40K+) take a few seconds and produce a CSV file in the multi-megabyte range. For million-row datasets, use Faker (Python) or faker.js (Node); the Python and JS sections of this article have example scripts.
- Is the data realistic enough for demos?
Yes. First names, last names, cities, countries, and companies are drawn from a curated pool of real-world-looking values. Emails follow the firstname.lastname@domain pattern. Phone numbers, UUIDs, and dates use proper formats. It's good enough that nobody watching your screencast will notice it's fake.
- Is this GDPR-safe? Can I use it instead of real customer data?
Yes — and you should. The generator only produces synthetic data drawn from fictional name pools and made-up domains (example.com, mail.test, etc.). No real person's PII ever appears. This makes it safe to share datasets with contractors, paste into bug reports, or include in public demos without a DPA.
- Are the values reproducible? Can I get the same data twice?
Not yet — every click of Generate produces a fresh random sample. If you need a reproducible dataset (e.g., for a test fixture), download once and commit the CSV to your repo. For seeded reproducibility in code, switch to Faker with a fixed seed.
- Can I generate referentially consistent data (foreign keys)?
Not directly — each column is independent. For related data (a users.csv and an orders.csv that reference user_ids), generate the parent table first, download it, then write a small script that picks user_ids at random when generating the child table. Faker with a custom provider handles this elegantly.
- Is anything uploaded?
No. Generation runs entirely in your browser using JavaScript and PapaParse (for CSV serialization). No file or input ever reaches a server. Verify in DevTools → Network — you'll see no requests beyond the initial page load.