Sample data for MongoDB

Flat JSON documents you can import straight into MongoDB or any NoSQL store — realistic transactions for testing queries, indexes and aggregations.

Retail POS | Seeded - reproducible | CSV / Excel / JSON / SQL | 100% in-browser

Generate & download

About this dataset

This is a free, reproducible Retail POS dataset that you can generate and download on this page as CSV, Excel, JSON or SQL. It is built for MongoDB / NoSQL testing, aggregation pipelines, and index and query tuning. Because every field is correlated rather than independently random, the numbers hold together when you analyze them.

The catalog is organized into real affinity groups (e.g. chips + salsa + soda) that co-occur within baskets, so an association-rule miner finds product pairs with lift above 1, which is what a market-basket exercise needs.
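To make the lift idea concrete, here is a minimal Python sketch that computes pairwise lift, defined as support(A and B) / (support(A) x support(B)). The baskets and the `pair_lift` helper are hypothetical illustrations, not output from this generator; in practice each basket would be the set of products sharing one transaction_id.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets: each set holds the products of one transaction_id.
baskets = [
    {"chips", "salsa", "soda"},
    {"chips", "salsa"},
    {"chips", "soda"},
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"soda"},
]


def pair_lift(baskets):
    """Return {(a, b): lift} for every product pair that co-occurs at least once."""
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    # Sort each basket so a pair is always keyed in alphabetical order.
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    return {
        (a, b): (count / n) / ((item_counts[a] / n) * (item_counts[b] / n))
        for (a, b), count in pair_counts.items()
    }


lifts = pair_lift(baskets)
for pair, lift in sorted(lifts.items(), key=lambda kv: kv[1], reverse=True):
    print(pair, round(lift, 2))
```

A lift above 1 means the two products appear together more often than independence would predict; a lift near 1 means no real association.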

Columns in this dataset

Schema for the Retail POS export (the anomaly column appears only when labels are switched on):

Column                | Type     | Description
transaction_id        | integer  | The basket; item rows share it.
datetime              | datetime | Timestamp with realistic hour-of-day weighting.
store_id              | text     | Which store rang the sale.
product / department  | text     | Item and its aisle/department.
quantity / unit_price | number   | Units and shelf price.
line_total            | number   | quantity x unit_price.
payment               | text     | Card / Cash / Mobile.
anomaly               | 0/1      | Present with labels on; flags suspicious transactions.
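As a quick sanity check on the schema, the following Python sketch verifies that line_total matches quantity x unit_price and rolls item rows up into basket totals by transaction_id. The sample rows, the rounding tolerance, and the helper names `bad_line_totals` and `basket_totals` are assumptions for illustration; it also assumes product, quantity and unit_price are exported as separate fields.

```python
from collections import defaultdict

# Hypothetical item rows using the column names from the schema.
rows = [
    {"transaction_id": 1001, "store_id": "S01", "product": "chips",
     "quantity": 2, "unit_price": 1.50, "line_total": 3.00, "payment": "Card"},
    {"transaction_id": 1001, "store_id": "S01", "product": "salsa",
     "quantity": 1, "unit_price": 2.25, "line_total": 2.25, "payment": "Card"},
    {"transaction_id": 1002, "store_id": "S02", "product": "milk",
     "quantity": 3, "unit_price": 0.99, "line_total": 2.97, "payment": "Cash"},
]


def bad_line_totals(rows, tolerance=0.005):
    """Return rows where line_total differs from quantity * unit_price."""
    return [
        r for r in rows
        if abs(r["quantity"] * r["unit_price"] - r["line_total"]) > tolerance
    ]


def basket_totals(rows):
    """Sum line_total per transaction_id (one basket spans several item rows)."""
    totals = defaultdict(float)
    for r in rows:
        totals[r["transaction_id"]] += r["line_total"]
    return {tid: round(total, 2) for tid, total in totals.items()}


print(bad_line_totals(rows))
print(basket_totals(rows))
```

With Messy / dirty data switched on, a check like this is one way to see which rows were deliberately corrupted.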

Import into MongoDB

mongoimport --db demo --collection retail_pos \
  --file retail_pos.json --jsonArray
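Once the file is imported, a first aggregation to try is revenue and basket count per store. The sketch below writes that pipeline as a plain Python list, the form pymongo's `collection.aggregate()` accepts; using pymongo and a local server is an assumption, while the `demo` database and `retail_pos` collection names come from the command above. The `revenue_by_store` helper and sample rows are hypothetical and mirror the pipeline in memory so the expected result can be checked without a database.

```python
# Aggregation pipeline: total revenue and distinct baskets per store.
pipeline = [
    {"$group": {
        "_id": "$store_id",
        "revenue": {"$sum": "$line_total"},
        "baskets": {"$addToSet": "$transaction_id"},
    }},
    {"$project": {"revenue": 1, "basket_count": {"$size": "$baskets"}}},
    {"$sort": {"revenue": -1}},
]

# With pymongo installed and a server running, this would be run as:
#   from pymongo import MongoClient
#   results = list(MongoClient()["demo"]["retail_pos"].aggregate(pipeline))


def revenue_by_store(rows):
    """In-memory equivalent of the pipeline, for cross-checking results."""
    revenue, baskets = {}, {}
    for r in rows:
        store = r["store_id"]
        revenue[store] = revenue.get(store, 0.0) + r["line_total"]
        baskets.setdefault(store, set()).add(r["transaction_id"])
    result = [
        # Rounding is only for tidy comparison here; MongoDB does not round.
        {"_id": s, "revenue": round(revenue[s], 2), "basket_count": len(baskets[s])}
        for s in revenue
    ]
    return sorted(result, key=lambda d: d["revenue"], reverse=True)


# Hypothetical rows for the cross-check.
sample_rows = [
    {"transaction_id": 1001, "store_id": "S01", "line_total": 3.00},
    {"transaction_id": 1001, "store_id": "S01", "line_total": 2.25},
    {"transaction_id": 1002, "store_id": "S02", "line_total": 2.97},
    {"transaction_id": 1003, "store_id": "S01", "line_total": 4.00},
]
print(revenue_by_store(sample_rows))
```

Because item rows share a transaction_id, counting baskets requires collecting distinct ids per store rather than counting documents.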

Good for

MongoDB / NoSQL testing | Aggregation pipelines | Index & query tuning | Seed data

FAQ

How big is this dataset?

Around 6,000 rows by default. Change the row count in the generator above and re-export — anything up to ~200k works in the browser.

What formats can I download?

CSV, Excel (.xlsx), JSON, and SQL (a CREATE TABLE plus INSERT statements). Pick whatever fits your workflow.

Will I get the same file every time?

Yes. This page uses the fixed seed mongo-demo, so the download is byte-identical on every machine. Clear the seed in the generator for fresh random data.

Can I get separate tables, messy data, or other formats?

Yes. Use Tables → Excel/SQL for a normalized multi-table export, switch on Messy / dirty data in Advanced options for nulls, typos and inconsistent dates, and choose CSV, Excel, JSON or SQL on any download.

Is the data real?

No — it is 100% synthetic, generated in your browser, with no real people or companies. Free to use commercially.