Retail POS Basket Data Generator — market-basket transactions (CSV & Excel)

Example output — a peek before you generate

datetime	store_id	product	department	quantity	unit_price	line_total
2026-06-30 12:14	S1	Tortilla Chips	Snacks	2	$3.49	$6.98
2026-06-30 12:14	S1	Salsa	Snacks	1	$3.99	$3.99
2026-06-30 12:14	S1	Soda 12pk	Snacks	1	$6.99	$6.99
2026-06-30 17:02	S2	Spaghetti	Pasta Night	2	$1.99	$3.98

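Notice the chips + salsa + soda basket: that co-occurrence comes from an affinity group built into the catalog, not chance. This peek leaves out the transaction_id and payment columns described below. A generated file can hold up to 200,000 rows; set your row count below and click Generate for your own.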

Generate the dataset

Number of rows (basket lines, approx.)

Seed (same seed → identical data; blank = random)

Inject anomalies / labels (flags suspicious transactions in an anomaly column)

Save / load scenario (stored only in this browser)

Preview

First rows

First 25 rows shown; downloads contain the full dataset. Group by transaction_id to recover baskets.

What's in this dataset

Each row is one item within a basket. Group by transaction_id to reconstruct each shopper's basket for association-rule mining.

Column	Type	Description
transaction_id	integer	The basket; multiple item rows share it.
datetime	datetime	Timestamp with realistic hour-of-day weighting.
store_id	text	Which store rang the sale (scales with row count).
product / department	text	The item and its aisle/department.
quantity / unit_price	number	Units and shelf price.
line_total	number	quantity × unit_price.
payment	text	Card / Cash / Mobile.
anomaly	0/1	Present only with injection on; flags suspicious transactions (e.g. odd-hour high-value bulk).
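
As a rough sketch of the "group by transaction_id" step, the Python below rebuilds baskets from a CSV export and checks that line_total equals quantity × unit_price. The sample rows reuse the peek at the top of the page, but the transaction_id and payment values are invented for illustration, and the helper names (to_number, load_baskets) are hypothetical. It assumes prices may carry a leading $ as in the peek; if your download stores plain numbers, the parsing still works.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows in the documented column layout. The transaction_id and
# payment values are invented for illustration only.
SAMPLE_CSV = """transaction_id,datetime,store_id,product,department,quantity,unit_price,line_total,payment
1001,2026-06-30 12:14,S1,Tortilla Chips,Snacks,2,$3.49,$6.98,Card
1001,2026-06-30 12:14,S1,Salsa,Snacks,1,$3.99,$3.99,Card
1001,2026-06-30 12:14,S1,Soda 12pk,Snacks,1,$6.99,$6.99,Card
1002,2026-06-30 17:02,S2,Spaghetti,Pasta Night,2,$1.99,$3.98,Cash
"""


def to_number(text):
    """Parse a price that may or may not carry a currency symbol."""
    return float(text.replace("$", "").replace(",", ""))


def load_baskets(handle):
    """Group item rows by transaction_id.

    Returns (baskets, mismatches): baskets maps each transaction_id to its
    list of products; mismatches holds rows whose line_total differs from
    quantity * unit_price by more than half a cent.
    """
    baskets = defaultdict(list)
    mismatches = []
    for row in csv.DictReader(handle):
        baskets[row["transaction_id"]].append(row["product"])
        expected = round(float(row["quantity"]) * to_number(row["unit_price"]), 2)
        if abs(expected - to_number(row["line_total"])) > 0.005:
            mismatches.append(row)
    return dict(baskets), mismatches


baskets, mismatches = load_baskets(io.StringIO(SAMPLE_CSV))
```

For a downloaded file, pass an open file handle instead of the in-memory sample, for example `open("baskets.csv", newline="")`, where the filename is a placeholder.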

Why it's realistic

The catalog is organized into affinity groups: sets of items that are typically bought together, like {tortilla chips, salsa, guacamole, soda} or {diapers, wipes, baby food}. Each basket draws one or two of these groups and includes their members with high probability, with the occasional impulse buy mixed in. As a result, an association-rule miner (Apriori/FP-Growth) will surface lift between linked products, which is the whole point of a market-basket exercise, instead of finding nothing because the items were independent. Layer on weighted shopping hours, weekend/holiday traffic lifts, and multiple stores, and you get transaction data that behaves like a real grocery POS feed.
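
To make the idea concrete, here is a minimal Python sketch of affinity-group sampling as described above. It is not the generator's actual code: the two groups are taken from the examples in the text, while the impulse items, the probabilities (p_member, p_impulse), and the function names are assumptions chosen for illustration. It also shows the seed behavior in miniature: the same seed yields the same baskets.

```python
import random

# Hypothetical catalog: the two groups come from the examples in the text;
# the impulse items and all probabilities are invented for this sketch.
AFFINITY_GROUPS = [
    ["Tortilla Chips", "Salsa", "Guacamole", "Soda"],
    ["Diapers", "Wipes", "Baby Food"],
]
IMPULSE_ITEMS = ["Gum", "Batteries"]


def make_basket(rng, p_member=0.8, p_impulse=0.1):
    """Draw one or two affinity groups, keep each member with high
    probability, and occasionally add an impulse item."""
    groups = rng.sample(AFFINITY_GROUPS, k=rng.choice([1, 2]))
    basket = [item for group in groups for item in group
              if rng.random() < p_member]
    if not basket:
        # Avoid empty baskets: fall back to one item from the first group.
        basket.append(rng.choice(groups[0]))
    if rng.random() < p_impulse:
        basket.append(rng.choice(IMPULSE_ITEMS))
    return basket


def make_baskets(n, seed=None):
    """Generate n baskets; the same seed reproduces the same baskets,
    and seed=None gives a fresh random stream."""
    rng = random.Random(seed)
    return [make_basket(rng) for _ in range(n)]
```

Because members of a group are kept together far more often than independent sampling would produce, pairs inside a group end up with lift above 1, while pairs across groups stay near or below 1.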

Good for

Market-basket analysis
Association rules (Apriori / FP-Growth)
Recommendation demos
Store / hourly sales dashboards
Fraud / anomaly detection
SQL window-function practice

FAQ

How do I run market-basket analysis on this?

Group rows by transaction_id to form item lists, then feed them to Apriori or FP-Growth. You should see strong lift within the affinity groups baked into the catalog.
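
If you want to sanity-check lift before reaching for a library, the sketch below computes it for item pairs by plain counting, using lift(A, B) = support(A and B) / (support(A) × support(B)). This is simple pair enumeration, not Apriori or FP-Growth, which also prune candidates and handle larger itemsets. The function name pair_lift, the min_support parameter, and the toy baskets are made up for illustration.

```python
from collections import Counter
from itertools import combinations


def pair_lift(baskets, min_support=0.0):
    """Return {(a, b): lift} for item pairs with a < b, where
    lift = support(a and b) / (support(a) * support(b))."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # ignore duplicates within a basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))
    lifts = {}
    for (a, b), count in pair_counts.items():
        support = count / n
        if support >= min_support:
            lifts[(a, b)] = support / ((item_counts[a] / n) * (item_counts[b] / n))
    return lifts


# Toy baskets, invented for this example.
toy = [
    ["Tortilla Chips", "Salsa"],
    ["Tortilla Chips", "Salsa", "Soda"],
    ["Diapers", "Wipes"],
    ["Diapers", "Soda"],
]
lifts = pair_lift(toy)
```

In the toy data, chips and salsa always appear together, so their lift is 2.0, while diapers and soda co-occur only as often as independence predicts, giving a lift of 1.0. Lift above 1 is the signal to look for within the affinity groups.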

How many stores are there?

It scales with size — from one store on small files up to eight on large ones — so you can compare store performance.

Is anything uploaded?

No — generation is 100% in your browser.