E-commerce Order Data Generator

Realistic online-store order lines with RFM-style customer segments (Champion → At-Risk), weekly and holiday seasonality, promotions, and lifelike return rates. A ready-made sample orders file for dashboards, RFM analysis, and cohort practice.

Seeded · CSV + Excel · Fraud labels · 100% in-browser

What's in this dataset

Each row is one order line (one product within an order). Orders span the last 12 months.

Column | Type | Description
order_date | date | Day the order was placed.
order_id | integer | Order identifier; lines in one order share it.
customer_id / customer | int / text | The shopper.
segment | text | RFM-style: Champion, Loyal, Regular, New, At-Risk.
channel | text | Acquisition channel (Organic, Paid Search, Email, …).
product / category | text | SKU and its department (Apparel, Electronics, …).
quantity / unit_price | number | Units and list price.
discount_pct | number | Promo or loyalty discount applied.
line_total | number | quantity × price × (1 − discount).
returned | 0/1 | Whether the line was returned (higher for At-Risk).
anomaly | 0/1 | Present only with injection on; flags fraud-like orders.
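
The line_total formula in the table can be used as a quick sanity check after loading a file. The sketch below is one way to do that with the Python standard library. It assumes rows are dicts of strings, as csv.DictReader would produce, and that discount_pct is stored as a percentage (10 means 10%). The table does not state the scale, so the hypothetical pct_scale parameter lets you switch to fractions. The helper names and sample rows are illustrative and are not generator output.

```python
def expected_line_total(quantity, unit_price, discount_pct, pct_scale=100.0):
    """quantity × price × (1 − discount); discount_pct is divided by pct_scale."""
    return quantity * unit_price * (1.0 - discount_pct / pct_scale)


def mismatched_lines(rows, pct_scale=100.0, tolerance=0.01):
    """Return the rows whose stored line_total differs from the recomputed value."""
    bad = []
    for row in rows:
        expected = expected_line_total(
            float(row["quantity"]),
            float(row["unit_price"]),
            float(row["discount_pct"]),
            pct_scale,
        )
        if abs(expected - float(row["line_total"])) > tolerance:
            bad.append(row)
    return bad


# Made-up rows for illustration; the second one is deliberately inconsistent.
sample = [
    {"quantity": "2", "unit_price": "25.00", "discount_pct": "10", "line_total": "45.00"},
    {"quantity": "1", "unit_price": "80.00", "discount_pct": "0", "line_total": "75.00"},
]
bad_rows = mismatched_lines(sample)
print(len(bad_rows), "line(s) do not match the formula")
```

If every row is flagged, the likeliest cause is the scale assumption: try pct_scale=1.0 for discounts stored as fractions.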

Why it's realistic

Customers are assigned RFM-style segments that set how often they buy and how much: Champions and Loyal shoppers order frequently and respond to loyalty discounts, while At-Risk customers buy rarely and return more. Demand flows through a seasonality curve — a weekend lift, a strong November/December holiday peak, and a January lull — so daily revenue has the shape analysts expect to see. Promotions apply discounts in bursts, and returns track segment behavior. The outcome is an orders table where RFM analysis, cohort retention, channel attribution, and seasonality decomposition all return meaningful, defensible results instead of noise.
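
To see how the segments show up in analysis, here is a minimal sketch of computing raw recency, frequency, and monetary values per customer from the order lines. It is not the generator's own logic. It assumes order_date is an ISO string (YYYY-MM-DD), which the page does not confirm, and the function name rfm_table and the sample rows are hypothetical. Frequency counts distinct order_id values rather than rows, because each row is one order line.

```python
from collections import defaultdict
from datetime import date


def rfm_table(rows, as_of):
    """Return {customer_id: (recency_days, frequency, monetary)}."""
    last_order = {}
    order_ids = defaultdict(set)
    spend = defaultdict(float)
    for row in rows:
        cid = row["customer_id"]
        day = date.fromisoformat(row["order_date"])  # assumes YYYY-MM-DD
        if cid not in last_order or day > last_order[cid]:
            last_order[cid] = day
        order_ids[cid].add(row["order_id"])      # lines in one order share an id
        spend[cid] += float(row["line_total"])
    return {
        cid: (
            (as_of - last_order[cid]).days,      # recency: days since last order
            len(order_ids[cid]),                 # frequency: distinct orders
            round(spend[cid], 2),                # monetary: total spend
        )
        for cid in last_order
    }


# Made-up rows for illustration only.
sample = [
    {"customer_id": "1", "order_id": "100", "order_date": "2024-11-29", "line_total": "40.00"},
    {"customer_id": "1", "order_id": "100", "order_date": "2024-11-29", "line_total": "15.50"},
    {"customer_id": "1", "order_id": "101", "order_date": "2024-12-20", "line_total": "60.00"},
    {"customer_id": "2", "order_id": "102", "order_date": "2024-03-02", "line_total": "22.00"},
]
rfm = rfm_table(sample, as_of=date(2024, 12, 31))
for cid, values in sorted(rfm.items()):
    print(cid, values)
```

Comparing these raw values against the segment column is a reasonable first check that Champions cluster at low recency and high frequency while At-Risk customers sit at the opposite end.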

Good for

RFM segmentation
Seasonality / time-series demos
Cohort & retention analysis
Power BI / Tableau / Looker dashboards
Fraud / anomaly detection
SQL & pandas practice

FAQ

How many customers are there?

About one customer per 8 rows, so a 5,000-row file has roughly 600 shoppers (5,000 ÷ 8 ≈ 625), with repeat orders distributed by segment.
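
The ratio is easy to confirm on a generated file. The sketch below counts distinct customers, the average rows per customer, and customers per segment. It assumes each customer_id maps to a single segment, and the function name customer_summary and the sample rows are hypothetical.

```python
from collections import Counter


def customer_summary(rows):
    """Return (distinct customers, average rows per customer, customers per segment)."""
    segment_of = {}
    for row in rows:
        segment_of[row["customer_id"]] = row["segment"]  # one segment per customer assumed
    n_customers = len(segment_of)
    if n_customers == 0:
        raise ValueError("no rows to summarize")
    return n_customers, len(rows) / n_customers, Counter(segment_of.values())


# Made-up rows for illustration only.
sample = [
    {"customer_id": "1", "segment": "Champion"},
    {"customer_id": "1", "segment": "Champion"},
    {"customer_id": "1", "segment": "Champion"},
    {"customer_id": "2", "segment": "At-Risk"},
]
n, avg_rows, by_segment = customer_summary(sample)
print(n, avg_rows, dict(by_segment))
```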

Does the data show seasonality?

Yes — generate ~12 months and the daily-revenue chart shows weekend lifts and a clear Q4 holiday peak. Use a fixed seed to reproduce the same curve.
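
If you would rather check the pattern numerically than by eye, the sketch below sums line_total per day and compares average weekend revenue with average weekday revenue. It assumes ISO dates (YYYY-MM-DD) in order_date, and the function names and sample rows are hypothetical. Only days that have at least one order enter the averages.

```python
from collections import defaultdict
from datetime import date


def revenue_by_day(rows):
    """Sum line_total per calendar day."""
    daily = defaultdict(float)
    for row in rows:
        daily[date.fromisoformat(row["order_date"])] += float(row["line_total"])
    return dict(daily)


def weekend_lift(daily):
    """Average weekend daily revenue divided by average weekday daily revenue."""
    weekend = [v for d, v in daily.items() if d.weekday() >= 5]  # Sat=5, Sun=6
    weekday = [v for d, v in daily.items() if d.weekday() < 5]
    if not weekend or not weekday:
        raise ValueError("need at least one weekend day and one weekday")
    return (sum(weekend) / len(weekend)) / (sum(weekday) / len(weekday))


# Made-up rows: 2024-11-02 is a Saturday, 2024-11-03 a Sunday.
sample = [
    {"order_date": "2024-11-02", "line_total": "120.00"},
    {"order_date": "2024-11-03", "line_total": "80.00"},
    {"order_date": "2024-11-03", "line_total": "40.00"},
    {"order_date": "2024-11-04", "line_total": "60.00"},
    {"order_date": "2024-11-05", "line_total": "100.00"},
]
daily = revenue_by_day(sample)
lift = weekend_lift(daily)
print(f"weekend lift: {lift:.2f}x")
```

A value above 1 indicates a weekend lift. Grouping the same daily totals by month instead of weekday is a similar way to look for the Q4 peak.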

Is anything uploaded?

No — generation is 100% in your browser.