Sample data for a portfolio project

A realistic, multi-dimensional dataset with genuine trends to anchor a standout portfolio project: clean it, explain it, and visualize the findings.

E-commerce | Seeded - reproducible | CSV / Excel / JSON / SQL | 100% in-browser

Generate & download

About this dataset

This is a free, reproducible E-commerce dataset you can generate and download right here as CSV, Excel, JSON or SQL. It is built for portfolio projects, end-to-end analysis and visualization — and because every field is correlated rather than random, the numbers actually hold together when you analyze them.

Shoppers carry RFM-style segments that set purchase frequency and behavior, and demand flows through a seasonality curve with weekend lifts and a Q4 holiday peak — so RFM, cohort, attribution, and seasonality analysis all return meaningful results.
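As an illustration of the RFM analysis mentioned above, here is a minimal sketch that derives recency, frequency and monetary value per customer. It assumes the column names from the schema (order_date, order_id, customer_id, line_total) and that order_date has already been parsed as a datetime. The function name rfm_table and the as_of cut-off date are hypothetical choices for this example, and the rows are hand-made stand-ins rather than generator output.

```python
import pandas as pd

def rfm_table(orders: pd.DataFrame, as_of: pd.Timestamp) -> pd.DataFrame:
    """Recency (days since last order), frequency (distinct orders) and
    monetary value (sum of line_total) per customer_id."""
    grouped = orders.groupby("customer_id").agg(
        last_order=("order_date", "max"),
        frequency=("order_id", "nunique"),  # order lines share an order_id
        monetary=("line_total", "sum"),
    )
    grouped["recency_days"] = (as_of - grouped["last_order"]).dt.days
    return grouped[["recency_days", "frequency", "monetary"]]

# Hand-made rows standing in for the export (not real generator output).
sample = pd.DataFrame({
    "order_date": pd.to_datetime(
        ["2024-01-05", "2024-01-05", "2024-03-01", "2024-02-10"]
    ),
    "order_id": [1, 1, 2, 3],
    "customer_id": [10, 10, 10, 20],
    "line_total": [20.0, 5.0, 30.0, 12.5],
})
rfm = rfm_table(sample, as_of=pd.Timestamp("2024-03-31"))
print(rfm)
```

Comparing these computed values against the segment column is one way to see how the exported labels line up with actual purchase behavior.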

Columns in this dataset

Schema for the E-commerce export (the anomaly column appears only when labels are switched on):

Column                  Type        Description
order_date              date        Day the order was placed.
order_id                integer     Order id; lines in an order share it.
customer_id / customer  int / text  The shopper.
segment                 text        RFM-style: Champion, Loyal, Regular, New, At-Risk.
channel                 text        Acquisition channel.
product / category      text        SKU and department.
quantity / unit_price   number      Units and list price.
discount_pct            number      Promo or loyalty discount.
line_total              number      quantity x price x (1 - discount).
returned                0/1         Whether the line was returned.
anomaly                 0/1         Present with labels on; flags fraud-like orders.
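The line_total formula in the schema can be used as a quick sanity check on an export. The sketch below recomputes it and returns the rows that disagree. The schema does not say whether discount_pct is stored as a percentage (for example 25) or a fraction (for example 0.25), so the discount_is_percent flag is an assumption to adjust after inspecting the file. The function name, the flag and the tolerance are all hypothetical, and the sample rows are hand-made.

```python
import pandas as pd

def line_total_mismatches(df: pd.DataFrame,
                          discount_is_percent: bool = True,
                          tol: float = 0.01) -> pd.DataFrame:
    """Return rows where line_total differs from
    quantity x unit_price x (1 - discount) by more than tol."""
    # Assumption: discount_pct is on a 0-100 scale unless told otherwise.
    discount = df["discount_pct"] / 100 if discount_is_percent else df["discount_pct"]
    expected = df["quantity"] * df["unit_price"] * (1 - discount)
    return df[(df["line_total"] - expected).abs() > tol]

# Hand-made rows; the last one deliberately ignores its discount.
sample = pd.DataFrame({
    "quantity": [2, 1, 3],
    "unit_price": [10.0, 40.0, 5.0],
    "discount_pct": [0, 25, 10],
    "line_total": [20.0, 30.0, 15.0],
})
bad = line_total_mismatches(sample)
print(bad)
```

On a clean export this should come back empty. With messy data switched on, it is one way to surface rows worth a closer look.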

Load it with pandas

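import pandas as pd

# Parse order_date as a datetime so time-based grouping works directly.
df = pd.read_csv("ecommerce_orders.csv", parse_dates=["order_date"])
df.head()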
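Once the file is loaded, the weekend lift and monthly pattern described earlier can be measured directly. The following is a rough sketch, assuming order_date holds datetime values (for instance after pd.to_datetime) and line_total holds revenue per line. The helper names weekend_lift and monthly_revenue are made up for this example, and the rows are hand-made rather than taken from the generator.

```python
import pandas as pd

def weekend_lift(df: pd.DataFrame) -> float:
    """Ratio of mean daily revenue on Sat/Sun to mean daily revenue on Mon-Fri."""
    daily = df.groupby("order_date")["line_total"].sum()
    is_weekend = daily.index.dayofweek >= 5  # Monday=0 ... Sunday=6
    return daily[is_weekend].mean() / daily[~is_weekend].mean()

def monthly_revenue(df: pd.DataFrame) -> pd.Series:
    """Total line_total per calendar month."""
    return df.groupby(df["order_date"].dt.to_period("M"))["line_total"].sum()

# Hand-made rows: Fri 1 Nov, Sat 2 Nov (two lines), Sun 3 Nov, Mon 4 Nov 2024.
sample = pd.DataFrame({
    "order_date": pd.to_datetime(
        ["2024-11-01", "2024-11-02", "2024-11-02", "2024-11-03", "2024-11-04"]
    ),
    "line_total": [100.0, 100.0, 50.0, 150.0, 100.0],
})
lift = weekend_lift(sample)
monthly = monthly_revenue(sample)
print(lift)
print(monthly)
```

A lift above 1 means weekends outsell weekdays on average. Plotting monthly_revenue(df) on the full export is a simple way to look for the Q4 peak.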

Good for

Portfolio projects | End-to-end analysis | Visualization | Storytelling with data

FAQ

How big is this dataset?

Around 10,000 rows by default. Change the row count in the generator above and re-export; anything up to ~200k rows works in the browser.

What formats can I download?

CSV, Excel (.xlsx), JSON, and SQL (a CREATE TABLE plus INSERT statements). Pick whatever fits your workflow.

Will I get the same file every time?

Yes. This page uses the fixed seed portfolio-demo, so the download is byte-identical on every machine. Clear the seed in the generator for fresh random data.
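One way to check reproducibility yourself is to download the file twice (or on two machines) and compare checksums. This sketch uses only the Python standard library. The helper names and any file names you pass in are hypothetical.

```python
import hashlib
from pathlib import Path

def sha256_of(path) -> str:
    """Hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def same_download(path_a, path_b) -> bool:
    """True when both files have identical bytes (equal SHA-256 digests)."""
    return sha256_of(path_a) == sha256_of(path_b)

# Example call with hypothetical file names:
# same_download("ecommerce_orders.csv", "ecommerce_orders_copy.csv")
```

If the digests differ, check that both downloads used the same seed, row count and options before concluding anything else.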

Can I get separate tables, messy data, or other formats?

Yes. Use Tables → Excel/SQL for a normalized multi-table export, switch on Messy / dirty data in Advanced options for nulls, typos and inconsistent dates, and choose CSV, Excel, JSON or SQL on any download.
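For the messy export, a first cleaning pass might look like the sketch below. The page does not specify what the generator's nulls, typos and date variants actually look like, so the sample values are invented. The helper names are hypothetical, and the fuzzy-matching cutoff of 0.8 is an arbitrary starting point. Dates are parsed one value at a time so that mixed formats do not interfere with each other. Ambiguous strings such as 03/02/2024 are read month-first by pandas by default, which is worth verifying against your file.

```python
import difflib
import pandas as pd

SEGMENTS = ["Champion", "Loyal", "Regular", "New", "At-Risk"]  # labels listed in the schema

def fix_segment(value, cutoff=0.8):
    """Map a possibly misspelled segment onto the closest known label, else None."""
    if pd.isna(value):
        return None
    lowered = {s.lower(): s for s in SEGMENTS}
    match = difflib.get_close_matches(
        str(value).strip().lower(), list(lowered), n=1, cutoff=cutoff
    )
    return lowered[match[0]] if match else None

def parse_date(value):
    """Parse a single date value; missing or unparseable input becomes NaT."""
    if pd.isna(value):
        return pd.NaT
    return pd.to_datetime(value, errors="coerce")

# Invented messy values, not real generator output.
messy = pd.DataFrame({
    "order_date": ["2024-03-01", "03/02/2024", None, "not a date"],
    "segment": ["Champoin", " loyal", "At-Risk", None],
})
clean = messy.assign(
    order_date=messy["order_date"].map(parse_date),
    segment=messy["segment"].map(fix_segment),
)
print(clean)
```

Counting the missing values left after this pass (clean.isna().sum()) gives a simple before-and-after figure for a portfolio write-up.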

Is the data real?

No — it is 100% synthetic, generated in your browser, with no real people or companies. Free to use commercially.