Sample data with missing values

A realistic SaaS dataset with missing values scattered through it (plus typos and duplicates) — ideal for practicing imputation, validation and missing-data strategies.

SaaS / MRR · Seeded - reproducible · CSV / Excel / JSON / SQL · 100% in-browser

Generate & download

About this dataset

This is a free, reproducible SaaS / MRR dataset you can generate and download right here as CSV, Excel, JSON or SQL. It is built for missing-value imputation, data validation and cleaning pipelines — and because every field is correlated rather than random, the numbers actually hold together when you analyze them.

Accounts sign up across the last two years on weighted plan tiers, then face a plan-dependent churn hazard with chances of expansion and contraction each month — reproducing the real shape of a SaaS book (leaky low end, sticky enterprise) so retention math has signal.
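One way to see that shape is to sum mrr_delta per month and movement type. The following is a minimal sketch, not the generator's own logic: it assumes the month, movement and mrr_delta columns from the schema on this page, and the sample rows and the helper name mrr_movement_by_month are made up for illustration.

```python
import pandas as pd


def mrr_movement_by_month(df: pd.DataFrame) -> pd.DataFrame:
    """Sum mrr_delta per month and movement type, plus a net column."""
    # groupby drops rows whose month or movement is missing, and sum() skips
    # missing mrr_delta values, so gaps in the data shrink the totals silently.
    summary = (
        df.groupby(["month", "movement"])["mrr_delta"]
        .sum()
        .unstack(fill_value=0)
    )
    summary["net"] = summary.sum(axis=1)
    return summary


# Hypothetical rows, only to show the output shape.
sample = pd.DataFrame(
    {
        "month": ["2024-01-01", "2024-01-01", "2024-02-01", "2024-02-01"],
        "movement": ["new", "new", "expansion", "churn"],
        "mrr_delta": [100.0, 200.0, 50.0, -100.0],
    }
)
summary = mrr_movement_by_month(sample)
print(summary)
```

On the real export, pass the loaded DataFrame instead of sample. Because missing values are skipped rather than flagged, it is worth checking how many rows lack month, movement or mrr_delta before trusting the net figures.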

Columns in this dataset

Schema for the SaaS / MRR export (the anomaly column appears only when labels are switched on):

Column | Type | Description
month | date | First of the month the movement occurred.
account_id / account | int / text | The subscribing company.
movement | text | new, expansion, contraction, or churn.
plan | text | Starter / Pro / Business / Enterprise.
seats | integer | Active seats after the movement (0 on churn).
mrr | number | Account MRR after the movement.
mrr_delta | number | Change in MRR (negative for contraction/churn).
region / industry | text | Firmographic dimensions.
anomaly | 0/1 | Present with labels on; flags suspicious churn.
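The rules stated in the schema can be turned into simple validation checks. This is a minimal sketch under the assumption that the export uses exactly the column names and movement values listed above; the function name find_schema_violations and the sample rows are hypothetical.

```python
import pandas as pd

ALLOWED_MOVEMENTS = {"new", "expansion", "contraction", "churn"}


def find_schema_violations(df: pd.DataFrame) -> pd.DataFrame:
    """Return one boolean column per rule; True marks a violating row."""
    checks = pd.DataFrame(index=df.index)

    # movement must be one of the four documented values.
    # A missing or misspelled movement counts as a violation.
    checks["bad_movement"] = ~df["movement"].isin(ALLOWED_MOVEMENTS)

    # seats should be 0 on churn. A missing seats value on a churn row
    # is also flagged, because NaN != 0 evaluates to True.
    is_churn = df["movement"].eq("churn")
    checks["churn_with_seats"] = is_churn & df["seats"].ne(0)

    # mrr_delta should be negative for contraction and churn.
    # A missing mrr_delta is not flagged by this rule.
    shrinking = df["movement"].isin(["contraction", "churn"])
    checks["delta_sign"] = shrinking & df["mrr_delta"].ge(0)

    return checks


# Hypothetical rows: one clean, three with a single problem each.
sample = pd.DataFrame(
    {
        "movement": ["new", "churn", "contraction", "chrun"],
        "seats": [10, 3, 5, 0],
        "mrr_delta": [500.0, -300.0, 50.0, -200.0],
    }
)
violations = find_schema_violations(sample)
print(violations.sum())
```

Keeping each rule in its own column makes it easy to count violations per rule or to filter the offending rows with df[violations.any(axis=1)].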

Load it with pandas

import pandas as pd

# Load the exported CSV and parse the month column as dates
df = pd.read_csv("saas_mrr.csv", parse_dates=["month"])
df.head()
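Once the file is loaded, a first pass at the missing values might look like the sketch below. It is one simple baseline rather than a recommended strategy: it assumes seats and region are among the columns with gaps (which columns actually contain nulls depends on the export), and the helper name impute_simple, the sample rows and the region labels are made up for illustration.

```python
import numpy as np
import pandas as pd


def impute_simple(df: pd.DataFrame) -> pd.DataFrame:
    """Baseline imputation: per-plan median for seats, a label for region."""
    out = df.copy()
    # Fill missing seats with the median seats of the same plan.
    # Rows whose plan is itself missing stay unfilled.
    plan_median = out.groupby("plan")["seats"].transform("median")
    out["seats"] = out["seats"].fillna(plan_median)
    # Keep missing regions visible as their own category.
    out["region"] = out["region"].fillna("Unknown")
    return out


# Hypothetical rows, only to show the behaviour.
sample = pd.DataFrame(
    {
        "plan": ["Starter", "Starter", "Starter", "Pro", "Pro"],
        "seats": [2, np.nan, 4, 20, np.nan],
        "region": ["EMEA", None, "AMER", "EMEA", "APAC"],
    }
)

# Share of missing values per column, a useful first look.
missing_share = sample.isna().mean()
filled = impute_simple(sample)
print(missing_share)
print(filled)
```

Grouping by plan is a choice that leans on the correlation between plan tier and seat count; a global median would be simpler but would blur the difference between tiers.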

Good for

Missing-value imputation · Data validation · Cleaning pipelines · Teaching

FAQ

How big is this dataset?

Around 8,000 rows by default. Change the row count in the generator above and re-export — anything up to ~200k works in the browser.

What formats can I download?

CSV, Excel (.xlsx), JSON, and SQL (a CREATE TABLE plus INSERT statements). Pick whatever fits your workflow.

Will I get the same file every time?

Yes. This page uses the fixed seed missing-demo, so the download is byte-identical on every machine. Clear the seed in the generator for fresh random data.
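To check the byte-identical claim yourself, you could compare checksums of two downloads. This is a minimal sketch using only the Python standard library; the helper names are hypothetical, and so are any file names you pass in.

```python
import hashlib
from pathlib import Path


def sha256_of(path) -> str:
    """Return the SHA-256 hex digest of a file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def same_bytes(path_a, path_b) -> bool:
    """True when both files have identical contents."""
    return sha256_of(path_a) == sha256_of(path_b)
```

For example, same_bytes("saas_mrr.csv", "saas_mrr_copy.csv") would compare a download from one machine with a download from another, assuming those are the names you saved them under.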

Can I get separate tables, messy data, or other formats?

Yes. Use Tables → Excel/SQL for a normalized multi-table export, switch on Messy / dirty data in Advanced options for nulls, typos and inconsistent dates, and choose CSV, Excel, JSON or SQL on any download.

Is the data real?

No — it is 100% synthetic, generated in your browser, with no real people or companies. Free to use commercially.