Find which products are bought together using Apriori and association rules. Uses a retail POS dataset built with real basket affinities, so you'll actually see meaningful lift. About 30 minutes.
Generic random transactions are useless here: if items are independent, every rule has lift close to 1 and nothing stands out. This dataset is built from affinity groups (chips + salsa + soda, diapers + wipes + baby food…), so association-rule mining returns rules with real structure.
Download the market-basket dataset (CSV), or customize it in the generator.
Each row is one item in a basket; group by transaction_id to reconstruct baskets.
pip install pandas mlxtend
import pandas as pd
df = pd.read_csv("retail_pos.csv")
df.head()
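Regrouping the long format into baskets can be sketched as follows. This is a minimal illustration, assuming the column names `transaction_id` and `product` used in this tutorial; the `toy` table is a hypothetical stand-in for the real CSV, not data from it.

```python
import pandas as pd

# Hypothetical stand-in for retail_pos.csv: one row per item per basket.
toy = pd.DataFrame({
    "transaction_id": [1, 1, 1, 2, 2, 3],
    "product": ["chips", "salsa", "soda", "diapers", "wipes", "chips"],
    "quantity": [1, 1, 2, 1, 1, 1],
})

# Collect each transaction's products into a list to rebuild the baskets.
baskets = toy.groupby("transaction_id")["product"].apply(list)
print(baskets)
```

The list form is handy for eyeballing individual baskets; the mining functions work on a one-column-per-product table instead.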
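Turn the long table into one row per transaction and one column per product, with True/False for presence (mlxtend's apriori expects boolean columns and warns on 0/1 integers):
basket = (df.groupby(["transaction_id", "product"])["quantity"]
            .sum().unstack().fillna(0))
basket = basket > 0  # True where the product appears in the transaction
basket.shape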
from mlxtend.frequent_patterns import apriori, association_rules
# Keep itemsets that appear in at least 2% of transactions.
itemsets = apriori(basket, min_support=0.02, use_colnames=True)
itemsets.sort_values("support", ascending=False).head(10)
# Derive rules with lift >= 1.0 and list the strongest first.
rules = association_rules(itemsets, metric="lift", min_threshold=1.0)
rules = rules.sort_values("lift", ascending=False)
rules[["antecedents", "consequents", "support", "confidence", "lift"]].head(10)
Support is the fraction of transactions that contain the whole combination. Confidence is P(consequent | antecedent). Lift is the confidence divided by the consequent's overall support; lift > 1 means the items co-occur more often than they would if they were independent. You should see the seeded affinity groups rise to the top (e.g. salsa → tortilla chips with high lift). Use these rules for cross-sell, store layout, or recommendation demos.
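To make the three metrics concrete, here is a small hand calculation in plain Python. The five baskets are made up for illustration and are not taken from the dataset; the helper `support` is a hypothetical name, not part of mlxtend.

```python
# Five toy baskets (hypothetical data, not from the tutorial dataset).
baskets = [
    {"salsa", "tortilla chips", "soda"},
    {"salsa", "tortilla chips"},
    {"tortilla chips", "soda"},
    {"diapers", "wipes"},
    {"soda"},
]

def support(items):
    """Fraction of baskets that contain every item in `items`."""
    return sum(items <= b for b in baskets) / len(baskets)

antecedent, consequent = {"salsa"}, {"tortilla chips"}

sup = support(antecedent | consequent)   # P(salsa and chips together)
conf = sup / support(antecedent)         # P(chips | salsa)
lift = conf / support(consequent)        # confidence relative to chips' base rate

print(f"support={sup:.2f} confidence={conf:.2f} lift={lift:.2f}")
# prints: support=0.40 confidence=1.00 lift=1.67
```

Here every salsa basket also contains tortilla chips, while chips appear in only 60% of all baskets, so the rule's lift is 1 / 0.6 ≈ 1.67.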
Because baskets are assembled from real affinity groups with the occasional impulse buy, the joint distribution has genuine structure. Apriori and FP-Growth surface lift you can defend — unlike random transaction generators where every rule has lift ≈ 1 and the exercise falls flat.
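The lift ≈ 1 behaviour of independent items is easy to check with a quick simulation. This sketch uses only the standard library; the item probabilities (0.30 and 0.40) and the basket count are arbitrary assumptions chosen for illustration.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable
n = 20_000      # number of simulated baskets

# Hypothetical generator: each item lands in a basket independently.
has_a = [random.random() < 0.30 for _ in range(n)]
has_b = [random.random() < 0.40 for _ in range(n)]

p_a = sum(has_a) / n
p_b = sum(has_b) / n
p_ab = sum(a and b for a, b in zip(has_a, has_b)) / n

# Lift compares the observed joint frequency with the independence baseline.
lift = p_ab / (p_a * p_b)
print(f"lift(A -> B) = {lift:.3f}")  # close to 1 for independent items
```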