Data Drift vs Concept Drift

Two failure modes you must monitor, each with a different fix.

Inputs change vs world changes

Two drifts, two responses

Data drift (covariate shift): the distribution of inputs P(x) changes, while the relationship P(y|x) is unchanged. Example: more users from a new country. Fix: retrain on fresh data.
Concept drift: the relationship P(y|x) itself changes, so the same x now maps to a different y. Example: post-COVID, the same browsing pattern no longer predicts the same purchase. Fix: rethink the features and often redesign the model.
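
The difference can be made concrete with a toy simulation. The sketch below is illustrative only: the labelling rule `true_label`, the `frozen_model`, and every threshold in it are hypothetical, chosen just to make the two cases visible. It scores one fixed model on three samples: data like the training set, data with shifted inputs, and data with a changed relationship.

```python
import random


def true_label(x, concept="old"):
    """Hypothetical ground truth for P(y|x), deterministic for simplicity."""
    if concept == "old":
        return int(0.5 < x < 1.2)
    # "new" concept: the same x can now map to a different y
    return int(0.7 < x < 1.2)


def frozen_model(x):
    """Hypothetical model fitted when inputs lay in [0, 1]; it only learned 'x > 0.5'."""
    return int(x > 0.5)


def accuracy(xs, concept):
    """Share of inputs where the frozen model agrees with the ground truth."""
    return sum(frozen_model(x) == true_label(x, concept) for x in xs) / len(xs)


rng = random.Random(0)
train_like = [rng.uniform(0.0, 1.0) for _ in range(10_000)]  # inputs as seen in training
shifted = [rng.uniform(0.5, 1.5) for _ in range(10_000)]     # inputs after data drift

baseline = accuracy(train_like, "old")        # same inputs, same relationship
data_drift = accuracy(shifted, "old")         # P(x) moved, P(y|x) unchanged
concept_drift = accuracy(train_like, "new")   # P(x) unchanged, P(y|x) changed

print(f"baseline={baseline:.3f} data_drift={data_drift:.3f} concept_drift={concept_drift:.3f}")
```

In this toy setup the model loses accuracy in both drift cases, but for different reasons: under data drift it meets inputs it never saw during fitting, while under concept drift the labels for familiar inputs have changed.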

Common detectors

Detector	What it measures	Good for
PSI (Population Stability Index)	Distributional change per feature	Tabular, monthly
KS test	Difference between two empirical distributions	Numeric features
Chi-square	Categorical distribution change	Categorical features
Model performance	Direct outcome metric vs labels	When labels arrive fast enough
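
To make the first three rows of the table concrete, here is a minimal standard-library sketch of each statistic. It is not a reference implementation: the function names, the equal-width binning used for PSI, and the `eps` floor for empty bins are assumptions made for illustration. In practice a statistics library such as SciPy (`scipy.stats.ks_2samp`, `scipy.stats.chi2_contingency`) also supplies p-values.

```python
import bisect
import math
import random
from collections import Counter


def psi(reference, current, n_bins=10, eps=1e-6):
    """Population Stability Index using equal-width bins over the reference range."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all reference values are equal

    def proportions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clamp out-of-range values into the edge bins
            counts[idx] += 1
        # floor at eps so empty bins do not cause log(0) or division by zero
        return [max(c / len(values), eps) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


def ks_statistic(reference, current):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    ref_sorted, cur_sorted = sorted(reference), sorted(current)
    gap = 0.0
    for x in ref_sorted + cur_sorted:
        f_ref = bisect.bisect_right(ref_sorted, x) / len(ref_sorted)
        f_cur = bisect.bisect_right(cur_sorted, x) / len(cur_sorted)
        gap = max(gap, abs(f_ref - f_cur))
    return gap


def chi_square_statistic(reference, current):
    """Chi-square statistic for a 2 x K table of category counts (reference vs current)."""
    ref_counts, cur_counts = Counter(reference), Counter(current)
    n_ref, n_cur = len(reference), len(current)
    total = n_ref + n_cur
    stat = 0.0
    for category in set(ref_counts) | set(cur_counts):
        col_total = ref_counts[category] + cur_counts[category]
        for observed, row_total in ((ref_counts[category], n_ref), (cur_counts[category], n_cur)):
            expected = row_total * col_total / total
            stat += (observed - expected) ** 2 / expected
    return stat


rng = random.Random(42)
reference = [rng.gauss(0.0, 1.0) for _ in range(5000)]
same_dist = [rng.gauss(0.0, 1.0) for _ in range(5000)]  # fresh sample, no drift
shifted = [rng.gauss(1.0, 1.0) for _ in range(5000)]    # mean moved by one standard deviation

print("PSI no drift / drift:", psi(reference, same_dist), psi(reference, shifted))
print("KS  no drift / drift:", ks_statistic(reference, same_dist), ks_statistic(reference, shifted))
print("Chi-square:", chi_square_statistic(["a"] * 50 + ["b"] * 50, ["a"] * 90 + ["b"] * 10))
```

A commonly quoted rule of thumb treats a PSI below 0.1 as stable and above 0.25 as a significant shift; such cutoffs are starting points to tune per feature, not fixed rules.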

Always pair distribution metrics with outcome metrics. Input drift without performance loss may need no action, while performance loss without input drift is the case to investigate first: the inputs look the same, so the relationship itself may have changed.
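
The pairing rule above can be sketched as a small triage function. This is a sketch under stated assumptions, not a prescribed policy: the function name `triage`, the use of PSI and accuracy as the two signals, and both default thresholds are hypothetical and would need tuning for a real system.

```python
def triage(psi_value, baseline_accuracy, current_accuracy,
           psi_threshold=0.25, max_accuracy_drop=0.05):
    """Combine one distribution metric and one outcome metric into a next step.

    Both thresholds are illustrative defaults, not recommended values.
    """
    input_drift = psi_value > psi_threshold
    performance_loss = (baseline_accuracy - current_accuracy) > max_accuracy_drop

    if input_drift and performance_loss:
        return "inputs drifted and performance fell: start by retraining on fresh data"
    if input_drift:
        return "inputs drifted but performance holds: keep watching"
    if performance_loss:
        return "performance fell without input drift: suspect concept drift"
    return "stable"


print(triage(psi_value=0.40, baseline_accuracy=0.90, current_accuracy=0.80))
print(triage(psi_value=0.40, baseline_accuracy=0.90, current_accuracy=0.89))
print(triage(psi_value=0.02, baseline_accuracy=0.90, current_accuracy=0.80))
print(triage(psi_value=0.02, baseline_accuracy=0.90, current_accuracy=0.90))
```

The third case is the one that distribution metrics alone would miss, which is why the outcome metric has to be part of the check whenever labels arrive fast enough.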

Analogy

Data drift is a new neighbourhood moving in: the customers look different, but the way their traits map to preferences is unchanged. Concept drift is the same neighbourhood changing its preferences: it looks identical but behaves differently. The survey questions are the same, but the answers have changed.
