Overview
Anonymisation Techniques
Masking, generalisation, k-anonymity, differential privacy — none are silver bullets.
Why it matters
Each technique trades a different axis of utility for privacy. Differential privacy (DP) gives a tunable, provable bound on how much any single individual's data can influence a released result, at a real utility cost.
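
To make the "tunable bound, real utility cost" trade-off concrete, here is a minimal sketch of the Laplace mechanism applied to a counting query. It assumes a sensitivity of 1 (adding or removing one person changes the count by at most 1). The names `laplace_noise`, `dp_count` and `mean_abs_error` are illustrative and not taken from any library. The noise scale is sensitivity / ε, so a smaller ε means stronger privacy and a noisier answer.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_count(true_count: int, epsilon: float, rng: random.Random,
             sensitivity: float = 1.0) -> float:
    """Return the count plus Laplace noise of scale sensitivity / epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(sensitivity / epsilon, rng)


def mean_abs_error(epsilon: float, trials: int = 20000, seed: int = 0) -> float:
    """Estimate the average absolute error of dp_count for a given epsilon."""
    rng = random.Random(seed)
    true_count = 1000  # made-up value for illustration
    total = sum(abs(dp_count(true_count, epsilon, rng) - true_count)
                for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    for eps in (0.1, 1.0, 10.0):
        print(f"epsilon={eps}: mean abs error ~ {mean_abs_error(eps):.2f}")
```

The expected absolute error of Laplace noise equals its scale, so the error shrinks roughly in proportion to 1/ε. Python's `random` module is used here only for illustration; it is not a cryptographically secure source, and production DP systems also need to handle floating-point subtleties that this sketch ignores.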
Going deeper
A rough decision table for the four techniques:
| Technique | Best for | Breaks when |
|---|---|---|
| Masking / tokenisation | Operational data still keyed by id | The token vault leaks |
| Generalisation | Lookup-style analytics; coarse dashboards | Joined with a richer external dataset |
| k-anonymity (+ l-diversity) | Microdata release | Sensitive attribute is homogeneous within an equivalence class (the case l-diversity is meant to catch) |
| Differential privacy | Public statistics, query interfaces | ε is set too large, or queries are unbounded so the privacy budget is exhausted |
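
The k-anonymity row can be illustrated with a small check. The following is a minimal sketch, assuming the simplest "distinct" variant of l-diversity (the number of distinct sensitive values per equivalence class). The toy records and the function names are made up for illustration. The release is 3-anonymous, yet one cell holds a single diagnosis, so anyone known to be in that cell has their diagnosis revealed.

```python
from collections import defaultdict


def equivalence_classes(rows, quasi_identifiers):
    """Group rows that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[q] for q in quasi_identifiers)
        groups[key].append(row)
    return groups


def k_anonymity(rows, quasi_identifiers):
    """Size of the smallest equivalence class."""
    groups = equivalence_classes(rows, quasi_identifiers)
    return min(len(g) for g in groups.values())


def distinct_l_diversity(rows, quasi_identifiers, sensitive):
    """Fewest distinct sensitive values found in any equivalence class."""
    groups = equivalence_classes(rows, quasi_identifiers)
    return min(len({r[sensitive] for r in g}) for g in groups.values())


# Toy, made-up microdata: two cells of three rows each.
ROWS = [
    {"age_band": "30-39", "zip3": "021", "diagnosis": "flu"},
    {"age_band": "30-39", "zip3": "021", "diagnosis": "flu"},
    {"age_band": "30-39", "zip3": "021", "diagnosis": "flu"},
    {"age_band": "40-49", "zip3": "021", "diagnosis": "asthma"},
    {"age_band": "40-49", "zip3": "021", "diagnosis": "flu"},
    {"age_band": "40-49", "zip3": "021", "diagnosis": "diabetes"},
]
QI = ["age_band", "zip3"]

k = k_anonymity(ROWS, QI)                          # 3
l = distinct_l_diversity(ROWS, QI, "diagnosis")    # 1: the 30-39 cell is homogeneous
```

A release gate could require both `k` and `l` to clear a threshold before microdata is published. Stricter variants such as entropy l-diversity or t-closeness are not covered by this sketch.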
In production you typically layer these: tokenise direct identifiers, generalise the quasi-identifiers, and (for any public release or wide-audience dataset) wrap aggregate metrics in a DP query interface with an enforced ε budget per analyst.
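
The layering described above can be sketched roughly as follows. This is an illustration under stated assumptions, not a production design: a keyed HMAC stands in for a token vault, the record fields (`email`, `age`, `postcode`) are invented, and `DPInterface` and `BudgetExceeded` are hypothetical names. Budget accounting uses basic sequential composition, where the ε values of an analyst's queries simply add up.

```python
import hashlib
import hmac
import random

# Hypothetical key; a real one would live in a secrets manager.
SECRET_KEY = b"demo-key-do-not-use"


def tokenise(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Replace a direct identifier with a keyed, deterministic token."""
    digest = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def generalise(record: dict) -> dict:
    """Tokenise the identifier and coarsen the quasi-identifiers."""
    decade = (record["age"] // 10) * 10
    return {
        "token": tokenise(record["email"]),
        "age_band": f"{decade}-{decade + 9}",
        "postcode_area": record["postcode"][:3],
    }


class BudgetExceeded(Exception):
    """Raised when a query would exceed the analyst's epsilon budget."""


class DPInterface:
    """Noisy counts with an enforced per-analyst epsilon budget."""

    def __init__(self, records, epsilon_per_analyst, seed=None):
        self._records = list(records)
        self._limit = epsilon_per_analyst
        self._spent = {}
        # Illustration only: `random` is not a secure noise source.
        self._rng = random.Random(seed)

    def remaining(self, analyst: str) -> float:
        return self._limit - self._spent.get(analyst, 0.0)

    def noisy_count(self, analyst: str, predicate, epsilon: float) -> float:
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        spent = self._spent.get(analyst, 0.0)
        if spent + epsilon > self._limit + 1e-12:
            raise BudgetExceeded(f"{analyst} has {self.remaining(analyst):.3f} left")
        # Charge the budget before answering (sequential composition).
        self._spent[analyst] = spent + epsilon
        true_count = sum(1 for r in self._records if predicate(r))
        rate = epsilon  # Laplace scale is 1/epsilon for a sensitivity-1 count
        noise = self._rng.expovariate(rate) - self._rng.expovariate(rate)
        return true_count + noise
```

An analyst would then call something like `api.noisy_count("alice", lambda r: r["age_band"] == "30-39", epsilon=0.5)` until the budget is spent, after which further queries raise `BudgetExceeded`. Refusing queries outright is one possible policy; whether and when a budget resets is a governance decision this sketch leaves open.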