Overview
Relational Modeling — 1NF, 2NF, 3NF
Why normalisation, and where to stop denormalising.
Why it matters
Normalisation removes update anomalies; denormalisation buys read speed. The trick is knowing which workload you're optimising for.
Going deeper
A working mental model for the three normal forms:
- 1NF — atomic values, no repeating groups. A
phone_numberscolumn holding'+34 600...; +44 207...'violates 1NF; split into rows. - 2NF — every non-key column depends on the whole primary key. Only matters for composite keys: if
(order_id, line_no)is the key butcustomer_nameonly depends onorder_id, move it toorders. - 3NF — no transitive dependencies. If
zip_codedecidescity, don't storecityon every customer row — put it in azip_codeslookup.
OLTP systems mostly aim at 3NF (correctness wins). Analytical / warehouse systems mostly denormalise (read speed wins) using star schemas — a fact table joined to a handful of dimension tables, with deliberate redundancy in the dimensions.