Relational Modeling — 1NF, 2NF, 3NF

Why normalisation, and where to stop denormalising.

0/2 done

Overview

Relational Modeling — 1NF, 2NF, 3NF

Why normalisation, and where to stop denormalising.

Why it matters

Normalisation removes update anomalies; denormalisation buys read speed. The trick is knowing which workload you're optimising for.

Going deeper

A working mental model for the three normal forms:

  • 1NF — atomic values, no repeating groups. A phone_numbers column holding '+34 600...; +44 207...' violates 1NF; split into rows.
  • 2NF — every non-key column depends on the whole primary key. Only matters for composite keys: if (order_id, line_no) is the key but customer_name only depends on order_id, move it to orders.
  • 3NF — no transitive dependencies. If zip_code decides city, don't store city on every customer row — put it in a zip_codes lookup.

OLTP systems mostly aim at 3NF (correctness wins). Analytical / warehouse systems mostly denormalise (read speed wins) using star schemas — a fact table joined to a handful of dimension tables, with deliberate redundancy in the dimensions.

Analogy

Normalisation is keeping each fact in exactly one drawer.

Imagine a filing cabinet where the customer's phone number is written on every order, every invoice, every shipping label. The day the customer changes phone, you've got 47 documents to update — and you will miss some. The 'one fact, one drawer' rule means the phone lives in one customer card; everything else points to that card. Updates become a one-liner, contradictions become impossible.

Denormalisation is the opposite trade: photocopy the phone number onto the order so the cashier never has to walk to the customer cabinet. Faster lookup, slower / riskier updates. Every analytical warehouse table is a deliberate denormalisation.

Make it stick

Use the prompts below to anchor relational modeling — 1nf, 2nf, 3nf to something you actually own.

  • Pick a table you own that mixes OLTP and analytics queries. Where would splitting OLTP (normalised) from a warehouse mart (denormalised) make life easier?
  • Find a column duplicated across three tables in your schema. What update anomaly is waiting to happen?
  • Where in your stack is denormalisation *deliberate* (good) vs *accidental* (a future bug)?

Reading in progress · 0 of 2 activities done