Six Dimensions of Data Quality

Accuracy, completeness, consistency, timeliness, uniqueness, validity.

The six dimensions

The six axes you actually score

  1. Accuracy — does the value reflect reality?
  2. Completeness — are required fields present?
  3. Consistency — does the same fact agree across systems?
  4. Timeliness — is the value fresh enough for its use?
  5. Uniqueness — is there exactly one row per real-world entity?
  6. Validity — does the value satisfy its declared format / range?
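
As a rough illustration of how some of these axes can be turned into numbers, the sketch below scores four of them as simple ratios over a handful of rows. The rows, field names, and thresholds are all hypothetical. Accuracy and consistency are left out because they generally need a second source to compare against (ground truth, or another system's copy of the same fact).

```python
from datetime import datetime, timedelta

# Hypothetical sample rows; field names and values are illustrative only.
ROWS = [
    {"patient_id": "p1", "allergy": "penicillin", "temp_c": 36.8,
     "taken_at": datetime(2024, 1, 10, 8, 0)},
    {"patient_id": "p2", "allergy": None, "temp_c": 410.0,
     "taken_at": datetime(2024, 1, 10, 7, 30)},
    {"patient_id": "p2", "allergy": "none known", "temp_c": 37.1,
     "taken_at": datetime(2024, 1, 3, 9, 0)},
    {"patient_id": "p3", "allergy": "latex", "temp_c": 38.2,
     "taken_at": datetime(2024, 1, 10, 6, 0)},
]


def completeness(rows, field):
    """Share of rows where the required field is present (not None or empty)."""
    return sum(1 for r in rows if r.get(field) not in (None, "")) / len(rows)


def uniqueness(rows, key):
    """Distinct key values divided by row count; 1.0 means no duplicates."""
    return len({r[key] for r in rows}) / len(rows)


def validity(rows, field, low, high):
    """Share of rows whose value falls inside the declared range."""
    return sum(
        1 for r in rows if r[field] is not None and low <= r[field] <= high
    ) / len(rows)


def timeliness(rows, field, now, max_age):
    """Share of rows no older than max_age at the reference time."""
    return sum(1 for r in rows if now - r[field] <= max_age) / len(rows)


NOW = datetime(2024, 1, 10, 12, 0)  # fixed reference time so the result is repeatable

scores = {
    "completeness": completeness(ROWS, "allergy"),
    "uniqueness": uniqueness(ROWS, "patient_id"),
    "validity": validity(ROWS, "temp_c", 35.0, 42.0),
    "timeliness": timeliness(ROWS, "taken_at", NOW, timedelta(hours=24)),
}
```

With these four rows each score comes out at 0.75: one blank allergy field, one duplicated patient_id, one implausible temperature, and one week-old reading.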

A practical data-quality programme scores datasets on each axis, sets a target per axis per dataset (perfection is uneconomic), and instruments alerts on drift.
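
One way such a programme could be wired up is sketched below: compare the latest scores with per-axis targets, and separately flag any axis whose score dropped noticeably since the previous run. The target values, the 0.02 tolerance, and the function names are assumptions made for illustration, not a recommended standard.

```python
# Hypothetical per-axis targets for one dataset; deliberately not all 1.0.
TARGETS = {"completeness": 0.95, "validity": 0.97, "uniqueness": 0.99}


def below_target(scores, targets):
    """Map each axis that misses its target to a (score, target) pair."""
    return {
        axis: (scores[axis], target)
        for axis, target in targets.items()
        if axis in scores and scores[axis] < target
    }


def drifted(previous, current, tolerance=0.02):
    """Axes whose score fell by more than `tolerance` since the previous run."""
    return sorted(
        axis
        for axis in current
        if axis in previous and previous[axis] - current[axis] > tolerance
    )


# Illustrative scores from two consecutive runs.
previous = {"completeness": 0.97, "validity": 0.99, "uniqueness": 1.0}
current = {"completeness": 0.96, "validity": 0.93, "uniqueness": 1.0}

misses = below_target(current, TARGETS)   # validity: 0.93 against a 0.97 target
alerts = drifted(previous, current)       # validity fell by 0.06
```

Keeping the two checks separate is a design choice: a dataset can sit above target and still be drifting downwards, and an early drift alert is usually cheaper to act on than a missed target.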

Vitals on a chart

Think of a hospital chart:

  • Accuracy = the recorded vitals match what was actually measured, on the right patient.
  • Completeness = no blank allergy field.
  • Consistency = the same blood-pressure reading on the chart and the monitor.
  • Timeliness = today's vitals, not last week's.
  • Uniqueness = one chart per patient, not three with name variants.
  • Validity = the temperature is plausible (35–42 °C), not 410.

A bad score on any one axis can kill a patient; in a business setting, it can sink a decision just as surely.

Score one of your datasets

Pick a dataset you own and score it on the six dimensions.

  • Which dimension is your lowest score — and is that the one your consumers are *actually* paying for?
  • Where is a 'perfect' DQ target costing more than it saves (e.g. 100% accuracy on a field nobody reads)?
  • What's the smallest contract you could publish next sprint that names a target per dimension for the top-3 most-used columns?
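
For the last question, a small contract could be as plain as a list of columns with a minimum score per dimension. The sketch below is one possible shape, assuming scores are ratios between 0 and 1. The class name, column names, and target values are hypothetical.

```python
from dataclasses import dataclass

DIMENSIONS = (
    "accuracy", "completeness", "consistency",
    "timeliness", "uniqueness", "validity",
)


@dataclass(frozen=True)
class ColumnContract:
    """Minimum acceptable score per dimension for a single column."""
    column: str
    targets: dict  # dimension name -> minimum score in [0, 1]

    def __post_init__(self):
        # Reject typos in dimension names and targets outside [0, 1].
        unknown = set(self.targets) - set(DIMENSIONS)
        if unknown:
            raise ValueError(f"unknown dimensions: {sorted(unknown)}")
        for dim, target in self.targets.items():
            if not 0.0 <= target <= 1.0:
                raise ValueError(f"target for {dim} must be in [0, 1]")

    def missing_dimensions(self):
        """Dimensions this column has no stated target for yet."""
        return [d for d in DIMENSIONS if d not in self.targets]


# Hypothetical contract for three heavily used columns.
CONTRACT = [
    ColumnContract("customer_id",
                   {"completeness": 1.0, "uniqueness": 1.0, "validity": 0.99}),
    ColumnContract("email", {"completeness": 0.95, "validity": 0.98}),
    ColumnContract("updated_at", {"completeness": 0.99, "timeliness": 0.90}),
]

gaps = {c.column: c.missing_dimensions() for c in CONTRACT}
```

Here `gaps` lists, per column, the dimensions that still lack a target, which gives a concrete to-do list before the contract is published.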
