Data Quality (KA 11) — The Six Dimensions

Accuracy, completeness, consistency, timeliness, uniqueness, validity — measure, target, monitor, escalate.

0/3 done

Overview

Data Quality (KA 11) — The Six Dimensions

Accuracy, completeness, consistency, timeliness, uniqueness, validity — measure, target, monitor, escalate.

Why it matters

DQ is the KA most often equated with ‘a dashboard’ and most rarely run as a programme. DMBOK organises it around measure → target → monitor → escalate, the same loop SRE uses for service reliability.

Going deeper

Six DQ dimensions, each with a typical SQL-able rule:

DimensionDefinitionSample rule
AccuracyDoes the value reflect reality?Cross-check against system of record
CompletenessAre required fields present?COUNT(*) WHERE col IS NULL = 0
ConsistencySame fact agrees across systems?SUM(a.value) = SUM(b.value)
TimelinessFresh enough for use?max(updated_at) > now() - interval '1 hour'
UniquenessOne row per real-world entity?COUNT(*) = COUNT(DISTINCT key)
ValidityDoes it satisfy declared format/range?regex / CHECK constraints

Pick a target per dimension per dataset (perfection is uneconomic), instrument, and route the alerts to the right human. Without that last step the dashboard is decorative.

Analogy

Data quality is the six vital signs on a hospital chart.

Pulse, blood pressure, respiration, temperature, oxygen, level of consciousness — any one out of band escalates differently. Nobody averages vital signs into a single score, because the response depends on which sign is failing.

DQ is the same. ‘Overall data quality is 87 %’ is meaningless; ‘completeness on customer_email is 60 %, uniqueness on order_id is 99.9 %’ tells you what to fix. Always score and alert per dimension.

Make it stick

Anchor data quality (ka 11) — the six dimensions to something you actually own.

  • Where in your platform does *data quality (ka 11) — the six dimensions* live today — and who owns it?
  • What is the smallest version of *data quality (ka 11) — the six dimensions* you could ship next sprint?
  • What's the most likely misuse of *data quality (ka 11) — the six dimensions*, and how would you spot it in a design review?

Reading in progress · 0 of 3 activities done