Subject Access & Erasure Pipelines

SAR / DSAR / 'right to be forgotten' — these are an engineering pipeline, not a legal process.

0/2 done

Overview

Subject Access & Erasure Pipelines

SAR / DSAR / 'right to be forgotten' — these are an engineering pipeline, not a legal process.

Why it matters

When the request arrives, you need to find every dataset that contains the subject, extract or delete in N days, prove it, and not break statistical aggregates downstream.

Going deeper

Building a scalable erasure (Right to Be Forgotten) pipeline:

  1. Identity Resolution: Resolving the requestor's email to their canonical identity (e.g. user_123).
  2. Orchestration: Firing a webhook to every internal system (CRM, Warehouse, Microservices).
  3. Nullification vs Deletion: In a warehouse, deleting rows can break historical aggregate reporting (e.g. Total Revenue dips). Instead of a SQL DELETE, best practice is to UPDATE table SET name=NULL, email=NULL WHERE user_id=123. The financial aggregate remains, but the PII is irretrievably erased.
  4. Receipt Generation: Emitting an audit log (stored as long as the law requires) proving the deletion executed.

Analogy

A DSAR Pipeline is like a restaurant recalling an ingredient.

Imagine a health inspector tells a restaurant: 'Batch 402 of your flour is contaminated, remove it all.' If the kitchen just threw flour randomly into pots and bins without labeling them, they have to throw out all the food in the building. But if they have strict tracking (lineage), they know exactly which cakes, which doughs, and which bread loaves contain Batch 402. A Data Subject Access Request (DSAR) or erasure request is the same. The user says 'Delete me.' If you have a data catalog with lineage, your automated pipeline queries the tracking system, finds exactly the 14 tables holding their ID, and deletes the rows. Without it, you are throwing away data guessing where they are.

Make it stick

Use the prompts below to anchor subject access & erasure pipelines to something you actually own.

  • If a customer emails you today asking to delete all their data, how many JIRA tickets, Slack messages, and manual DB scripts does it currently take?
  • Identify an aggregate metric (e.g., Daily Sales) that would fundamentally break if you ran a hard SQL DELETE for a prolific customer.
  • How does your organization handle DSARs in SaaS vendor systems (like Zendesk or Salesforce)? Are they connected to your internal deletion pipeline?

Reading in progress · 0 of 2 activities done