Flink and Exactly-Once — Stateful Stream Processing

Why stateful streaming is hard, how Flink's checkpoints make it tractable.

0/1 done

Theory

Exactly-once is a system property, not a flag

Apache Flink's claim to fame is end-to-end exactly-once processing of streams — even with stateful operators (windowed aggregations, joins, dedup) and even across failures.

The trick is a coordinated distributed snapshot (the Chandy-Lamport algorithm): periodically, a 'checkpoint barrier' flows through the DAG; every operator snapshots its state to durable storage when the barrier arrives. On failure, Flink restores the last completed checkpoint and replays the Kafka offsets from that snapshot — every input is processed exactly once into state.

For end-to-end exactly-once including the sink, the sink must participate in a 2-phase commit (e.g. Kafka transactional producer). Without that, you only have at-least-once delivery to the sink, no matter how strong Flink's internals are.

Analogy

Flink's checkpointing is a group photo taken with a moving marching band. You can't yell 'freeze!' to the whole band at once, so instead a marker is passed down each row of players; the instant a player receives it, they note their exact pose, then pass it on. Stitch all those individual poses together and you get one consistent snapshot of the entire band mid-march — without ever stopping the music. If someone trips later, everyone rewinds to that last good photo and replays from there, so every note is played exactly once. The catch: the sink (the recording) must also be able to rewind, or you only get at-least-once.

How Flink achieves exactly-once

Click a node to focus its neighbourhood · drag to pan · scroll to zoom

A checkpoint barrier crossing the DAG

The JobManager emits a barrier into every source. As each operator sees the barrier, it snapshots its state to durable storage (S3 / RocksDB-backed) and forwards the barrier. When all sinks acknowledge, the checkpoint is complete and the source's Kafka offsets advance.

Reflect

Exactly-once is genuinely possible — and genuinely expensive. It needs durable state storage, a transactional sink, and operational maturity for checkpoint-restore drills. For most use cases, at-least-once + idempotent sink is the sweet spot and an order of magnitude simpler.

  • Which of your streaming jobs *truly* needs exactly-once vs at-least-once + idempotent sink?
  • Have you ever restored from a Flink checkpoint in anger? If not, run a quarterly drill.

Reading in progress · 0 of 1 activity done