Idempotency + partition-ordered, key-sequential
Design for redelivery
Outside of full exactly-once semantics (EOS), a Kafka consumer that commits offsets after processing gets at-least-once delivery: after a crash or rebalance, records processed since the last committed offset are redelivered. Therefore every consumer side-effect must be idempotent: running it twice must equal running it once. Three standard techniques:
- Natural idempotency: UPSERT keyed by the entity id; setting a value is inherently repeatable (unlike balance += x).
- Dedup key / processed-ids table: record each handled message id (or topic-partition-offset) in the same transaction as the side-effect; skip if seen.
- Idempotency tokens propagated to downstream APIs so they dedupe.
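
The processed-ids technique can be sketched as follows. This is a minimal illustration, assuming a SQLite store (3.24 or newer for the upsert syntax); the table names, column names, and the `apply_deposit` helper are hypothetical and not part of any Kafka client. The point it shows is that the dedup row and the side-effect are written in one transaction, so a redelivered record is skipped rather than applied twice.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory store with a dedup table and a balances table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE processed ("
        " topic TEXT, partition_id INTEGER, record_offset INTEGER,"
        " PRIMARY KEY (topic, partition_id, record_offset))"
    )
    conn.execute(
        "CREATE TABLE balances (account TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    return conn


def apply_deposit(conn: sqlite3.Connection, topic: str, partition_id: int,
                  record_offset: int, account: str, amount: int) -> bool:
    """Apply `balance += amount` at most once per (topic, partition, offset).

    Returns True if the deposit was applied, False if the record was a duplicate.
    """
    with conn:  # one transaction: commit on success, roll back on exception
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed VALUES (?, ?, ?)",
            (topic, partition_id, record_offset),
        )
        if cur.rowcount == 0:
            return False  # already handled: skip the side-effect
        # The increment itself is not idempotent; the dedup row above guards it.
        conn.execute(
            "INSERT INTO balances (account, balance) VALUES (?, ?)"
            " ON CONFLICT(account) DO UPDATE SET balance = balance + excluded.balance",
            (account, amount),
        )
        return True
```

Calling `apply_deposit` twice with the same topic, partition, and offset changes the balance only once. If the side-effect fails, the transaction rolls back the dedup row as well, so the record can be retried on redelivery.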
Ordering at scale
Ordering holds only within a partition, so key records by the entity whose causal order matters so that they land in the same partition. But beware: scaling consumers, retry topics, and async handlers can all reorder processing. If you process records from one partition concurrently across threads, you have thrown ordering away. The pattern is key-level sequential, partition-level parallel: hash the key to a worker so all records for one key go to one thread in order, while different keys run in parallel.
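
One way to sketch the key-level sequential, partition-level parallel pattern is a fixed set of single-threaded workers, each with its own FIFO queue, with the record key hashed to pick the queue. The `KeyedWorkerPool` class and its methods below are hypothetical names for illustration; the sketch omits error handling and retries, and it assumes string keys.

```python
import queue
import threading
import zlib
from typing import Any, Callable


class KeyedWorkerPool:
    """Route each key to one worker thread so per-key order is preserved."""

    def __init__(self, num_workers: int, handler: Callable[[str, Any], None]):
        self._handler = handler
        self._queues = [queue.Queue() for _ in range(num_workers)]
        self._threads = [
            threading.Thread(target=self._run, args=(q,), daemon=True)
            for q in self._queues
        ]
        for t in self._threads:
            t.start()

    def submit(self, key: str, value: Any) -> None:
        # A deterministic hash sends the same key to the same FIFO queue.
        idx = zlib.crc32(key.encode("utf-8")) % len(self._queues)
        self._queues[idx].put((key, value))

    def _run(self, q: queue.Queue) -> None:
        while True:
            item = q.get()
            try:
                if item is None:  # shutdown sentinel
                    return
                # Note: an exception here would stop this worker (not handled in this sketch).
                self._handler(*item)
            finally:
                q.task_done()

    def join(self) -> None:
        """Block until every submitted record has been handled."""
        for q in self._queues:
            q.join()

    def close(self) -> None:
        """Stop all workers after their queues drain."""
        for q in self._queues:
            q.put(None)
        for t in self._threads:
            t.join()
```

The poll loop would call `submit(record_key, record_value)` for each record in partition order. Records for one key are then handled in that order by a single thread, while different keys may run on different threads. A real consumer would also need to commit only offsets whose records have finished processing, which this sketch leaves out.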