Log Segments, Retention & Compaction

How Kafka decides what to keep, what to delete, and what to merge.


Two policies, one storage engine

Segments, retention, compaction

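A partition on disk is a sequence of log segments, each a .log file with its .index and .timeindex companions. Only the newest (active) segment is being written to. Two cleanup policies, selected per topic with cleanup.policy, decide the fate of the older ones: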

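  • delete (default): remove the oldest segments once they are older than retention.ms or once the partition has grown past retention.bytes.
  • compact: keep at least the latest value for each key. A record with a null value is a tombstone: it marks the key as deleted and is itself removed once delete.retention.ms has elapsed. This turns the topic into a changelog of state.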
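
To make the compact policy concrete, here is a minimal in-memory sketch of what compaction leaves behind. It is a toy model, not Kafka's log cleaner: the Record type and the compact function are hypothetical names introduced for illustration, records are assumed to arrive in offset order, and the drop_tombstones flag stands in for the moment delete.retention.ms has elapsed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Record:
    offset: int
    key: str
    value: Optional[str]  # None models a tombstone (null value)


def compact(records: List[Record], drop_tombstones: bool = False) -> List[Record]:
    """Keep only the highest-offset record per key, preserving offset order.

    Assumes `records` is already sorted by offset, as it would be in a log.
    """
    latest = {}
    for rec in records:
        latest[rec.key] = rec  # a later offset overwrites the earlier one
    survivors = sorted(latest.values(), key=lambda r: r.offset)
    if drop_tombstones:
        # Models the state after the tombstone retention window has passed.
        survivors = [r for r in survivors if r.value is not None]
    return survivors


log = [
    Record(0, "alice", "v1"),
    Record(1, "bob", "v1"),
    Record(2, "alice", "v2"),
    Record(3, "bob", None),  # tombstone: bob's profile was deleted
]

compacted = compact(log)
after_gc = compact(log, drop_tombstones=True)

print([(r.offset, r.key, r.value) for r in compacted])
print([(r.offset, r.key, r.value) for r in after_gc])
```

Surviving records keep their original offsets; compaction never renumbers the log, which is why a consumer reading a compacted topic sees gaps between offsets.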

Compacted topics are how Kafka itself stores consumer offsets (__consumer_offsets) and how Kafka Streams stores its state-store changelogs. Use them whenever the topic represents the current state per key rather than a history of events.

Configure a compacted topic for user-profile state

Write the kafka-topics command and topic config that create a compacted topic for user profiles with 12 partitions and a replication factor of 3, keeping tombstones for 7 days before they are removed.
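
One possible answer, sketched in Python so the 7-day arithmetic is explicit. The script assembles the kafka-topics.sh invocation as an argument list and prints it without running it. The topic name user-profiles, the broker address localhost:9092, and the helper create_topic_command are assumptions made for illustration; the script name may also differ by distribution (some packages ship it as kafka-topics).

```python
import shlex

# 7 days expressed in milliseconds, the unit delete.retention.ms expects.
SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000


def create_topic_command(topic: str, bootstrap_server: str) -> list:
    """Build the argument list for creating a compacted topic (hypothetical helper)."""
    return [
        "kafka-topics.sh",
        "--bootstrap-server", bootstrap_server,
        "--create",
        "--topic", topic,
        "--partitions", "12",
        "--replication-factor", "3",
        "--config", "cleanup.policy=compact",
        "--config", f"delete.retention.ms={SEVEN_DAYS_MS}",
    ]


# Topic name and broker address are placeholders, not values from the exercise.
command = create_topic_command("user-profiles", "localhost:9092")
print(shlex.join(command))
```

Keying each record by user ID is what makes this work: compaction keeps the latest profile per key, and a null value for a key deletes that user's profile once the 7-day tombstone window has passed.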
