Capacity Planning & Partition Sizing

Pick a partition count you won't regret in 18 months.

Throughput, parallelism, and the cost of too many

The numbers behind 'how many partitions?'

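Partition count is one of the hardest-to-change decisions in Kafka: you can add partitions but never remove them, and adding them changes the key→partition mapping for new records. Existing records stay where they were written, so a key's history ends up split across two partitions, breaking per-key ordering for every consumer that depends on it. So size it deliberately.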
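To see why adding partitions disturbs keyed data, here is a minimal sketch that counts how many keys would land on a different partition after a resize. It assumes a hash-modulo partitioner; `zlib.crc32` is only a stand-in for the hash Kafka's default partitioner uses (murmur2), and the function names and the `user-N` keys are made up for illustration.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition with hash-modulo (crc32 is an illustrative stand-in hash)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def count_moved_keys(keys, old_count: int, new_count: int) -> int:
    """Count keys whose partition differs between the old and new partition counts."""
    return sum(
        1 for key in keys
        if partition_for(key, old_count) != partition_for(key, new_count)
    )


# Hypothetical key set; real keys would come from your own topic.
keys = [f"user-{i}" for i in range(1000)]
moved = count_moved_keys(keys, old_count=12, new_count=16)
print(f"{moved} of {len(keys)} keys map to a different partition after 12 -> 16")
```

Any key counted here would have its older records in one partition and its newer records in another, which is exactly the ordering break described above.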

Throughput floor. Measure (or estimate) per-partition throughput t for your workload (often cited as 10–50 MB/s on the produce side, and less on the consume side when processing is heavy). For a target throughput T, you need at least ceil(T / t) partitions. Compute this separately for the produce side and the consume side and take the larger of the two: consumers can't parallelise beyond the partition count, so a slow handler often dominates.
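As a worked example of the throughput floor, the sketch below applies the T / t rule to both sides and takes the larger result. The function name and the sample rates (200 MB/s target, 25 MB/s produce and 10 MB/s consume per partition) are illustrative assumptions, not measurements.

```python
import math


def throughput_floor(target_mb_s: float, per_partition_mb_s: float) -> int:
    """Smallest partition count that sustains the target: ceil(T / t)."""
    return math.ceil(target_mb_s / per_partition_mb_s)


# Assumed numbers for illustration only.
produce_bound = throughput_floor(200, 25)  # produce side: 200 / 25 -> 8
consume_bound = throughput_floor(200, 10)  # consume side: 200 / 10 -> 20

# The slower side sets the floor.
floor = max(produce_bound, consume_bound)
print(produce_bound, consume_bound, floor)  # prints "8 20 20"
```

Here the consume side dominates: the slower handler needs 20 partitions even though producers would be fine with 8.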

Consumer parallelism ceiling. Within a consumer group, each partition is assigned to at most one consumer, so consumers beyond the partition count sit idle. If you might need 24 parallel consumers at peak, you need >= 24 partitions now. Plan for the 18-month peak, not today's.
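The ceiling can be stated in two lines of arithmetic. This is a small sketch with a hypothetical helper name; it models only the counting rule, not Kafka's actual assignment protocol.

```python
from typing import Tuple


def active_and_idle(consumers: int, partitions: int) -> Tuple[int, int]:
    """Return (consumers that receive partitions, consumers left idle) for one group."""
    active = min(consumers, partitions)
    return active, consumers - active


print(active_and_idle(consumers=24, partitions=12))  # prints "(12, 12)"
print(active_and_idle(consumers=24, partitions=24))  # prints "(24, 0)"
```

With 12 partitions, half of a 24-consumer group does no work, which is why the partition count has to cover peak parallelism up front.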

Costs of over-partitioning. Each partition costs open file handles, memory, and a replication stream, and adds to leader-election and recovery times. Per-broker partition limits are a real ceiling: commonly cited guidance for ZooKeeper-based clusters is a few thousand partitions per broker, and the limit is much higher under KRaft, but not infinite. Rule of thumb: right-size, leave 2–3x headroom, and prefer a few well-sized topics over thousands of tiny ones.

A partition-sizing calculator (Python)

Encode the sizing rule: given target produce/consume MB/s, measured per-partition rates, and required peak consumer parallelism, compute the partition count with headroom. Treat it as the back-of-envelope you run before every new topic.
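One possible shape for that calculator is sketched below. The names (`SizingInput`, `size_partitions`) and the sample inputs are hypothetical. It also makes one assumption the rule of thumb leaves open: headroom is applied to the throughput bounds only, since the peak consumer count is already meant to be the 18-month figure.

```python
import math
from dataclasses import dataclass


@dataclass
class SizingInput:
    produce_target_mb_s: float         # target produce throughput T
    consume_target_mb_s: float         # target consume throughput T
    produce_per_partition_mb_s: float  # measured per-partition produce rate t
    consume_per_partition_mb_s: float  # measured per-partition consume rate t
    peak_consumers: int                # required peak consumer parallelism
    headroom: float = 2.0              # 2-3x per the rule of thumb


def size_partitions(inp: SizingInput) -> dict:
    """Return the partition count and which constraint produced it."""
    if inp.produce_per_partition_mb_s <= 0 or inp.consume_per_partition_mb_s <= 0:
        raise ValueError("per-partition rates must be positive")
    if inp.headroom < 1:
        raise ValueError("headroom must be at least 1")

    # Throughput floors: ceil(T / t) on each side.
    produce_bound = math.ceil(inp.produce_target_mb_s / inp.produce_per_partition_mb_s)
    consume_bound = math.ceil(inp.consume_target_mb_s / inp.consume_per_partition_mb_s)

    # Assumption: headroom scales the throughput bounds, not the peak consumer count.
    candidates = {
        "produce": math.ceil(produce_bound * inp.headroom),
        "consume": math.ceil(consume_bound * inp.headroom),
        "parallelism": inp.peak_consumers,
    }
    binding = max(candidates, key=candidates.get)
    return {"partitions": candidates[binding], "binding": binding, "candidates": candidates}


# Illustrative inputs, not measurements.
result = size_partitions(SizingInput(200, 200, 25, 10, peak_consumers=24))
print(result["partitions"], result["binding"])  # prints "40 consume"
```

Reporting the binding constraint alongside the count makes it easier to tell later whether a topic was sized by produce throughput, consume throughput, or parallelism.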

Reflect

Pick a topic you provisioned over a year ago.

  • Is its partition count the produce-bound, consume-bound, or parallelism-bound number?
  • If traffic 5x'd next quarter, could you scale consumers without re-partitioning?
  • Where are you paying for tens of thousands of near-empty partitions you could consolidate?
