tpalino edited this page Apr 28, 2016 · 4 revisions

Module: Balance

Kafka itself does not have a means of moving partitions around to balance the load on each broker in the cluster. The underlying tool to move partitions, kafka-reassign-partitions.sh, is provided, but needs to be given explicit direction as to what it should do. The kafka-assigner tool does this by providing several types of balance algorithms that can be used, either alone or chained together, to move partitions around to even out the load on each broker. This is most commonly used when expanding the number of brokers in a cluster, or when bringing back a broker that was down for a hardware issue.

The types of balance that can be performed are:

  • count - Evens out the number of partitions on each broker. The smallest partitions in the cluster are moved to accomplish this.
  • size - Evens out the total size of partitions on disk for each broker. The largest partitions in the cluster are moved to accomplish this.
  • even - A specialized type for clusters whose topics each have a number of partitions that is a multiple of the number of brokers. The module ensures that every broker has exactly the same number of partitions for each topic.
  • leader - Reorders the replica list to provide ideal leader balance on the brokers. Exactly the same as running the reorder module alone.
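To make the count type more concrete, here is a minimal sketch of the idea, not the module's actual implementation. It assumes a single replica per partition, and the function name `balance_by_count` and the sample partitions and sizes are hypothetical. It repeatedly takes the smallest partition from the broker with the most partitions and moves it to the broker with the fewest.

```python
def balance_by_count(brokers, assignment, sizes):
    """Even out partition counts, moving the smallest partitions first.

    brokers:    list of broker IDs (may include brokers with no partitions)
    assignment: dict of partition name -> broker ID (one replica, for simplicity)
    sizes:      dict of partition name -> size on disk in bytes
    Returns the new placement and the list of (partition, source, target) moves.
    """
    placement = dict(assignment)
    moves = []
    while True:
        # Count how many partitions each broker currently holds
        counts = {broker: 0 for broker in brokers}
        for broker in placement.values():
            counts[broker] += 1

        fullest = max(brokers, key=lambda b: counts[b])
        emptiest = min(brokers, key=lambda b: counts[b])
        if counts[fullest] - counts[emptiest] <= 1:
            break  # counts are as even as they can get

        # Move the smallest partition off the fullest broker
        candidate = min(
            (p for p, b in placement.items() if b == fullest),
            key=lambda p: sizes[p],
        )
        placement[candidate] = emptiest
        moves.append((candidate, fullest, emptiest))
    return placement, moves


# Hypothetical cluster: broker 3 was just added and holds nothing yet
brokers = [1, 2, 3]
assignment = {"a-0": 1, "a-1": 1, "a-2": 1, "a-3": 2, "a-4": 2, "a-5": 2}
sizes = {"a-0": 50, "a-1": 10, "a-2": 30, "a-3": 20, "a-4": 40, "a-5": 60}

placement, moves = balance_by_count(brokers, assignment, sizes)
```

In this example the two smallest partitions on the existing brokers end up on the new broker, leaving two partitions on each. A size balance could be sketched the same way by comparing total bytes per broker and picking the largest partition instead.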

The types can be chained together so that they are performed using a single set of partition moves, which means you don't have to run the command multiple times. For example, the most common balance we perform is to balance a cluster by count, and then balance it by size. In this way, we get approximately the same number of partitions on each broker and approximately the same disk usage (and therefore approximately the same produce throughput) on each broker.
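One way to picture chaining is sketched below. This is an illustration under assumptions, not the tool's own code: each balance type is modeled as a function that takes a placement and returns a new one, and the names `chain_balances`, `toy_count_step`, and `toy_size_step` are hypothetical. The steps run in order against an in-memory placement, and only the difference between the starting and final placement becomes the set of moves.

```python
def chain_balances(initial, steps):
    """Apply each balance step in order, then return one set of moves.

    initial: dict of partition name -> broker ID
    steps:   list of functions, each taking and returning a placement dict
    Returns a dict of partition name -> (original broker, final broker).
    """
    placement = dict(initial)
    for step in steps:
        placement = step(placement)
    # Only partitions whose final broker differs from the original need to move
    return {
        partition: (initial[partition], placement[partition])
        for partition in initial
        if placement[partition] != initial[partition]
    }


def toy_count_step(placement):
    # Stand-in for a count balance: moves a-0 to broker 3
    updated = dict(placement)
    updated["a-0"] = 3
    return updated


def toy_size_step(placement):
    # Stand-in for a size balance: moves a-0 again, and b-0 for the first time
    updated = dict(placement)
    updated["a-0"] = 2
    updated["b-0"] = 3
    return updated


initial = {"a-0": 1, "b-0": 1, "c-0": 2}
moves = chain_balances(initial, [toy_count_step, toy_size_step])
```

Here `a-0` is touched by both steps but appears once in the result, moving directly from broker 1 to broker 2, so no data is copied to an intermediate broker.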

Options

The following are the options for this module:

| Option | Required | Argument | Default | Description |
|---|---|---|---|---|
| --types | yes | list of strings | | The balance types to perform, in order (space-separated) |
| --size | no | none | | Show the sizes for all partitions |
| --datadir | no | path | /tmp/kafka-logs | The path to the Kafka data directory (log segments) on the brokers |
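The sketch below shows how options of this shape behave when parsed with Python's standard argparse module. It is an assumption for illustration only and is not taken from the tool's source; the parser name `balance-sketch` is hypothetical. It shows why the balance types are given as a space-separated list after a single `--types` flag.

```python
import argparse

# Hypothetical parser mirroring the options described above
parser = argparse.ArgumentParser(prog="balance-sketch")
parser.add_argument(
    "--types",
    nargs="+",  # one or more space-separated values, kept in order
    required=True,
    choices=["count", "size", "even", "leader"],
    help="The balance types to perform, in order",
)
parser.add_argument(
    "--size",
    action="store_true",  # a flag that takes no argument
    help="Show the sizes for all partitions",
)
parser.add_argument(
    "--datadir",
    default="/tmp/kafka-logs",
    help="The path to the Kafka data directory on the brokers",
)

# Equivalent to passing: --types count size
args = parser.parse_args(["--types", "count", "size"])
```

With this input, `args.types` holds the types in the order given, `args.size` is off, and `args.datadir` falls back to its default.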

Example

All examples assume the cluster Zookeeper connect string is zook.example.com:2181/kafka/clustername

Example 1: Balance all partitions in the cluster by count and by size in a single step

kafka-assigner.py -z zook.example.com:2181/kafka/clustername -e balance --types count size

Example 2: Balance partitions in the cluster only by size

kafka-assigner.py -z zook.example.com:2181/kafka/clustername -e balance --types size

Example 3: Balance partitions in the cluster by size and perform a leader election with replica reordering

kafka-assigner.py -z zook.example.com:2181/kafka/clustername -e balance --types size leader

Notes

Balance commands can take a very long time to run, depending on the number of partitions and the amount of data to move. We suggest running them under screen (or similar) so that a disconnected terminal does not interrupt the run.

Note that the other balance types try to balance leadership as well, so adding the leader type is normally not needed.
