module balance
Kafka itself does not have a means of moving partitions around to balance the load on each broker in the cluster. The underlying tool to move partitions, kafka-reassign-partitions.sh, is provided, but needs to be given explicit direction as to what it should do. The kafka-assigner tool does this by providing several types of balance algorithms that can be used, either alone or chained together, to move partitions around to even out the load on each broker. This is most commonly used when expanding the number of brokers in a cluster, or when bringing back a broker that was down for a hardware issue.
The types of balance that can be performed are:
- count - Evens out the number of partitions on each broker. The smallest partitions in the cluster are moved to accomplish this.
- size - Evens out the total size of partitions on disk for each broker. The largest partitions in the cluster are moved to accomplish this.
- even - This is a specialized type for clusters whose topics each have a number of partitions that is a multiple of the number of brokers. The module ensures that every broker has exactly the same number of partitions for each topic.
- leader - Reorders the replica list to provide ideal leader balance on the brokers. This is exactly the same as running the reorder module alone.
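To make the count type more concrete, here is a minimal Python sketch of the idea described above: repeatedly take the smallest partition on the broker with the most partitions and move it to the broker with the fewest. It is not the tool's actual implementation. The function name `balance_by_count`, the one-replica-per-partition data model, and the sample partition names and sizes are all assumptions made for illustration.

```python
def balance_by_count(assignment, sizes, brokers):
    """Even out partition counts, moving the smallest partitions first.

    Hypothetical data model (not kafka-assigner's own):
      assignment: dict of partition name -> broker id (one replica each)
      sizes:      dict of partition name -> size on disk in bytes
      brokers:    all broker ids, including empty ones; every broker id
                  used in `assignment` must appear here
    Returns (new_assignment, moves), where each move is
    (partition, from_broker, to_broker).
    """
    new_assignment = dict(assignment)
    moves = []
    while True:
        counts = {b: 0 for b in brokers}
        for broker in new_assignment.values():
            counts[broker] += 1
        # sorted() keeps tie-breaking deterministic (lowest broker id wins)
        busiest = max(sorted(counts), key=counts.get)
        idlest = min(sorted(counts), key=counts.get)
        # Stop once no broker holds 2+ more partitions than another
        if counts[busiest] - counts[idlest] <= 1:
            break
        candidates = [p for p, b in new_assignment.items() if b == busiest]
        smallest = min(sorted(candidates), key=sizes.get)
        new_assignment[smallest] = idlest
        moves.append((smallest, busiest, idlest))
    return new_assignment, moves


# Example: broker 2 was just added and holds nothing yet
assignment = {"t-0": 1, "t-1": 1, "t-2": 1, "t-3": 1}
sizes = {"t-0": 40, "t-1": 10, "t-2": 30, "t-3": 20}
balanced, moves = balance_by_count(assignment, sizes, brokers=[1, 2])
print(moves)  # [('t-1', 1, 2), ('t-3', 1, 2)]
```

Moving the smallest partitions keeps the amount of data copied low. A size balance would work from the other end, moving the largest partitions.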
The types can be chained together so that they are performed using a single set of partition moves, which means you don't have to run the command multiple times. For example, the most common balance we perform is to balance a cluster by count, and then balance it by size. In this way, we get approximately the same number of partitions on each broker, and approximately the same disk usage (and therefore approximately the same produce throughput).
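One way to picture chaining is sketched below in Python. Each balance type is a function that takes an assignment and returns a new one, and only the difference between the starting and final assignments is turned into moves. This illustrates the concept and is not the tool's code. `chain_balances` and the two toy steps are hypothetical names, and the toy steps hard-code their results instead of doing any real balancing.

```python
def chain_balances(assignment, steps):
    """Apply balance steps in order, then compute one net set of moves.

    assignment: dict of partition name -> broker id (hypothetical model)
    steps:      callables that take an assignment and return a new one
    Returns (final_assignment, moves) where moves maps
    partition -> (original_broker, final_broker).
    """
    current = dict(assignment)
    for step in steps:
        current = step(current)
    # Only the net change matters: intermediate placements are never executed
    moves = {
        p: (assignment[p], b) for p, b in current.items() if assignment[p] != b
    }
    return current, moves


def toy_count_step(a):
    """Stand-in for a count balance: pretends t-0 should go to broker 2."""
    out = dict(a)
    out["t-0"] = 2
    return out


def toy_size_step(a):
    """Stand-in for a size balance: moves t-0 again, plus t-1."""
    out = dict(a)
    out["t-0"] = 3
    out["t-1"] = 2
    return out


start = {"t-0": 1, "t-1": 1, "t-2": 3}
final, net_moves = chain_balances(start, [toy_count_step, toy_size_step])
print(net_moves)  # {'t-0': (1, 3), 't-1': (1, 2)}
```

In this toy run, t-0 is placed on broker 2 by the first step and on broker 3 by the second, but it is only copied once, directly from broker 1 to broker 3.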
The following are the options for this module:
Option | Required | Argument | Default | Description |
---|---|---|---|---|
--types | yes | list of strings | | The balance types to perform, in order (space-separated) |
--size | no | none | | Show the sizes for all partitions |
--datadir | no | path | /tmp/kafka-logs | The path to the Kafka data directory (log segments) on the brokers |
All examples assume the cluster Zookeeper connect string is zook.example.com:2181/kafka/clustername
Example 1: Balance all partitions in the cluster by count and by size in a single step
kafka-assigner.py -z zook.example.com:2181/kafka/clustername -e balance --types count size
Example 2: Balance partitions in the cluster only by size
kafka-assigner.py -z zook.example.com:2181/kafka/clustername -e balance --types size
Example 3: Balance partitions in the cluster by size and perform a leader election with replica reordering
kafka-assigner.py -z zook.example.com:2181/kafka/clustername -e balance --types size leader
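If you script these invocations, a small helper can assemble the argument list and reject unknown balance types before anything runs. The Python sketch below uses only the flags shown in the examples above (`-z`, `-e`, `balance`, `--types`) and the four type names listed earlier. The helper name `build_balance_command` is hypothetical and not part of kafka-assigner.

```python
import shlex

# Connect string assumed by the examples in this document
ZOOKEEPER = "zook.example.com:2181/kafka/clustername"
# Balance types listed in this document
KNOWN_TYPES = ("count", "size", "even", "leader")


def build_balance_command(types, zookeeper=ZOOKEEPER):
    """Return the argv list for a balance run with the given types, in order."""
    if not types:
        raise ValueError("--types is required: give at least one balance type")
    unknown = [t for t in types if t not in KNOWN_TYPES]
    if unknown:
        raise ValueError("unknown balance type(s): " + ", ".join(unknown))
    return ["kafka-assigner.py", "-z", zookeeper, "-e",
            "balance", "--types", *types]


# Reproduces the command line from Example 1
print(shlex.join(build_balance_command(["count", "size"])))
```

The returned list can be passed to `subprocess.run` as is. `shlex.join` (Python 3.8 and later) is used here only to print the command for review.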
Balance commands can take a very long time to run, depending on the number of partitions and the amount of data to move around. We suggest running them under screen (or similar) so that a disconnected terminal does not interrupt the run.
Note that the balance modules try to balance leadership as well, so adding the leader type is normally not needed.