Learning Kafka - Configuring Kafka Producer for High Throughput
This post covers the common Kafka Producer configuration parameters that improve throughput through batching and compression.
Batching
The Kafka Producer can batch messages before sending them to the Broker. This reduces the network overhead incurred per message and improves throughput. Messages are batched per Topic-Partition before being sent out to the respective Broker.
There are two key configuration parameters that control batching in Kafka:
- linger.ms: Defines the number of milliseconds the Kafka Producer waits before sending out a batch (for a Topic-Partition). When set to 0, messages are sent right away, with no deliberate wait for a batch to build up.
- batch.size: Defines the maximum size of a batch in bytes (per Topic-Partition). When the Kafka Producer hits this threshold, it sends out the batch regardless of whether linger.ms has elapsed. Likewise, if a single message is larger than batch.size, it is sent out immediately.
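The interplay between the two settings can be sketched with a small simulation. This is a simplified model rather than the actual client implementation: the function name planBatches and the { time, size } message shape are made up for illustration, and only a single Topic-Partition is modelled.

```javascript
// Simplified model of producer batching for ONE Topic-Partition.
// planBatches and the { time, size } message shape are hypothetical;
// real clients are considerably more involved.
function planBatches(messages, { lingerMs, batchSize }) {
  const batches = [];
  let current = null;

  const flush = (sentAt) => {
    batches.push({ sentAt, count: current.count, bytes: current.bytes });
    current = null;
  };

  for (const msg of messages) {
    // linger.ms expired before this message arrived: send the open batch.
    if (current && msg.time >= current.openedAt + lingerMs) {
      flush(current.openedAt + lingerMs);
    }
    // Adding this message would exceed batch.size: send the open batch first.
    if (current && current.bytes + msg.size > batchSize) {
      flush(msg.time);
    }
    if (!current) {
      current = { openedAt: msg.time, count: 0, bytes: 0 };
    }
    current.count += 1;
    current.bytes += msg.size;
    // Batch is full (or a single message is over batch.size): send without waiting.
    if (current.bytes >= batchSize) {
      flush(msg.time);
    }
  }
  // Whatever is left goes out when its linger timer fires.
  if (current) flush(current.openedAt + lingerMs);
  return batches;
}

// Arrival times in milliseconds, sizes in bytes (made-up values).
const messages = [
  { time: 0, size: 400 },
  { time: 50, size: 400 },
  { time: 100, size: 400 },
  { time: 500, size: 400 },
];
console.log(planBatches(messages, { lingerMs: 200, batchSize: 1000 }));
```

In this model, the first two messages leave together as soon as the third would overflow the batch, while the remaining messages go out only when their linger timer expires. Setting lingerMs to 0 in the same model yields one batch per message.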
Configuring Kafka Producer for Batching
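// Assumes the node-rdkafka client, which wraps librdkafka
const Kafka = require('node-rdkafka');

const producer = new Kafka.Producer({
  // ...Producer Configuration
  // Wait up to 200 ms to accumulate a batch of messages before sending it
  'linger.ms': 200,
});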
librdkafka has configuration parameters that determine the maximum number of messages in the Producer queue (queue.buffering.max.messages) and the maximum total size of the messages in the Producer queue (queue.buffering.max.kbytes) before it throws a QUEUE FULL error.
I was unable to clarify from the documentation whether these parameters provide the same functionality as batch.size, which caps the size at the Topic-Partition level and, when the threshold is hit, sends out the messages instead of throwing a QUEUE FULL error.
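Whatever the exact relationship to batch.size, an application needs to be prepared for the QUEUE FULL error. The following is a minimal sketch of one common approach, which is to poll and retry. It assumes a node-rdkafka style producer whose produce() throws an error carrying librdkafka's ERR__QUEUE_FULL code (-184) when the local queue is full. The helper name produceWithRetry and its maxRetries and backoffMs options are hypothetical.

```javascript
// librdkafka's error code for a full local producer queue
// (node-rdkafka exposes it as Kafka.CODES.ERRORS.ERR__QUEUE_FULL).
const ERR_QUEUE_FULL = -184;

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper. `producer` is assumed to behave like a connected
// node-rdkafka Producer: produce() throws when the queue is full, and
// poll() gives the client a chance to serve delivery reports.
async function produceWithRetry(producer, topic, value, { maxRetries = 5, backoffMs = 100 } = {}) {
  for (let attempt = 0; ; attempt += 1) {
    try {
      // partition = null lets the client choose the partition
      producer.produce(topic, null, Buffer.from(value));
      return attempt; // number of retries that were needed
    } catch (err) {
      // Anything other than QUEUE FULL, or too many attempts: give up.
      if (err.code !== ERR_QUEUE_FULL || attempt >= maxRetries) throw err;
      producer.poll();
      await sleep(backoffMs);
    }
  }
}
```

Backing off like this trades a little latency for not dropping messages. Whether raising queue.buffering.max.messages is the better fix depends on how bursty the workload is.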
Compression
The Kafka Producer can be configured to compress messages before sending them out. Enabling compression on the Producer does not require any changes on the Broker.
Compressed batches are smaller, so less data travels over the network and is written to disk. This improves throughput on the Kafka Producer and disk utilization on the Kafka Broker.
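Compression is applied to batches rather than to individual messages, so it works best together with batching. The sketch below illustrates the effect using gzip from the Node.js standard library as a stand-in (snappy is not part of the standard library). The event payloads are made up, and the exact byte counts will differ for real data.

```javascript
const zlib = require('zlib');

// 200 hypothetical JSON events that share the same structure.
const events = Array.from({ length: 200 }, (_, i) =>
  JSON.stringify({ eventType: 'page_view', userId: 1000 + i, path: '/products/' + (i % 10) })
);

// Total size without any compression.
const rawBytes = events.reduce((sum, e) => sum + Buffer.byteLength(e), 0);

// Each message compressed on its own: little redundancy to exploit.
const perMessageBytes = events.reduce((sum, e) => sum + zlib.gzipSync(e).length, 0);

// The whole batch compressed as one unit: repeated keys and values compress well.
const batchBytes = zlib.gzipSync(events.join('\n')).length;

console.log({ rawBytes, perMessageBytes, batchBytes });
```

The batch-level figure is the one that matters for the Producer, which is why larger batches usually make compression more effective.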
Configuring Kafka Producer for Compression
const producer = new Kafka.Producer({
  // ...Producer Configuration
  // Specify the compression type to be used for messages
  'compression.codec': 'snappy',
});
It is recommended to benchmark different compression types and batching strategies to find the combination that gives the best throughput.
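One way to organise such a benchmark is to generate a producer configuration for every combination of settings and measure each in turn. The helper below is only a sketch of that first step: buildConfigMatrix is a hypothetical name, the broker address is an assumption, and the measurement itself is left out because it depends on the workload.

```javascript
// Hypothetical helper: builds one producer config per combination of the
// candidate values given in `options`.
function buildConfigMatrix(baseConfig, options) {
  let configs = [{ ...baseConfig }];
  for (const [key, values] of Object.entries(options)) {
    configs = configs.flatMap((config) =>
      values.map((value) => ({ ...config, [key]: value }))
    );
  }
  return configs;
}

const candidates = buildConfigMatrix(
  { 'metadata.broker.list': 'localhost:9092' }, // assumed broker address
  {
    'compression.codec': ['none', 'gzip', 'snappy', 'lz4'],
    'linger.ms': [0, 50, 200],
  }
);

console.log(candidates.length); // 12 (4 codecs x 3 linger values)
```

Each entry could then be passed to a new Producer and timed while sending a fixed, representative set of messages.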