What is a Kafka producer?
A Kafka producer is a client application that publishes (writes) records (messages) to Kafka topics. It's the entry point for data into the Kafka ecosystem, responsible for sending data efficiently and reliably to Kafka brokers.
Core Functionality
The primary role of a Kafka producer is to create and send messages, known as records, to a specified Kafka topic. Each record consists of a key (optional), a value (the actual message data), and a timestamp. Producers are designed to handle large volumes of data and ensure its delivery to Kafka clusters.
Key Concepts
- Topics: Producers send records to specific topics, which are categories or feeds to which records are published.
- Partitions: Within a topic, records are further organized into partitions. Producers determine which partition a record goes into, often based on the record's key (ensuring messages with the same key go to the same partition) or using a round-robin strategy.
- Serialization: Before sending, the producer serializes the record's key and value into bytes, as Kafka stores data as byte arrays.
- Acknowledgements (acks): Producers can configure their acknowledgement settings (e.g.,
acks=0,acks=1,acks=all) to control the durability guarantee required from the broker before considering a send successful. - Buffering: Producers buffer records in memory and send them in batches to improve efficiency and throughput.
How Producers Send Messages
When a producer sends a message, it first serializes the key and value. Then, it determines the target partition (either explicitly, based on the key using a default or custom partitioner, or round-robin). The producer then sends the batched messages to the leader replica of that partition. Based on the acks configuration, it waits for a response from the broker indicating successful receipt.
Example (Conceptual Java Code)
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
Producer<String, String> producer = new KafkaProducer<>(props);
for (int i = 0; i < 100; i++) {
producer.send(new ProducerRecord<>("my-topic", Integer.toString(i), "message " + i));
}
producer.close();
Important Producer Configurations
bootstrap.servers: A list of host/port pairs to use for establishing the initial connection to the Kafka cluster.key.serializer/value.serializer: Classes used to serialize the key and value of the record before sending them to the broker.acks: The number of acknowledgments the producer requires the leader to have received before considering a request complete.0(no wait),1(leader acknowledged),all(all in-sync replicas acknowledged).retries: The number of times the producer will retry sending a message if it fails.enable.idempotence: When set totrue, ensures that messages are written to Kafka exactly once, preventing duplicates even if retries occur.