Understanding Apache Kafka

Apache Kafka is a production-grade, high-performance, scalable, fault-tolerant messaging platform that provides the following three capabilities:

Publishing and subscribing to streams of records, similar to a message queue
Storing streams of records in a fault-tolerant, durable way
Processing streams of records

Apache Kafka is built on the following key concepts:

Kafka runs as a cluster of one or more servers, which can span multiple data centers
The Kafka cluster stores streams of records in categories called topics
Each record consists of a key, a value, and a timestamp

A topic in Apache Kafka is the core abstraction for a stream of records. It is multi-subscriber, meaning it can have zero, one, or many subscribers listening for data written to it. Each topic is maintained as a partitioned log: every partition is an ordered, immutable sequence of records that is continually appended to. Each record in a partition is assigned a sequential ID number known as the offset, which uniquely identifies that record within the partition.
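To make the relationship between records, partitions, and offsets concrete, here is a minimal in-memory sketch in plain Java. It is not the Kafka client API: the class names PartitionedLogSketch, Record, and Topic are hypothetical, and choosing a partition with String.hashCode() is a simplifying assumption that only imitates the idea of key-based partitioning.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model of a Kafka topic; not the Kafka client API.
class PartitionedLogSketch {

    // A record carries a key, a value, and a timestamp.
    static final class Record {
        final String key;
        final String value;
        final long timestamp;

        Record(String key, String value, long timestamp) {
            this.key = key;
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // A topic is a fixed set of partitions; each partition is an append-only list.
    static final class Topic {
        private final List<List<Record>> partitions = new ArrayList<>();

        Topic(int partitionCount) {
            for (int i = 0; i < partitionCount; i++) {
                partitions.add(new ArrayList<>());
            }
        }

        // Assumption: a plain hash of the key selects the partition.
        // Kafka's own partitioner differs in detail.
        int partitionFor(String key) {
            return Math.floorMod(key.hashCode(), partitions.size());
        }

        // Appends the record to the end of its partition and returns its offset,
        // which is simply the record's position in that partition.
        long append(Record record) {
            List<Record> log = partitions.get(partitionFor(record.key));
            log.add(record);
            return log.size() - 1;
        }

        // A record is addressed by the pair (partition, offset).
        Record read(int partition, long offset) {
            return partitions.get(partition).get((int) offset);
        }
    }

    public static void main(String[] args) {
        Topic orders = new Topic(3);
        long first = orders.append(
                new Record("customer-1", "order created", System.currentTimeMillis()));
        long second = orders.append(
                new Record("customer-1", "order paid", System.currentTimeMillis()));

        int partition = orders.partitionFor("customer-1");
        System.out.println("partition=" + partition + " offsets=" + first + "," + second);
        System.out.println(orders.read(partition, second).value);
    }
}
```

Because both records share a key, they land in the same partition and receive offsets 0 and 1 in the order they were appended. An offset is only meaningful within its own partition, which is why reading a record requires both values.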

Apache Kafka can be used both as a message queue and as a publish-subscribe system using the same topic concept. When a topic is used as a message queue, each message is consumed by only one consumer, even when multiple consumers listen to that topic, whereas in the publish-subscribe model each message is broadcast to all consumers.
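Kafka supports both models through consumer groups: every subscribing group sees each message, but only one consumer inside a group handles it. The following plain Java sketch illustrates that rule in memory. It is not the Kafka client API; the names ConsumerGroupSketch, Consumer, and Topic are hypothetical, and the per-message round-robin is a simplification, since Kafka itself assigns whole partitions to the members of a group.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of delivery semantics; not the Kafka client API.
class ConsumerGroupSketch {

    static final class Consumer {
        final String name;
        final List<String> received = new ArrayList<>();

        Consumer(String name) {
            this.name = name;
        }
    }

    static final class Topic {
        // Group id mapped to the consumers that joined that group.
        private final Map<String, List<Consumer>> groups = new LinkedHashMap<>();
        private long published = 0;

        void subscribe(String groupId, Consumer consumer) {
            groups.computeIfAbsent(groupId, id -> new ArrayList<>()).add(consumer);
        }

        // Every group gets the message, but only one member of each group handles it.
        // Simplification: members take turns per message (round-robin).
        void publish(String message) {
            for (List<Consumer> members : groups.values()) {
                Consumer target = members.get((int) (published % members.size()));
                target.received.add(message);
            }
            published++;
        }
    }

    public static void main(String[] args) {
        // Message queue: two consumers in ONE group share the messages.
        Topic queue = new Topic();
        Consumer a = new Consumer("a");
        Consumer b = new Consumer("b");
        queue.subscribe("workers", a);
        queue.subscribe("workers", b);
        queue.publish("m1");
        queue.publish("m2");
        System.out.println("queue:   a=" + a.received + " b=" + b.received);
        // prints: queue:   a=[m1] b=[m2]

        // Publish-subscribe: each consumer in its OWN group receives everything.
        Topic broadcast = new Topic();
        Consumer c = new Consumer("c");
        Consumer d = new Consumer("d");
        broadcast.subscribe("billing", c);
        broadcast.subscribe("audit", d);
        broadcast.publish("m1");
        broadcast.publish("m2");
        System.out.println("pub-sub: c=" + c.received + " d=" + d.received);
        // prints: pub-sub: c=[m1, m2] d=[m1, m2]
    }
}
```

The only difference between the two halves of main is how consumers are grouped. Putting all consumers in one group gives queue behavior, and giving each consumer its own group gives broadcast behavior, with no change to the topic itself.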

In this application, however, we will be using an Apache Kafka topic as a message queue. The Spring Kafka library will be used to simplify communication with Apache Kafka: it provides templates for producing messages to Apache Kafka topics and listeners for consuming messages from them.
