Understanding Apache Kafka

Apache Kafka is a production-grade, high-performance, scalable, fault-tolerant messaging platform that provides three capabilities:

  • Publishing and subscribing to streams of records, similar to a message queue
  • Storing streams of records in a fault-tolerant, durable way
  • Processing streams of records

Apache Kafka is built on the following key concepts:

  • Kafka runs as a cluster of one or more servers, which can span multiple data centers
  • The Kafka cluster stores streams of records in categories called topics
  • Each record consists of a key, a value, and a timestamp

A topic is Apache Kafka's core abstraction for a stream of records. It is multi-subscriber, meaning it can have zero, one, or many subscribers listening for data written to it. Each topic is stored as a partitioned commit log: every partition is an ordered, immutable sequence of records to which new records are only ever appended. Each record in a partition is assigned a sequential ID number, known as the offset, which uniquely identifies that record within the partition.
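To make the partitioned log and its offsets more concrete, here is a minimal in-memory sketch in plain Java. It is not Kafka code: the class names (LogRecord, PartitionLog, TopicLog) are hypothetical, and choosing a partition by hashing the key is a simplifying assumption, not a description of Kafka's own partitioner.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration types; none of these are Kafka classes.
final class LogRecord {
    final String key;
    final String value;
    final long timestamp;

    LogRecord(String key, String value, long timestamp) {
        this.key = key;
        this.value = value;
        this.timestamp = timestamp;
    }
}

// One partition: an append-only list in which a record's index is its offset.
final class PartitionLog {
    private final List<LogRecord> records = new ArrayList<>();

    // Appends to the end of the log and returns the offset assigned to the record.
    long append(LogRecord record) {
        records.add(record);
        return records.size() - 1;
    }

    // Reads the record at the given offset; stored records are never modified.
    LogRecord read(long offset) {
        return records.get((int) offset);
    }

    // The offset that the next appended record will receive.
    long endOffset() {
        return records.size();
    }
}

// A topic: a fixed number of partitions, each with its own offset sequence.
final class TopicLog {
    private final List<PartitionLog> partitions = new ArrayList<>();

    TopicLog(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new PartitionLog());
        }
    }

    // Assumption: a plain hash of the key selects the partition.
    int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), partitions.size());
    }

    long append(LogRecord record) {
        return partitions.get(partitionFor(record.key)).append(record);
    }

    PartitionLog partition(int index) {
        return partitions.get(index);
    }
}

class PartitionedLogDemo {
    public static void main(String[] args) {
        TopicLog topic = new TopicLog(3);
        long first = topic.append(new LogRecord("order-1", "created", System.currentTimeMillis()));
        long second = topic.append(new LogRecord("order-1", "paid", System.currentTimeMillis()));
        int p = topic.partitionFor("order-1");
        // Both records share a key, so they land in the same partition at offsets 0 and 1.
        System.out.println("partition " + p + ", offsets " + first + " and " + second);
        System.out.println("value at offset " + second + ": " + topic.partition(p).read(second).value);
    }
}
```

Because offsets are counted per partition, an offset identifies a record only together with its partition; two different partitions of the same topic can each hold a record at offset 0.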

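Apache Kafka can serve as both a message queue and a publish-subscribe system using the same topic concept. When a topic is used as a message queue, each message is consumed by only one consumer, even when multiple consumers listen to that topic. In the publish-subscribe model, by contrast, each message is broadcast to all subscribed consumers. Kafka expresses both models through consumer groups: consumers in the same group divide the topic's messages among themselves, while each separate group receives every message.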
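The difference between the two models can be illustrated with a small plain-Java simulation. This is only a sketch under stated assumptions: GroupSimulation is a hypothetical class, not part of Kafka or Spring Kafka, and it hands out messages round-robin within a group, whereas Kafka itself assigns whole partitions to the consumers of a group.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical simulation of consumer groups; not a Kafka or Spring Kafka API.
final class GroupSimulation {
    // Group id -> names of the consumers in that group.
    private final Map<String, List<String>> groups = new LinkedHashMap<>();
    // Consumer name -> messages it has received (names must be unique across groups).
    private final Map<String, List<String>> received = new LinkedHashMap<>();
    // Group id -> number of messages delivered to that group so far.
    private final Map<String, Integer> delivered = new LinkedHashMap<>();

    void subscribe(String groupId, String consumer) {
        groups.computeIfAbsent(groupId, g -> new ArrayList<>()).add(consumer);
        received.put(consumer, new ArrayList<>());
        delivered.putIfAbsent(groupId, 0);
    }

    // Every group sees the message; inside a group exactly one member receives it.
    void publish(String message) {
        for (Map.Entry<String, List<String>> group : groups.entrySet()) {
            List<String> members = group.getValue();
            int count = delivered.get(group.getKey());
            // Assumption: round-robin choice of the receiving member.
            String target = members.get(count % members.size());
            received.get(target).add(message);
            delivered.put(group.getKey(), count + 1);
        }
    }

    List<String> receivedBy(String consumer) {
        return received.get(consumer);
    }
}

class GroupSimulationDemo {
    public static void main(String[] args) {
        GroupSimulation sim = new GroupSimulation();
        // Two consumers in one group behave like workers on a queue.
        sim.subscribe("workers", "w1");
        sim.subscribe("workers", "w2");
        // A consumer in its own group behaves like an independent subscriber.
        sim.subscribe("audit", "a1");

        sim.publish("m1");
        sim.publish("m2");

        System.out.println("w1 " + sim.receivedBy("w1")); // prints "w1 [m1]"
        System.out.println("w2 " + sim.receivedBy("w2")); // prints "w2 [m2]"
        System.out.println("a1 " + sim.receivedBy("a1")); // prints "a1 [m1, m2]"
    }
}
```

Within the "workers" group each message reaches only one consumer, which is the queue behavior; the "audit" group still receives every message, which is the publish-subscribe behavior.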

In this application, however, we will use an Apache Kafka topic as a message queue. We will also use the Spring Kafka library to simplify communication with Apache Kafka; it provides templates and listeners for producing to and consuming from Apache Kafka topics.
