1. Topics and partitions

Topic

  • Similar to a table in a database, a topic is a category or feed name to which messages are published by producers and from which messages are consumed by consumers.
  • It serves as a way to organize and categorize the messages within the Kafka cluster
  • A particular stream of data
  • You cannot query topics; instead, you use Kafka producers to send data and Kafka consumers to read data

Partition

Each topic can be divided into multiple partitions, and each partition is an ordered, immutable sequence of messages. Partitioning helps parallelize processing and distribute the load across multiple consumers.
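To make the "ordered, immutable sequence" idea concrete, here is a minimal in-memory sketch. It is not Kafka code: the Topic and Partition classes and their methods are hypothetical names made up for illustration. It assumes each partition is an append-only list and that a record's position in that list plays the role of what Kafka calls its offset.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """Toy stand-in for a partition: an append-only list of records."""
    _records: list = field(default_factory=list)

    def append(self, value: bytes) -> int:
        """Append a record and return its offset (its position in the log)."""
        self._records.append(value)
        return len(self._records) - 1

    def read(self, offset: int, max_records: int = 10) -> list:
        """Return up to max_records records starting at offset, in write order."""
        return self._records[offset:offset + max_records]


@dataclass
class Topic:
    """Toy stand-in for a topic: a name plus a fixed number of partitions."""
    name: str
    num_partitions: int
    partitions: list = field(init=False)

    def __post_init__(self):
        self.partitions = [Partition() for _ in range(self.num_partitions)]


# Hypothetical usage: two records written to partition 0 of an "orders" topic.
topic = Topic("orders", num_partitions=3)
first_offset = topic.partitions[0].append(b"created")
second_offset = topic.partitions[0].append(b"paid")
```

Note that the sketch offers no way to update or delete a record in place, and each partition keeps its own independent sequence of offsets.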

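  • Once data is written to a partition, it cannot be changed (immutability)
  • Data is kept only for a limited time (default is one week, configurable)
  • If no key is provided, the producer chooses the partition itself, spreading messages across partitions; if a key is provided, messages with the same key always go to the same partition (as long as the number of partitions does not change)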
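The partition assignment rule can be sketched as follows. This is an illustration under stated assumptions, not Kafka's actual partitioner: choose_partition is a hypothetical helper, zlib.crc32 stands in for the murmur2 hash that Kafka's Java client applies to the key, and keyless messages are simply rotated round-robin.

```python
import itertools
import zlib


def choose_partition(key, num_partitions, counter):
    """Pick a partition index for a message.

    With a key: hash the key, so equal keys always map to the same partition.
    Without a key: rotate through the partitions (simple round-robin stand-in).
    """
    if key is None:
        return next(counter) % num_partitions
    # crc32 is an illustrative stand-in; Kafka's Java client uses murmur2.
    return zlib.crc32(key) % num_partitions


# Hypothetical usage with a 3-partition topic.
counter = itertools.count()
p1 = choose_partition(b"user-42", 3, counter)
p2 = choose_partition(b"user-42", 3, counter)
same_partition = (p1 == p2)  # the same key maps to the same partition
```

Because the partition is computed as hash modulo partition count, changing the number of partitions can move a key to a different partition, which is why the same-key guarantee only holds while the partition count stays fixed.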