Apache Kafka is a high-throughput, low-latency platform for handling real-time data feeds. Its storage layer is in essence a massively scalable pub/sub message queue designed as a distributed transaction log. It can be used to process streams of data in real-time, building up a commit log of changes.