You Probably Don't Need Kafka

The Meeting

You’re in a technical discussion. Someone says “we should use a message queue for that.” A few heads nod. Someone else says “Kafka?” More nodding. The architect draws some boxes and arrows on a whiteboard. You nod too, because the boxes and arrows make sense, but you’re not entirely sure why the arrows need to go through a box labeled “Kafka” instead of just… calling the other service directly.

I’ve been that person. And after finally sitting down and figuring out what all of this actually means, I’m convinced that most of the time, the answer to “should we use Kafka?” is “probably not.”

Why Queues Exist

The simplest version: a queue lets two services talk to each other without being available at the same time.

Without a queue:

Service A calls Service B → B is down → A fails
Service A calls Service B → B is slow → A is slow
Service A calls Service B → A sends faster than B can handle → B crashes

With a queue:

Service A puts message on queue → A moves on immediately
                                → B picks it up when it's ready
                                → If B is down, message waits
                                → If B is slow, messages buffer

That’s it. That’s the core value proposition. Decoupling. Service A doesn’t need to know or care about Service B’s availability, speed, or even existence. The queue is the buffer between them.
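
To make the buffering concrete, here is a minimal in-process sketch using only Python's standard library. `queue.Queue` stands in for a real message queue, and `run_demo` and `service_b` are names invented for this example, so treat it as an illustration of the idea rather than how any particular broker works.

```python
import queue
import threading
import time


def run_demo(num_messages: int = 5, consumer_delay: float = 0.01) -> tuple[float, list[int]]:
    """Service A enqueues and moves on; a slow Service B drains the queue later."""
    buffer: queue.Queue = queue.Queue()  # stand-in for the message queue
    processed: list[int] = []

    def service_b() -> None:
        while True:
            msg = buffer.get()
            if msg is None:  # sentinel: nothing more to process
                break
            time.sleep(consumer_delay)  # B is slow
            processed.append(msg)

    # Service A: put messages on the queue while B is still "down".
    start = time.perf_counter()
    for i in range(num_messages):
        buffer.put(i)
    producer_elapsed = time.perf_counter() - start  # A is finished at this point

    # Service B comes up afterwards and works through the backlog at its own pace.
    worker = threading.Thread(target=service_b)
    worker.start()
    buffer.put(None)
    worker.join()
    return producer_elapsed, processed
```

The time A spends is just the cost of the `put` calls. How slow B is only affects when the backlog clears, not how long A waits.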

Three Flavors of Messaging

This is where it gets confusing, because “message queue” is used loosely to describe three different patterns that solve different problems.

Point-to-Point (Task Queue)

One message, one consumer. You’re distributing work.

[Send Email Queue] → Worker 1 picks up email A
                   → Worker 2 picks up email B
                   → Worker 3 picks up email C

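Each email is handed to exactly one worker, and you add more workers to process the queue faster. (Most task queues retry failed jobs, so a job can occasionally run more than once; make handlers idempotent if a duplicate email would hurt.) This is what most people actually need when they say “message queue.” It’s what BullMQ, Sidekiq, and Solid Queue do for background jobs.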
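
A rough sketch of the same shape, again in-process with the standard library. `send_all` and the worker names are hypothetical, and a real task queue would add persistence and retries on top of this.

```python
import queue
import threading


def send_all(emails: list[str], num_workers: int = 3) -> list[tuple[str, str]]:
    """Distribute emails across workers; returns (email, worker_name) pairs."""
    jobs: queue.Queue = queue.Queue()
    handled: list[tuple[str, str]] = []
    lock = threading.Lock()

    def worker(name: str) -> None:
        while True:
            try:
                email = jobs.get_nowait()  # each get removes the job for everyone else
            except queue.Empty:
                return  # queue drained, worker exits
            with lock:
                handled.append((email, name))  # a real worker would send the email here

    for email in emails:
        jobs.put(email)

    threads = [
        threading.Thread(target=worker, args=(f"worker-{i + 1}",))
        for i in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return handled
```

Which worker ends up with which email is not deterministic. The only guarantee in this sketch is that every email is picked up by one worker and no email is picked up twice.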

Publish-Subscribe (Fan-Out)

One message, many consumers. You’re broadcasting events.

[User Signed Up] → Email service sends welcome email
                 → Analytics service tracks the event
                 → Notification service sends push notification

All three services get the same event. This is useful when multiple parts of your system need to react to the same thing, and you don’t want the publisher to know about all of them.
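
Here is a small sketch of fan-out, assuming a toy in-memory `EventBus` class (my own name, not a library). The point is that the publisher calls `publish` once and has no idea who is listening.

```python
from collections import defaultdict
from typing import Callable

Handler = Callable[[dict], None]


class EventBus:
    """Toy in-memory pub/sub: every subscriber of a topic gets every event."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> int:
        """Deliver the event to all subscribers; returns how many received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


bus = EventBus()
received: list[str] = []
bus.subscribe("user.signed_up", lambda e: received.append(f"email:{e['user']}"))
bus.subscribe("user.signed_up", lambda e: received.append(f"analytics:{e['user']}"))
bus.subscribe("user.signed_up", lambda e: received.append(f"push:{e['user']}"))

delivered = bus.publish("user.signed_up", {"user": "ada"})
```

Adding a fourth subscriber is one more `subscribe` call, and the publishing code does not change.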

Event Streaming (The Log)

Like pub/sub, but messages are stored and replayable. Consumers track their own position.

[Order Events Log] → Consumer A: caught up to event #5000
                   → Consumer B: replaying from event #3000
                   → Consumer C: real-time at event #5000

This is Kafka’s world. Messages don’t disappear after consumption — they sit in an append-only log until the retention period expires. Any consumer can go back and reprocess from any point. Useful for event sourcing, audit trails, and data pipelines.
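
The difference from a queue is easier to see in code. This is a minimal sketch with two invented classes, `EventLog` and `Consumer`. It ignores retention, partitions, and persistence and only shows the "log plus per-consumer offset" idea.

```python
class EventLog:
    """Append-only log: reading never removes anything."""

    def __init__(self) -> None:
        self._events: list[str] = []

    def append(self, event: str) -> int:
        self._events.append(event)
        return len(self._events) - 1  # offset of the event just written

    def read(self, offset: int, max_events: int = 100) -> list[str]:
        return self._events[offset:offset + max_events]


class Consumer:
    """Each consumer tracks its own position in the log."""

    def __init__(self, log: EventLog, offset: int = 0) -> None:
        self.log = log
        self.offset = offset

    def poll(self) -> list[str]:
        batch = self.log.read(self.offset)
        self.offset += len(batch)  # advance only this consumer's position
        return batch

    def seek(self, offset: int) -> None:
        self.offset = offset  # rewind (or skip ahead) to replay from any point
```

Two consumers over the same log never interfere with each other, and `seek(0)` followed by `poll()` reprocesses everything from the start.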

The Contenders

RabbitMQ — The Swiss Army Knife

RabbitMQ is a message broker. It receives messages, routes them based on rules, delivers them to consumers, and deletes them once acknowledged. Think of it as a smart post office.

The routing is where RabbitMQ shines. Messages go to an exchange, which routes them to queues based on rules:

| Exchange Type | What It Does | Example |
|---|---|---|
| Direct | Exact match on routing key | “Send to the payment queue” |
| Fanout | Broadcast to all bound queues | “Notify all subscribers” |
| Topic | Pattern matching | order.* matches order.created, order.cancelled |

I’ve touched RabbitMQ, and the mental model clicks pretty quickly if you think of it as: messages go in, routing rules decide where they end up, and the broker delivers them to whoever is consuming from each queue. Once a message is acknowledged, it’s gone.
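
Topic routing is the least obvious of the three, so here is a pure-Python sketch of the matching rule as I understand it from AMQP: routing keys are dot-separated words, `*` matches exactly one word, and `#` matches zero or more. `topic_matches` and `route` are hypothetical helpers written for this post, not part of any RabbitMQ client.

```python
def topic_matches(pattern: str, routing_key: str) -> bool:
    """Match a dot-separated routing key against a topic pattern."""

    def match(p: list[str], k: list[str]) -> bool:
        if not p:
            return not k  # pattern exhausted: match only if the key is too
        head, rest = p[0], p[1:]
        if head == "#":
            # '#' may absorb zero or more words; try every split point
            return any(match(rest, k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        return (head == "*" or head == k[0]) and match(rest, k[1:])

    return match(pattern.split("."), routing_key.split("."))


def route(bindings: dict[str, str], routing_key: str) -> list[str]:
    """Return the queues (keys of `bindings`) whose pattern matches the routing key."""
    return [q for q, pattern in bindings.items() if topic_matches(pattern, routing_key)]
```

With bindings `{"orders": "order.*", "audit": "#"}`, a message published as `order.created` lands in both queues, while `payment.created` lands only in `audit`.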

Throughput: roughly 10K-50K messages/second as a ballpark, depending on message size, persistence, and acknowledgement settings. Plenty for most applications.

Apache Kafka — The Distributed Log

Kafka is not a traditional message broker. Its servers are called brokers, but what they run is a distributed, append-only log that happens to support messaging patterns. That distinction matters.

In Kafka, messages are written to partitions within topics and stay there until the retention period expires (default: 7 days). Consumers don’t “receive” messages — they pull from the log and track their own position (offset).

The parallelism model is built around partitions:

  • A topic with 10 partitions can have up to 10 consumers in one group reading in parallel; an 11th would sit idle
  • Each partition is read by exactly one consumer in a group
  • Different consumer groups read independently (each gets all messages)
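
The partition rules above can be sketched in a few lines. This is a simplification under stated assumptions: `partition_for` uses `zlib.crc32` as a stand-in for the client's real key hash, `assign` does a plain round-robin rather than any of Kafka's actual assignment strategies, and both function names are my own.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Pick a partition from the message key (crc32 is a stand-in hash)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign(num_partitions: int, consumers: list[str]) -> dict[str, list[int]]:
    """Round-robin partitions over the consumers of one group.

    Every partition goes to exactly one consumer; consumers beyond the
    partition count end up with an empty list, i.e. they sit idle.
    """
    assignment: dict[str, list[int]] = {c: [] for c in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment
```

`assign(4, ["c1", "c2"])` gives each consumer two partitions, while `assign(2, ["c1", "c2", "c3"])` leaves `c3` with nothing to read. Because the same key always hashes to the same partition, per-key ordering survives even when the topic is spread across many consumers.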

Throughput: hundreds of thousands of small messages/second per broker, and millions across a cluster. Built for massive scale.

Google Cloud Pub/Sub — The Managed Option

If you’re on GCP and want pub/sub without managing infrastructure, this is it. No brokers, no clusters, no partitions to configure. Create a topic, publish messages, create subscriptions, consume. Google handles scaling.

Throughput: millions of messages/second, auto-scaled. Billing is by data volume rather than message count; the figure of ~$0.04 per million works out if each message is billed at about 1 KB.

The Comparison That Actually Matters

| Question | RabbitMQ | Kafka | Cloud Pub/Sub |
|---|---|---|---|
| Can I replay messages? | No (deleted after ACK) | Yes (offset-based) | Yes (timestamp-based) |
| Complex routing? | Yes (exchanges, bindings) | No (topics only) | No (topics only) |
| Throughput | 10K-50K msg/s | Millions msg/s | Millions msg/s |
| Operations burden | Medium | High | Zero |
| Message ordering | Per queue (FIFO) | Per partition | Per ordering key |
| Consumer model | Push (broker delivers) | Pull (consumer fetches) | Both |

So When Do You Actually Need Kafka?

Here’s the thing. Kafka came out of LinkedIn, which now pushes trillions of messages per day through it across a large distributed system. If that doesn’t sound like your startup with 500 users, that’s because it isn’t.

You need Kafka when:

  • You’re processing millions of messages per second (not thousands — millions)
  • Multiple independent systems need to replay the same event stream
  • You’re building real-time data pipelines (ETL, analytics, stream processing)
  • You need event sourcing — a complete, replayable history of everything that happened
  • You have a team that can operate a Kafka cluster (or you’re paying for Confluent Cloud)

You don’t need Kafka when:

  • You’re distributing background tasks → use a task queue (Redis-based or database-backed)
  • You need simple pub/sub with a few subscribers → RabbitMQ or Cloud Pub/Sub
  • You’re processing < 50K messages/second → RabbitMQ handles this fine
  • You don’t have a team to manage Kafka infrastructure → Cloud Pub/Sub
  • “Someone said we should use Kafka” → that’s not a requirement

The Scaling Ladder

"I need to send emails in the background"
  → Redis queue (BullMQ, Sidekiq) or database-backed (Solid Queue)

"I need services to communicate asynchronously"
  → RabbitMQ (self-hosted) or Cloud Pub/Sub (managed)

"I need complex routing — different messages to different consumers based on patterns"
  → RabbitMQ (this is its specialty)

"I need millions of messages per second with replay capability"
  → Kafka (self-managed) or Confluent Cloud (managed)

"I need all of the above and I'm on GCP"
  → Cloud Pub/Sub for most things, Kafka only for the streaming use case
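
If it helps, the ladder can be written down as a function. This is only my paraphrase of the rules above: the `Requirements` fields, the `recommend` name, and the one-million-per-second cutoff are assumptions I picked to make the sketch runnable, not hard limits.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    messages_per_second: int
    background_jobs_only: bool = False
    needs_complex_routing: bool = False
    needs_replay: bool = False
    on_gcp: bool = False
    can_operate_cluster: bool = False


def recommend(req: Requirements) -> str:
    """Walk the ladder from simplest tool to most complex."""
    if req.background_jobs_only:
        return "task queue (Redis-based or database-backed)"
    if req.needs_complex_routing:
        return "RabbitMQ"
    if req.needs_replay and req.messages_per_second >= 1_000_000:
        # Streaming scale with replay: the one rung where Kafka earns its cost.
        return "Kafka" if req.can_operate_cluster else "Confluent Cloud"
    if req.on_gcp:
        return "Cloud Pub/Sub"
    return "RabbitMQ"
```

Kafka shows up in exactly one branch, and you only reach it after ruling out everything simpler.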

The Pattern That Trips People Up

The Saga pattern — distributed transactions across microservices using events:

Order Service → [OrderCreated] → Payment Service
Payment Service → [PaymentProcessed] → Inventory Service
Inventory Service → [ItemReserved] → Shipping Service

Something fails?
→ Compensating events flow backward to undo each step

This is where queues genuinely shine. But you can implement Saga with RabbitMQ just as well as with Kafka. The pattern is about the architecture, not the specific queue technology. Don’t let someone convince you that you need Kafka for event-driven architecture — you need a message broker. Which one depends on your actual throughput and replay requirements.
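
To show that the pattern doesn't depend on the broker, here is the compensation logic with no broker at all. Note that this is an in-process, orchestrator-style sketch rather than the event-driven flow in the diagram, and `run_saga` plus the step names in the usage note are hypothetical.

```python
from typing import Callable

# (name, action, compensation)
Step = tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: list[Step]) -> list[str]:
    """Run steps in order; on failure, undo the completed ones in reverse."""
    trail: list[str] = []
    completed: list[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
        except Exception:
            trail.append(f"{name}:failed")
            # Compensate backward through everything that already succeeded.
            for done_name, _, compensate in reversed(completed):
                compensate()
                trail.append(f"{done_name}:compensated")
            return trail
        completed.append(step)
        trail.append(f"{name}:done")
    return trail
```

If an `inventory` step raises after `order` and `payment` succeeded, the trail ends with `payment:compensated` and then `order:compensated`. Swap the function calls for published events and you have the same saga on RabbitMQ, Kafka, or anything else that moves messages.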

The Honest Take

I don’t have deep production experience with any of these tools. I’ve touched RabbitMQ enough to understand the mental model. I know what Kafka is and when it makes sense. But the most useful thing I’ve learned from researching all of this is the decision framework — knowing which layer of the stack solves which problem, and not reaching for the most complex tool just because it’s the one everyone talks about.

Most applications will never outgrow a Redis-backed task queue. Many that need pub/sub will do fine with RabbitMQ or a managed service. Kafka is for a specific class of problems at a specific scale. If you’re not sure whether you need it, you probably don’t.

Author: Aaron Yong
Building things for the web. Writing about development, Linux, cloud, and everything in between.