# Apache Pulsar vs Kafka: Segment Storage, BookKeeper, and Tiered Storage

Kafka won the event-streaming world, and deservedly — but the thing teams complain about most isn't throughput, it's the operational shape of it. Scaling a Kafka cluster means moving partition data between brokers, because in Kafka the broker that serves a partition is also the broker that stores it. Add a broker and you rebalance terabytes; lose a broker and replicas re-replicate. **Apache Pulsar** was built around a different answer to that one structural question, and almost everything distinctive about Pulsar follows from it: *what if the servers that handle clients held no data at all?*

This is a comparison of architectures, not a winner declaration. I run Kafka in production and respect it. But Pulsar's design is genuinely different in ways worth understanding before you pick one, so I'll trace where they diverge — the broker/storage split, segment-centric storage on BookKeeper, tiered storage, and multi-tenancy — and then be honest about the cost of that flexibility. If you want the Kafka internals first, I've written them up [here](kafka-internals) and the performance story [here](kafka-performance-scalability).

## The one decision everything follows from

In Kafka, a partition is owned by a broker. That broker holds the partition's log on its local disks and serves reads and writes for it; other brokers hold replica copies. Compute and storage are fused on the same node. It's simple and it's fast — but it means the storage layer and the serving layer scale together whether you want them to or not, and rebalancing is a data-movement problem.

Pulsar splits them into two layers. **Brokers** handle clients — connections, topic ownership, dispatching messages, acknowledgements — but store nothing durably themselves. Underneath sits **Apache BookKeeper**, a separate distributed log-storage system whose nodes (called **bookies**) hold the actual data. A broker is effectively stateless: it owns a topic for now, and if it dies another broker picks up that topic instantly, because the data was never on the broker to begin with — it's safe in BookKeeper.

```mermaid
graph TD
    subgraph KAFKA["Kafka — compute and storage fused"]
        KB1["Broker 1<br/>owns + stores partition A"]
        KB2["Broker 2<br/>owns + stores partition B"]
    end
    subgraph PULSAR["Pulsar — compute and storage split"]
        PB["Brokers (stateless)<br/>own topics, dispatch, no data"]
        BK["BookKeeper bookies<br/>store the actual log segments"]
        PB --> BK
    end
          
```

The structural difference. In Kafka a broker both serves and stores its partitions, so scaling and recovery move partition data between brokers. In Pulsar the broker layer is stateless and sits on top of BookKeeper; a failed broker is replaced instantly because no data lived on it. The trade is one more distributed system to operate.
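
To make the contrast concrete, here is a toy model of what a broker failure costs in each design. It is a sketch under loud assumptions: `FusedCluster`, `SplitCluster`, the broker names, and the byte counts are all hypothetical, and neither class resembles real Kafka or Pulsar code. It only counts how much data has to be copied between brokers.

```python
from dataclasses import dataclass


@dataclass
class FusedCluster:
    """Kafka-like toy: a broker's disk holds the partitions it serves."""
    disks: dict  # broker -> {partition: bytes stored locally}

    def bytes_to_rebuild(self, failed: str) -> int:
        # Every copy that lived on the failed broker has to be rebuilt
        # on another broker to restore the replication factor.
        return sum(self.disks[failed].values())


@dataclass
class SplitCluster:
    """Pulsar-like toy: brokers hold ownership only; bytes live elsewhere."""
    owner: dict         # topic -> broker currently serving it
    stored_bytes: dict  # topic -> bytes held by the storage layer

    def fail_broker(self, failed: str, survivors: list) -> int:
        orphaned = sorted(t for t, b in self.owner.items() if b == failed)
        for i, topic in enumerate(orphaned):
            # Reassigning ownership is a metadata update, not a data copy.
            self.owner[topic] = survivors[i % len(survivors)]
        return 0  # bytes copied between brokers


kafka = FusedCluster(disks={"b1": {"A": 500, "C": 300}, "b2": {"B": 400}})
pulsar = SplitCluster(
    owner={"A": "b1", "B": "b2", "C": "b1"},
    stored_bytes={"A": 500, "B": 400, "C": 300},
)

print(kafka.bytes_to_rebuild("b1"))            # bytes that must move
print(pulsar.fail_broker("b1", ["b2", "b3"]))  # bytes that must move
print(pulsar.owner)                            # topics now served elsewhere
```

The toy leaves out everything that makes real failover take time (failure detection, ownership election, fencing the old writer), so read it as an illustration of where the bytes are, not of how fast recovery is.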

## Segment-centric vs partition-centric storage

The second divergence is subtler and arguably the deeper one. A Kafka partition's log is a sequence of files that live together on the owning broker's disk — the partition is the storage unit, and it's bounded by what fits on that broker. Pulsar stores a topic's log as a series of **segments** (BookKeeper ledgers), and crucially, **consecutive segments of the same topic can be spread across different bookies.** The log is not pinned to any one node.

That segment-centric model has real consequences:

- **A topic can exceed any single node's disk.** Its segments are scattered across the BookKeeper cluster, so capacity is the cluster's capacity, not one broker's.

- **Adding storage is immediate.** Bring up a new bookie and new segments start landing on it. There's no rebalancing of existing data to make a new node useful — it just starts taking writes.

- **Recovery spreads out.** When a bookie fails, the segments it held are re-replicated from copies on many other bookies in parallel, rather than one broker laboriously rebuilding a full partition.
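
The three consequences above can be sketched with a toy placement function. This is an illustration only: the round-robin rule, the two-copy default, and the names `place_segments` and `recovery_sources` are assumptions of the sketch, not BookKeeper's actual placement policy.

```python
def place_segments(first_id, count, bookies, copies=2):
    """Assign each segment to `copies` bookies, round-robin (toy policy)."""
    placement = {}
    for seg in range(first_id, first_id + count):
        placement[seg] = [bookies[(seg + r) % len(bookies)] for r in range(copies)]
    return placement


def recovery_sources(placement, failed):
    """Surviving bookies that hold a copy of something the failed bookie held."""
    sources = set()
    for holders in placement.values():
        if failed in holders:
            sources.update(h for h in holders if h != failed)
    return sources


# Segments 0-3 are written while the cluster has three bookies.
old = place_segments(0, 4, ["bk1", "bk2", "bk3"])

# A fourth bookie joins; segments 4-7 are written afterwards.
new = place_segments(4, 4, ["bk1", "bk2", "bk3", "bk4"])

log = {**old, **new}  # the topic's full log, spread over the cluster

print([seg for seg, holders in log.items() if "bk4" in holders])  # new bookie is in use
print(all("bk4" not in holders for holders in old.values()))      # old segments untouched
print(sorted(recovery_sources(log, "bk1")))                       # many recovery sources
```

Nothing in `old` is rewritten when `bk4` arrives, which is the "no rebalancing" point, and losing `bk1` pulls copies from several survivors at once rather than from a single peer.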

This is the same compute/storage-separation instinct that reshaped the warehouse world — the idea behind [Snowflake's architecture](snowflake-internals) — applied to a streaming log. You decouple the thing that scales with traffic (serving) from the thing that scales with retention (storage).

The clean way to hold the difference in your head: **Kafka asks "which broker owns this partition's data?"; Pulsar asks "which bookies hold this topic's current segments?"** Kafka's answer is one node and changes rarely (and expensively). Pulsar's answer is many nodes and changes constantly and cheaply. Nearly every operational contrast between the two traces back to that.

## Tiered storage: retention without buying brokers

Because Pulsar's storage is already a separate, segment-based layer, offloading old segments to cheap object storage was a natural extension rather than a bolt-on. **Tiered storage** moves sealed segments to S3, GCS, or Azure Blob automatically once the data a topic keeps in BookKeeper crosses a configured size threshold, while the topic stays fully readable: consumers reading old data are served transparently from object storage. You keep effectively unbounded history without sizing your bookie disks for it, which matters a lot for the "stream as the system of record, replay from the beginning" pattern.
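
A minimal sketch of the offload idea, assuming a size-based trigger: once the bytes kept on bookies exceed a threshold, the oldest sealed segments move to object storage, and reads keep working against whichever tier holds the segment. `Segment`, `offload`, `read`, and the sizes are hypothetical names and numbers for this sketch, not Pulsar's offloader API.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    seg_id: int
    size_mb: int
    sealed: bool               # only closed segments are eligible for offload
    tier: str = "bookkeeper"


def offload(segments, threshold_mb):
    """Move oldest sealed segments out until bookie usage fits the threshold.

    `segments` is assumed to be ordered oldest first.
    Returns the MB still held on bookies.
    """
    on_bookies = sum(s.size_mb for s in segments if s.tier == "bookkeeper")
    for seg in segments:
        if on_bookies <= threshold_mb:
            break
        if seg.sealed and seg.tier == "bookkeeper":
            seg.tier = "object-store"
            on_bookies -= seg.size_mb
    return on_bookies


def read(segments, seg_id):
    """Readers ask for a segment; the tier it lives on is an internal detail."""
    seg = next(s for s in segments if s.seg_id == seg_id)
    return (seg.seg_id, seg.tier)


log = [
    Segment(0, 100, sealed=True),
    Segment(1, 100, sealed=True),
    Segment(2, 100, sealed=True),
    Segment(3, 40, sealed=False),   # still being written
]

remaining = offload(log, threshold_mb=150)
print(remaining)
print([read(log, s.seg_id) for s in log])
```

The open segment is never offloaded, so the write path always stays on bookies; only history moves to the cheap tier.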

This is a real Pulsar advantage right now, and it's worth being precise about it: Kafka's equivalent capability for offloading log segments to object storage is in active development as KIP-405 (Tiered Storage) but isn't yet generally available in open-source Kafka. So as of early 2022, "unbounded retention on cheap storage, built in" is a genuine Pulsar differentiator, not marketing.

## Multi-tenancy and the messaging models

Pulsar was designed at Yahoo to be a single platform serving many teams, so multi-tenancy is built into the namespace hierarchy — `tenant/namespace/topic` — with isolation, quotas, and authorization at each level. Running one Pulsar cluster for the whole org with hard tenant boundaries is a first-class scenario; doing the equivalent on Kafka leans on conventions and external tooling.
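
The hierarchy shows up directly in topic names, which take the form `persistent://tenant/namespace/topic` (or `non-persistent://...`). Below is a small, strict parser as a sketch; `TopicName`, `parse_topic`, and the example tenant and namespace names are hypothetical, and the real client libraries do their own, more lenient parsing.

```python
from typing import NamedTuple


class TopicName(NamedTuple):
    domain: str      # "persistent" or "non-persistent"
    tenant: str
    namespace: str
    topic: str


def parse_topic(name: str) -> TopicName:
    """Split a fully qualified topic name into its hierarchy levels."""
    domain, sep, rest = name.partition("://")
    if not sep or domain not in ("persistent", "non-persistent"):
        raise ValueError(f"expected '<domain>://tenant/namespace/topic', got {name!r}")
    parts = rest.split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected tenant/namespace/topic, got {rest!r}")
    return TopicName(domain, *parts)


t = parse_topic("persistent://acme/payments/orders")
print(t.tenant, t.namespace, t.topic)
```

Because tenant and namespace are part of every topic's identity, policies can attach at those levels instead of being encoded in a naming convention.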

Pulsar also unifies two messaging styles that are usually separate products. Beyond the Kafka-style streaming (replayable log, consumer groups), it offers **queue-style** subscriptions where messages are distributed among consumers and individually acknowledged — closer to RabbitMQ. The subscription modes make this explicit:

| Subscription mode | Behavior | Analogous to |
| --- | --- | --- |
| Exclusive | One consumer reads the whole topic | Single-reader log |
| Failover | One active consumer, standby takes over on failure | HA log consumer |
| Shared | Messages round-robin across many consumers, each ack'd individually | Work queue (RabbitMQ) |
| Key_Shared | Shared, but same key always goes to same consumer | Ordered work queue |
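
The dispatch rules in the table can be sketched as a single routing function. This is a toy: `dispatch` and the consumer names are hypothetical, the "active consumer" for Exclusive and Failover is simply the first in the list, and the CRC32-modulo key routing is a stand-in for the hash-range assignment a real broker would use.

```python
import zlib


def dispatch(mode, messages, consumers):
    """Route (key, payload) messages to consumers according to `mode`."""
    delivered = {c: [] for c in consumers}
    for i, (key, payload) in enumerate(messages):
        if mode in ("exclusive", "failover"):
            target = consumers[0]                      # one active reader
        elif mode == "shared":
            target = consumers[i % len(consumers)]     # round-robin
        elif mode == "key_shared":
            # Stand-in hash: the same key always maps to the same consumer.
            target = consumers[zlib.crc32(key.encode()) % len(consumers)]
        else:
            raise ValueError(f"unknown mode: {mode}")
        delivered[target].append(payload)
    return delivered


msgs = [("user-1", "a"), ("user-2", "b"), ("user-1", "c"), ("user-2", "d")]
workers = ["c1", "c2"]

for mode in ("exclusive", "shared", "key_shared"):
    print(mode, dispatch(mode, msgs, workers))
```

In Shared mode the two `user-1` messages can land on different consumers, so per-key ordering is lost; Key_Shared keeps them together at the cost of less even load.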

Individual acknowledgement in Shared mode is something Kafka's offset-based model can't do cleanly: a Kafka consumer group tracks a single advancing committed offset per partition, not per-message acks. If your workload is really a work queue with out-of-order completion, Pulsar fits it natively where Kafka makes you contort.
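
A small sketch of why out-of-order completion hurts under an offset model. Assume five messages, of which all but message 1 have finished processing when the consumer restarts. The function names are hypothetical and the model ignores batching and timeouts; it only compares what each bookkeeping scheme would hand out again.

```python
def committed_offset(done):
    """Offset model: the position can only advance over a gap-free prefix."""
    n = 0
    while n in done:
        n += 1
    return n  # next offset to read after a restart


def redelivered_with_offsets(done, total):
    # Everything from the committed offset onward is read again,
    # including messages that had already finished.
    return list(range(committed_offset(done), total))


def redelivered_with_acks(done, total):
    # Per-message acks: only messages that were never acked come back.
    return [m for m in range(total) if m not in done]


done = {0, 2, 3, 4}   # message 1 is still in flight
total = 5

print(committed_offset(done))
print(redelivered_with_offsets(done, total))
print(redelivered_with_acks(done, total))
```

One slow message pins the offset and forces reprocessing of everything behind it, while per-message acks redeliver only the straggler.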

**The flexibility has a bill, and it's operational.** Pulsar is two distributed systems — the broker layer and BookKeeper — each with its own failure modes, tuning, and metadata (both lean on ZooKeeper at this point). That's more moving parts to understand, monitor, and debug than Kafka's single broker tier, and the talent pool and battle-tested tooling around Kafka are far larger. I've seen the broker/bookie split pay for itself at genuine scale and with real multi-tenant or unbounded-retention needs. I've also seen teams adopt Pulsar for a workload Kafka would have handled on autopilot, and spend their first six months learning BookKeeper instead of shipping. Match the architecture to a need you actually have.

## The comparison, condensed

| Dimension | Kafka | Pulsar |
| --- | --- | --- |
| Storage model | Partition pinned to broker (compute + storage fused) | Segments over BookKeeper (compute/storage split) |
| Scaling storage | Add broker, rebalance partition data | Add bookie, new segments land immediately |
| Broker failure | Replica promoted; re-replication of partitions | Stateless broker replaced instantly |
| Tiered storage | In progress (KIP-405), not yet GA in OSS | Built in |
| Messaging models | Streaming log | Streaming log + queue (per-message ack) |
| Multi-tenancy | By convention + external tooling | First-class (tenant/namespace) |
| Operational surface | One broker tier + ZooKeeper (KRaft emerging) | Brokers + BookKeeper + ZooKeeper |
| Ecosystem / maturity | Vast — connectors, tooling, expertise | Smaller but capable and growing |

## What to carry away

Pulsar and Kafka answer the same question — durable, ordered, replayable event streams — with a different structural decision. Kafka **fuses compute and storage** on the broker, which is simple and fast and makes scaling a data-movement problem. Pulsar **separates them**: stateless brokers over BookKeeper, with segment-centric storage that isn't pinned to any node. From that one split fall its real advantages — instant broker replacement, storage you can grow without rebalancing, built-in tiered storage for unbounded retention, native queue semantics, and first-class multi-tenancy.

The cost is just as real: another distributed system to run, a smaller ecosystem, and fewer people who've operated it at 3 a.m. Reach for Pulsar when you have a concrete need its architecture serves — multi-tenant consolidation, unbounded retention on cheap storage, mixed queue-and-stream workloads. If your need is "a fast, durable log," Kafka's fused model is less to operate and the safer default. The interesting thing about 2022 is how much each is borrowing from the other — Kafka chasing tiered storage and shedding ZooKeeper, Pulsar hardening its ecosystem — which tells you the architectural debate isn't settled yet.
