Consuming CDC Events: Patterns for Building Read Models
Event handling, idempotency, and recovery strategies for CDC consumers
From Infrastructure to Implementation
In the previous article, we covered the infrastructure side of MongoDB CDC with Debezium - the oplog retention problem, why Atlas shared tiers fail, and how to configure a production-ready connector. Now let's focus on what happens after events land in Kafka: how to consume them reliably and build read-optimized projections.
The goal is simple: react to changes in your authoritative data store and maintain derived views for specific query patterns. But "simple" doesn't mean "easy" - you'll need to handle duplicates, failures, and recovery scenarios.
Event Handling Pattern
Once events flow through Kafka, your consumers need to handle them reliably. The core pattern is straightforward:
Algorithm 1 ProcessCDCEvent
Require: e: Debezium CDC event from Kafka
Ensure: Updated search index state
1: op ← e.op
2: doc ← e.after
3: if op ∈ {c, u} then
4:     key ← doc.id
5:     Upsert(key, doc)
6: else if op = d then
7:     Delete(e.key)
8: end if
9: Commit(offset)
10: return
Key Patterns
- Operation filtering: Debezium events include an op field - c (create), u (update), d (delete) - and the handler routes on it, as in the sketch below
- Offset handling: commit the offset only after successful processing; on failure, send the event to a dead letter queue
- Idempotency: use upsert operations - handlers must tolerate duplicate events (Kafka at-least-once delivery)
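To make the routing concrete, here is a minimal consumer-loop sketch in Python. It assumes confluent-kafka and redis-py; the bootstrap server, topic name, consumer group, and key prefix are illustrative, and the Debezium envelope parsing is simplified (the exact layout depends on your converter and capture-mode settings).

```python
# Minimal consumer-loop sketch. Assumes confluent-kafka and redis-py; topic,
# group id, and key prefix are illustrative. Debezium envelope parsing is
# simplified and depends on your converter configuration.
import json

import redis
from confluent_kafka import Consumer

r = redis.Redis(decode_responses=True)
consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "search-indexer",
    "enable.auto.commit": False,   # commit manually, only after successful processing
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["mongo.appdb.users"])   # hypothetical Debezium topic


def handle(event: dict, doc_id: str) -> None:
    op = event["op"]                      # c = create, u = update, d = delete, r = snapshot read
    if op in ("c", "u", "r"):
        doc = json.loads(event["after"])  # full document, serialized as JSON by the connector
        r.hset(f"user:{doc_id}", mapping={k: str(v) for k, v in doc.items()})
    elif op == "d":
        r.delete(f"user:{doc_id}")        # idempotent: deleting a missing key is a no-op


while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    try:
        event = json.loads(msg.value())       # simplified: assumes schemas are disabled
        doc_id = json.loads(msg.key())["id"]  # Debezium keys MongoDB records by document id
        handle(event, doc_id)
        consumer.commit(message=msg, asynchronous=False)   # at-least-once semantics
    except Exception:
        # A production handler would forward the record to a dead letter topic
        # (see the failure-modes section) and then commit, so the partition
        # does not stall on a poison message.
        continue
```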
Building Read Models
The search service consumes CDC events to maintain a read-optimized index. Here's the pattern using Redis with RediSearch:
Algorithm 2 BuildSearchIndex
Require: e: CDC event with user data
1: key ← "user:" + e.id
2: fields ← Flatten(e.after)
3: HSET(key, fields) // RediSearch auto-indexes
4: if e.op = d then
5:     DEL(key)
6: end if
The key insight: RediSearch watches hash keys with a matching prefix and automatically maintains the full-text index. Your CDC handler just needs to write/delete hashes.
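A sketch of what that looks like with redis-py's search commands follows. The index name (idx:users), key prefix (user:), and field list are assumptions for illustration; the point is that the index definition binds to the prefix, so plain HSET and DEL calls are all the handler needs.

```python
# Prefix-based RediSearch index sketch (redis-py 4.x+ search API).
# Index name, prefix, and fields are illustrative assumptions.
import redis
from redis.commands.search.field import TagField, TextField
from redis.commands.search.indexDefinition import IndexDefinition, IndexType

r = redis.Redis(decode_responses=True)


def ensure_index() -> None:
    """Create the index if it does not exist; an existing index is left untouched."""
    try:
        r.ft("idx:users").info()
    except redis.ResponseError:
        r.ft("idx:users").create_index(
            fields=[TextField("name"), TextField("email"), TagField("country")],
            definition=IndexDefinition(prefix=["user:"], index_type=IndexType.HASH),
        )


ensure_index()
# The CDC handler only writes and deletes hashes under the prefix;
# RediSearch keeps the full-text index in sync automatically.
r.hset("user:42", mapping={"name": "Ada Lovelace", "email": "ada@example.com", "country": "uk"})
print(r.ft("idx:users").search("ada").total)   # -> 1 once the hash is indexed
r.delete("user:42")                            # also drops it from the index
```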
Recovery Mechanism: Batch Rebuild
Even with reliable oplog retention, you need a fallback for scenarios like Redis data loss or index corruption. Implement a batch rebuild endpoint:
Algorithm 3 RebuildSearchIndex
Require: limit: number of records per batch
Require: offset: starting position
Ensure: Rebuilt search index
1: records ← FetchFromSource(limit, offset) // Bypasses CDC
2: EnsureIndexExists()
3: for record in records do
4:     Upsert(record)
5: end for
6: return
This runs in parallel with CDC processing, allowing index recovery without stopping the real-time pipeline.
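A minimal sketch of that endpoint's core, assuming pymongo; the URI, database, collection, and key prefix are illustrative, and the RediSearch index is assumed to exist already (ensure_index from the sketch above).

```python
# Batch-rebuild sketch: read directly from MongoDB (bypassing CDC) and upsert
# into Redis. URI, database, collection, and key prefix are assumptions.
# Assumes the RediSearch index already exists (ensure_index() from the sketch above).
import redis
from pymongo import MongoClient

mongo = MongoClient("mongodb://localhost:27017")
r = redis.Redis(decode_responses=True)


def rebuild_batch(limit: int, offset: int) -> int:
    """Rebuild one page of the read model; safe to re-run thanks to idempotent upserts."""
    users = mongo["appdb"]["users"]
    count = 0
    for doc in users.find().skip(offset).limit(limit):   # bypasses the CDC pipeline entirely
        r.hset(
            f"user:{doc['_id']}",
            mapping={k: str(v) for k, v in doc.items() if k != "_id"},
        )
        count += 1
    return count
```

A thin HTTP endpoint can wrap this function, looping with increasing offsets until a page returns fewer than limit records; because the upserts are idempotent, the rebuild can run while CDC events continue to flow.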
Data Ownership and Cross-Service CQRS
It's critical that the single source of truth and authoritative owner of the data remains the user service. In this architecture, we follow the database-per-microservice pattern. You could argue that search isn't a separate domain from the user service - however, search has shared concerns across many services (posts, products, users) and requires information from multiple systems to build a rich search experience.
This pattern can be viewed as a form of cross-service CQRS: separating read and write models where the write model lives in the authoritative service (user service with MongoDB) and the read model lives in a specialized query service (search service with Redis). The CDC pipeline is the synchronization mechanism between these models.
Recovery as a First-Class Concern
Since we're dealing with data replication, things can and will go wrong: data becomes stale, out of sync, or corrupted. Your system must be built to be recoverable from the start. Design for idempotent recovery paths - the ability to rebuild any downstream read model from scratch without corrupting state or losing data.
The batch rebuild API shown above is just one example. In a later post, I'll cover how to make this architecture fully fault-tolerant: essentially treating downstream services as throwaway systems that can be recreated or recovered from an arbitrary state to a consistent state at any time.
Failure Modes and Recovery
| Scenario | Behavior | Recovery |
|---|---|---|
| Debezium connector down | CDC pauses, resume token retained | Restart connector, process backlog |
| Oplog rotation (resume token lost) | snapshot.mode: when_needed triggers a re-snapshot | Automatic re-snapshot (Atlas M30+ oplog retention prevents this) |
| Search service down | Events accumulate in Kafka | Restart service, consume from last offset |
| Redis index corrupted | Search API fails | Trigger batch rebuild API |
| Handler throws error | Message sent to DLQ | Fix code, replay DLQ |
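For the "handler throws error" row, a sketch of the dead-letter handoff: the failed record is republished verbatim to a DLQ topic with diagnostic headers so it can be replayed after the fix. The topic name and header keys are illustrative assumptions.

```python
# Dead-letter handoff sketch for the "handler throws error" row above.
# Topic name and header keys are illustrative assumptions.
from confluent_kafka import Producer

dlq_producer = Producer({"bootstrap.servers": "localhost:9092"})


def send_to_dlq(msg, error: Exception) -> None:
    """Republish the failed record verbatim, with diagnostics, for later replay."""
    dlq_producer.produce(
        topic="mongo.appdb.users.dlq",
        key=msg.key(),
        value=msg.value(),
        headers={
            "error": str(error),
            "source.topic": msg.topic(),
            "source.partition": str(msg.partition()),
            "source.offset": str(msg.offset()),
        },
    )
    dlq_producer.flush()
```

Replay then amounts to pointing the same consumer loop at the DLQ topic once the handler bug is fixed; because the handlers are idempotent upserts, replaying partially processed events is harmless.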
Conclusion
Building CDC consumers is fundamentally about handling uncertainty gracefully. Events may arrive out of order, duplicates will happen, and your downstream stores can fail at any time.
The patterns covered here - idempotent upserts, operation-based routing, dead letter queues, and batch rebuild APIs - form a foundation for reliable event processing. The key insight is treating your read models as disposable: if you can rebuild them from scratch at any time, you're resilient to almost any failure mode.
Combined with the infrastructure setup from Part 1, you now have a complete picture of production CDC with MongoDB, Kafka, and Debezium.