Rate Limiter Pattern on DData (PNCounterMap with Sharding + Rotation) #2053

hanishi started this conversation in General
Aug 23, 2025 · 1 comment · 7 replies

hanishi
Aug 23, 2025

I've been thinking of building a cluster-wide, approximate rate limiter using Pekko Distributed Data (PNCounterMap).

Basic idea

  1. Time-bucketed windows: derive windowId = now / windowMs.
  2. Sharding: hash the key (e.g. "ip:1.2.3.4" or "imp|pub|slot") into N buckets, giving multiple PNCounterMapKeys per window.
    Operation:
    - On Allow(key): read the counter for this (windowId, bucket).
    - If count(key) < capacity, increment and return true.
    - Otherwise, return false.
  3. Consistency: use ReadLocal / WriteLocal for speed; accept small overshoot under race because state converges via gossip.
  4. Rotation / expiration: only keep now and now-1 windows live; stop touching older windows so they naturally expire from gossip (avoid mass per-entry remove() and tombstone buildup).
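A minimal single-node sketch of steps 1–3, just to make the decision logic concrete. A plain `ConcurrentHashMap` stands in for the replicated `PNCounterMap`; in a real Pekko DData setup the increment would be a `Replicator.Update` with `ReadLocal`/`WriteLocal` on a `PNCounterMapKey`. All names here (`WINDOW_MS`, `PARTS`, `CAPACITY`, `allow`) are illustrative assumptions, not the actual design:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class RateLimiterSketch {
    static final long WINDOW_MS = 60_000L; // 1-minute windows (assumed)
    static final int PARTS = 32;           // shard buckets per window (assumed)
    static final long CAPACITY = 100L;     // max events per key per window (assumed)

    // Counters keyed by "windowId:bucket:key"; stands in for PNCounterMap state.
    private final Map<String, Long> counters = new ConcurrentHashMap<>();

    static long windowId(long nowMs) { return nowMs / WINDOW_MS; }

    // Hash the key into one of PARTS buckets (mask keeps the index non-negative).
    static int bucket(String key) { return (key.hashCode() & Integer.MAX_VALUE) % PARTS; }

    /** Read-then-increment; concurrent callers may overshoot slightly,
     *  matching the "accept small overshoot" trade-off in step 3. */
    public boolean allow(String key, long nowMs) {
        String slot = windowId(nowMs) + ":" + bucket(key) + ":" + key;
        long count = counters.getOrDefault(slot, 0L);
        if (count < CAPACITY) {
            counters.merge(slot, 1L, Long::sum);
            return true;
        }
        return false;
    }
}
```

Rotation (step 4) falls out of the keying scheme: once `windowId` advances, older slots are simply never touched again, so nothing has to issue per-entry removes.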

Why

  • Designed for things like impression/click throttling in an ad network.
  • Goal is best-effort flood control, not strict global counters.
  • Avoids external systems (Redis/Kafka). Runs cluster-wide with only DData.

Questions for the community

  • Has anyone else used this shard + rotate pattern to manage high-cardinality keys with DData?

  • Any pitfalls around tombstones or gossip overhead when rotating many CRDT keys per minute/hour?

  • Roughly how many concurrent CRDT instances (PNCounterMapKeys) are considered safe in production practice?

Replies: 1 comment · 7 replies

He-Pin
Aug 23, 2025
Collaborator

I think it'd be better to go with Redis.


hanishi Aug 23, 2025
Author

On second thought, I may not even need cluster-wide rate limiting. Each serve URL is already HMAC-signed and contains a valid-until timestamp. That means expired URLs are automatically rejected, and replays outside the window can't pass. So I could just accept impressions/clicks within that short validity window, tolerate duplicates there, and rely on probabilistic aggregation downstream (e.g. HyperLogLog) for analytics. This would keep the system much simpler and avoid gossip/tombstone overhead altogether.
Still, it seems interesting to experiment with adding a DData-based limiter on top, just to see how far the shard+rotate pattern can go. Even if it's not essential, it could be a good case study of DData feasibility under high-churn keys.
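For illustration, the HMAC + valid-until check described above could look roughly like this. The payload layout (`path|exp`), the query-parameter names, and the class name are my assumptions; the actual serve-URL format is not shown in the discussion:

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class SignedUrl {
    // HMAC-SHA256 over the payload, URL-safe base64-encoded without padding.
    private static String hmac(byte[] secret, String payload) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            return Base64.getUrlEncoder().withoutPadding()
                    .encodeToString(mac.doFinal(payload.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // Issue a URL carrying its expiry and a signature binding path + expiry.
    public static String sign(byte[] secret, String path, long validUntilMs) {
        return path + "?exp=" + validUntilMs + "&sig=" + hmac(secret, path + "|" + validUntilMs);
    }

    /** Rejects expired URLs and forged signatures; replays inside the short
     *  validity window are tolerated, as the comment above suggests. */
    public static boolean verify(byte[] secret, String path, long validUntilMs,
                                 String sig, long nowMs) {
        return nowMs <= validUntilMs && hmac(secret, path + "|" + validUntilMs).equals(sig);
    }
}
```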


hanishi Aug 24, 2025
Author

I'm experimenting with a dedicated project that uses JMH to see how it goes. I will share it when I am done, if anyone is interested.


hanishi Aug 25, 2025
Author

It's just a simulation harness to see how a DData-based rate limiter could work in practice.
The goal is to observe denial patterns and cleanup behavior.

https://github.com/hanishi/ddata-limiter-simulation


hanishi Sep 9, 2025
Author

What about a Bloom-filter-based rate limiter? The idea is to implement a distributed rate limiter and replay guard without keeping explicit counters, instead windowing with Bloom filters: each shard entity manages a current and a previous time window, recording whether a given nonce has been seen. If a nonce appears again in the same window it is denied, and anything older than the previous window is automatically rejected. This approach makes grants/denials a function of time-bucketed membership rather than precise counting, and it scales cleanly across a cluster.

To survive shard passivation or rebalancing, each entity periodically publishes a compact Bloom snapshot into Pekko DData, so that a new incarnation can immediately warm-start with the recent windows instead of forgetting its history. The result is a fast, memory-efficient limiter that tolerates rebalances and skew while providing strong deduplication semantics. Each shard manages a disjoint partition of the input key-space (e.g., part = crc32(key) % parts) and therefore holds the dedup state only for its slice.
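A single-node sketch of the two-window Bloom dedup. The filter here is a minimal double-hash Bloom filter over a `BitSet`; filter size, hash count, hash scheme, and the class name are illustrative assumptions, and the periodic DData snapshotting is omitted:

```java
import java.util.BitSet;

public class WindowedBloomDedup {
    private final long windowMs;
    private final int bits = 1 << 16; // filter size in bits (assumed)
    private final int hashes = 4;     // probes per nonce (assumed)
    private long currentWindow = Long.MIN_VALUE;
    private BitSet current = new BitSet(bits);
    private BitSet previous = new BitSet(bits);

    public WindowedBloomDedup(long windowMs) { this.windowMs = windowMs; }

    // Double hashing: k probe positions from two base hashes (h2 forced odd).
    private int[] indexes(String nonce) {
        int h1 = nonce.hashCode();
        int h2 = new StringBuilder(nonce).reverse().toString().hashCode() | 1;
        int[] idx = new int[hashes];
        for (int i = 0; i < hashes; i++) idx[i] = ((h1 + i * h2) & Integer.MAX_VALUE) % bits;
        return idx;
    }

    // Current becomes previous on a one-step advance; anything older is dropped.
    private void rotateTo(long window) {
        if (window != currentWindow) {
            previous = (window == currentWindow + 1) ? current : new BitSet(bits);
            current = new BitSet(bits);
            currentWindow = window;
        }
    }

    /** Grant iff the nonce was not seen in the current or previous window. */
    public boolean allow(String nonce, long nowMs) {
        rotateTo(nowMs / windowMs);
        int[] idx = indexes(nonce);
        boolean inCurrent = true, inPrevious = true;
        for (int i : idx) { inCurrent &= current.get(i); inPrevious &= previous.get(i); }
        for (int i : idx) current.set(i);
        return !(inCurrent || inPrevious);
    }
}
```

Note the usual Bloom-filter trade-off applies: false positives cause occasional spurious denials, but there are no false negatives, so a seen nonce is never granted twice within the two live windows.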


hanishi Sep 21, 2025
Author
