Cardinality Limiter
- class sentry_redis_tools.cardinality_limiter.RedisCardinalityLimiter(client, num_shards=3, num_physical_shards=3, metric_tags=None, metrics_backend=None)
The Redis cardinality limiter stores a key per unit hash and adds the unit hash to a set key prefixed with RequestedQuota.prefix. Memory usage grows with the number of distinct prefixes and with the configured quota limit per prefix (i.e. the maximum cardinality of GrantedQuota.granted_unit_hashes).
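As a rough illustration of that storage layout, the following sketch shows the general idea of set-based cardinality limiting with redis-py. The key name and the helper function are hypothetical and do not reflect the library's actual key schema or sharding:

```python
# Minimal sketch of set-based cardinality limiting, assuming a hypothetical
# key name -- NOT the library's actual key schema.
import redis

r = redis.Redis()

def is_within_limit(prefix: str, unit_hash: int, limit: int) -> bool:
    set_key = f"cardinality:{prefix}"  # hypothetical key name
    if r.sismember(set_key, unit_hash):
        # hashes that were admitted before never count against the limit again
        return True
    if r.scard(set_key) >= limit:
        # the set is "full": new unit hashes are rejected
        return False
    r.sadd(set_key, unit_hash)
    return True
```

Note that this naive version is not atomic; as discussed below, the limiter deliberately does not rely on atomic check-and-increments.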
Many design decisions for this cardinality limiter were copied from the sliding_windows rate limiter. You will find the next couple of paragraphs in its documentation as well.
Why are checking quotas and using quotas two separate operations? Isn’t that a time-of-check-time-of-use bug that allows me to over-spend quota when requests happen concurrently?
It’s desirable to first check quotas, then perform a potentially fallible operation, and only then consume quotas. This rate limiter is primarily going to be used inside the metrics string indexer to rate-limit database writes. What we want to do there is: read DB, check rate limits, write to DB, use rate limits.
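A hedged sketch of that flow, assuming check_within_quotas/use_quotas as the names of the check and use steps (the exact signatures may differ) and hypothetical DB helpers:

```python
# Sketch of the "read DB, check rate limits, write to DB, use rate limits"
# flow. The method names check_within_quotas/use_quotas and the DB helpers
# are assumptions for illustration, not verbatim indexer code.

def read_db(requests):
    ...  # hypothetical: resolve strings that are already indexed

def write_db(grants):
    ...  # hypothetical: write the granted strings to the database

def index_strings(limiter, requests):
    read_db(requests)                                           # 1. read DB
    timestamp, grants = limiter.check_within_quotas(requests)   # 2. check rate limits
    write_db(grants)                                            # 3. write to DB (fallible)
    limiter.use_quotas(grants, timestamp)                       # 4. use rate limits last
```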
If we did it any other way (the obvious alternative being: read DB, check-and-use rate limits, write DB), crashes while writing to the database could over-consume quotas. This is not a big problem if those crashes are flukes, and especially not a problem if the crashes are the result of an overloaded DB.
It is, however, a problem if the consumer is crash-looping, or crashing (quickly) for 100% of requests (e.g. due to schema mismatches between code and DB that somehow don’t surface during the DB read). In that case the quotas would be consumed immediately, and incident recovery would require us to reset all quotas manually (or disable rate limiting via some killswitch).
The Redis backend (really the only backend we care about) already has some consistency problems.
Redis only provides strong consistency and the ability to check-and-increment counters when all involved keys hit the same Redis node. That means that a quota with prefix="org_id:123" can only run on a single Redis node. It also means that a global quota (prefix="global") would have to run on a single (unchangeable) Redis node to be strongly consistent, which is a problem for scalability.
There’s no obvious way to make global quotas consistent with per-org quotas this way, so right now requests containing multiple quotas with different prefixes cannot be checked-and-incremented atomically, even if we were to change the rate limiter’s interface.
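To make the constraint concrete, here is a hedged sketch of an atomic check-and-increment written as a Lua script. In Redis Cluster such a multi-key script is only accepted when all keys hash to the same slot (e.g. via a shared hash tag), which is exactly what a per-org key and a global key cannot do without pinning everything to one node. The key names and limits are hypothetical:

```python
# Hypothetical atomic check-and-increment over a per-org and a global counter.
import redis

CHECK_AND_INCR = """
local org  = tonumber(redis.call('GET', KEYS[1]) or '0')
local glob = tonumber(redis.call('GET', KEYS[2]) or '0')
if org >= tonumber(ARGV[1]) or glob >= tonumber(ARGV[2]) then
    return 0
end
redis.call('INCR', KEYS[1])
redis.call('INCR', KEYS[2])
return 1
"""

r = redis.Redis()
check_and_incr = r.register_script(CHECK_AND_INCR)

# Works on a single node; in a cluster, "quota:org_id:123" and "quota:global"
# generally live on different nodes, so this atomic variant is not available
# without giving up on distributing the keys.
granted = check_and_incr(keys=["quota:org_id:123", "quota:global"], args=[100, 10000])
```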
Pinning related keys to the same node would be easy enough to do, but because of the above we currently don’t control Redis sharding at all, meaning that even keys within a single quota’s window will hit different Redis nodes. This also helps further distribute the load internally.
Since we have given up on atomic check-and-increments in general anyway, there’s no reason to explicitly control sharding.
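As a loose illustration of how splitting keys relates to load distribution (an assumption about the general idea behind the num_shards/num_physical_shards parameters, not the actual key layout), a prefix's set can be split into several smaller set keys that Redis then places however it likes:

```python
# Hypothetical sharded key layout: splitting a prefix's set over several
# shard keys lets Redis-controlled sharding spread the load, while the total
# cardinality is still just the sum of SCARD over the shard keys.
def shard_key(prefix: str, unit_hash: int, num_shards: int = 3) -> str:
    return f"cardinality:{prefix}:shard:{unit_hash % num_shards}"
```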