Version-Aware Decision Cache
Summary
The Access Gateway uses a distributed cache to reduce latency and minimize load on downstream authorities. The cache stores permission decisions, exchanged tokens, and organization context. All caches use subject-based keys, so a token refresh for the same user still produces a cache hit rather than a redundant authority call.
Why It Exists
Without caching, every permission check would require multiple network calls to downstream authorities. At high request volumes, this creates a scalability bottleneck. The caching layer:
- Reduces latency: Cache hits return immediately instead of requiring authority calls.
- Protects authorities: Acts as a burst breaker, absorbing repetitive checks for the same user and resource combination.
- Supports negative caching: Prevents repeated lookups for users without organization assignments.
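Negative caching for organization lookups can be sketched as follows. This is a minimal illustration, not the gateway's implementation: the names (`org_cache`, `fetch_org_context`, `NEGATIVE_TTL`) and the dict-backed store are assumptions standing in for the real distributed cache and authority client.

```python
import time

NEGATIVE_TTL = 60  # seconds; illustrative, often shorter than positive-entry TTLs

org_cache: dict[str, tuple[object, float]] = {}  # subject -> (result, expiry)
calls = {"n": 0}  # counts downstream lookups, for demonstration only

def fetch_org_context(subject: str):
    # Stand-in for the real downstream lookup; returns None for users
    # without an organization assignment.
    calls["n"] += 1
    return None

def get_org_context(subject: str):
    entry = org_cache.get(subject)
    if entry is not None and entry[1] > time.monotonic():
        return entry[0]  # hit, including a cached "no organization" (None)
    result = fetch_org_context(subject)
    # Cache the result even when it is None, so repeated lookups for
    # unassigned users do not keep hitting the authority.
    org_cache[subject] = (result, time.monotonic() + NEGATIVE_TTL)
    return result
```

With this shape, a second lookup for the same unassigned user within the TTL returns the cached `None` and triggers no downstream call.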
Where It Fits in Keymate
Caching is an integral part of the enforcement pipeline. Valid requests are checked against the cache before calling the authorization authority. A cache hit returns the previous decision immediately — no authority call is needed.
Boundaries
What the cache covers:
- Permission decisions (GRANT only)
- Exchanged tokens (RFC 8693)
- Organization context (including negative results)
- DPoP replay protection
What the cache does not cover:
- Policy definitions or rule configurations
- Session data or user profiles
How It Works
The cache follows three key principles:
- GRANT only — only GRANT decisions are cached. DENY results are never cached, ensuring that permission changes (granting new access) take effect immediately without waiting for cache expiration.
- Subject-based keys — cache keys are derived from the user's identity rather than the token itself. This means a token refresh for the same user produces a cache hit, eliminating redundant authority calls.
- Non-blocking — cache read/write failures never block the request. A cache failure gracefully falls through to the authority call. The exception is DPoP replay protection, which is fail-closed for security.
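The three principles can be sketched together. Everything here is illustrative: the key derivation, the `DecisionCache` class, and the `"GRANT"`/`"DENY"` string decisions are assumptions, not the gateway's actual code.

```python
import hashlib

def cache_key(subject: str, client: str, resource: str, permission: str) -> str:
    # Subject-based key: derived from the user's identity, never from the
    # token, so a token refresh still maps to the same entry.
    raw = f"{subject}|{client}|{resource}|{permission}"
    return hashlib.sha256(raw.encode()).hexdigest()

class DecisionCache:
    def __init__(self, backend=None):
        self.backend = backend if backend is not None else {}

    def get(self, key: str):
        try:
            return self.backend.get(key)
        except Exception:
            return None  # non-blocking: a read failure is treated as a miss

    def put(self, key: str, decision: str) -> None:
        if decision != "GRANT":
            return  # GRANT only: DENY results are never written
        try:
            self.backend[key] = decision
        except Exception:
            pass  # non-blocking: a write failure is silently ignored

def check_permission(cache, authority, subject, client, resource, permission):
    key = cache_key(subject, client, resource, permission)
    cached = cache.get(key)
    if cached is not None:
        return cached  # cache hit: no authority call
    decision = authority(subject, client, resource, permission)
    cache.put(key, decision)
    return decision
```

Because `put` refuses anything but GRANT, a revoked-then-regranted permission never waits out a stale DENY entry, and because the cache wraps its backend calls, a failing store degrades to extra authority traffic rather than failed requests.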
Example Scenario
Scenario
A user makes two consecutive permission checks for the same resource. The first triggers a full authority call; the second returns from cache.
Input
- Request 1: User user-123, client demo-spa, resource employee-data:can-view
- Request 2: Same parameters, 3 seconds later (with a refreshed token)
Expected Outcome
- Request 1: Cache miss — authority call → GRANT → cached
- Request 2: Cache hit — cached GRANT returned immediately (no authority call)
- Why: Both requests produce the same cache key because the user's identity is identical, even though the token changed
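The scenario in miniature: the token is deliberately absent from the key derivation, so a refreshed token cannot change the key. The key shape shown here is an assumption, not the gateway's actual format.

```python
import hashlib

def cache_key(subject: str, client: str, resource: str) -> str:
    # Only identity fields feed the key; the bearer token does not.
    return hashlib.sha256(f"{subject}|{client}|{resource}".encode()).hexdigest()

# Request 1 and Request 2 carry different tokens, but both map to the
# same cache entry because the identity fields are identical.
key_request_1 = cache_key("user-123", "demo-spa", "employee-data:can-view")
key_request_2 = cache_key("user-123", "demo-spa", "employee-data:can-view")

# A different subject, by contrast, gets its own entry.
key_other_user = cache_key("user-456", "demo-spa", "employee-data:can-view")
```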
Common Misunderstandings
- "A cache hit skips validation." — No. The cache is consulted only after the request has been fully validated. A cached result only skips the authority call.
- "DENY results are cached too." — No. Only GRANT results are cached, ensuring that newly granted permissions take effect immediately.
- "A cache failure means the request fails." — Cache failures are non-blocking (graceful fallthrough). The exception is DPoP replay protection, which is fail-closed for security.
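The fail-open versus fail-closed distinction can be made concrete. This is a sketch under assumptions: `CacheUnavailable`, the callable parameters, and the replay-store shape are illustrative, not Keymate APIs.

```python
class CacheUnavailable(Exception):
    """Raised when the distributed cache cannot be reached (illustrative)."""

def check_decision(cache_read, authority_call):
    # Decision cache: fail-open. A cache error falls through to the
    # authority instead of failing the request.
    try:
        cached = cache_read()
        if cached is not None:
            return cached
    except CacheUnavailable:
        pass
    return authority_call()

def check_dpop_replay(store_add_if_absent, jti: str) -> bool:
    # DPoP replay protection: fail-closed. If the replay store cannot be
    # reached, the proof is rejected rather than risking a replayed jti.
    try:
        first_use = store_add_if_absent(jti)
    except CacheUnavailable:
        return False  # reject when the store is unavailable
    return first_use
```

The asymmetry is intentional: a missed decision cache costs one extra authority call, while a missed replay check could let the same DPoP proof be used twice.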
Design Notes / Best Practices
- Monitor cache hit ratio through observability tooling. A consistently low hit ratio may indicate that the cache TTL is too short for your workload.
- Distributed cache health directly impacts DPoP security. Ensure monitoring and alerting are in place for the cache infrastructure.
- If permission changes need to take effect faster than the cache TTL allows, consider reducing the TTL. This trades higher authority call volume for fresher decisions.