The tradeoff
Fail-closed
Deny requests when Redis is down
- Pros: Prevents abuse, maintains security
- Cons: Reduces availability, may cause outages
Fail-open
Allow requests when Redis is down
- Pros: Maintains availability, better user experience
- Cons: Temporarily unprotected from abuse
Configuration
You can configure the failure strategy using RateLimiterProperties (config/RateLimiterProperties.java:21-26):
- application.properties
- application.yml
- Java Config
The default behavior is fail-closed (failOpen = false). This is the more conservative choice that protects your backend from abuse.
How it works
The RedisRateLimiter wraps all Redis operations in a try-catch block (redis/RedisRateLimiter.java:59-78):
- Normal operation: Redis operations succeed, and the rate limiter returns a decision based on the current count.
- Redis failure detected: Any RuntimeException from Redis (connection timeout, command failure, etc.) is caught, and the configured strategy (fail-open or fail-closed) determines what happens next.
Fail-open behavior details
When failing open, the limiter returns a special decision object (redis/RedisRateLimiter.java:70-75):
- Allowed: always true, so the request is allowed through.
- Remaining time: set to REMAINING_TIME_UNKNOWN (-1) to indicate the rate limiter couldn’t determine the actual remaining time.
- Retry-after: null, since no retry is needed when the request was allowed.
- Reset time: an estimate based on the policy’s window duration (may not be accurate, since Redis could not be contacted).
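The try-catch flow and the fail-open decision described above can be sketched roughly as follows. This is a minimal, dependency-free illustration, not the library's code: `WindowState`, `LimiterDecision`, `BackendFailure`, `FailureStrategySketch`, the record field names, and the use of milliseconds are all assumptions made for the example. Only `REMAINING_TIME_UNKNOWN` (-1) and the four described values come from this page.

```java
import java.util.function.Supplier;

/** Hypothetical view of what the Redis round trip reports for the current window. */
record WindowState(long count, long ttlMillis) {}

/** Hypothetical decision shape; field names and units are illustrative only. */
record LimiterDecision(boolean allowed, long remainingTime, Long retryAfterMillis, long resetAtMillis) {}

/** Local stand-in for the library's RateLimiterBackendException. */
class BackendFailure extends RuntimeException {
    BackendFailure(String message, Throwable cause) {
        super(message, cause);
    }
}

final class FailureStrategySketch {
    static final long REMAINING_TIME_UNKNOWN = -1L;

    private final Supplier<WindowState> redisCall; // stands in for the Redis call; may throw
    private final long limit;
    private final long windowMillis;
    private final boolean failOpen;

    FailureStrategySketch(Supplier<WindowState> redisCall, long limit, long windowMillis, boolean failOpen) {
        this.redisCall = redisCall;
        this.limit = limit;
        this.windowMillis = windowMillis;
        this.failOpen = failOpen;
    }

    LimiterDecision check(long nowMillis) {
        try {
            // Normal operation: decide from the count reported by the backend.
            WindowState state = redisCall.get();
            boolean allowed = state.count() <= limit;
            Long retryAfter = allowed ? null : Long.valueOf(state.ttlMillis());
            return new LimiterDecision(allowed, state.ttlMillis(), retryAfter, nowMillis + state.ttlMillis());
        } catch (RuntimeException e) {
            if (failOpen) {
                // Fail-open: allow the request, mark remaining time as unknown,
                // and estimate the reset time from the window duration.
                return new LimiterDecision(true, REMAINING_TIME_UNKNOWN, null, nowMillis + windowMillis);
            }
            // Fail-closed: surface the backend failure to the caller.
            throw new BackendFailure("Redis unavailable, failing closed", e);
        }
    }
}
```

With `failOpen = true` and a backend that throws, `check` returns an allowed decision whose remaining time is -1. With `failOpen = false`, the same failure propagates as an exception.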
Fail-closed behavior details
When failing closed, the limiter throws a RateLimiterBackendException (exception/RateLimiterBackendException.java), which can be handled by:
- Spring’s exception handler (exception/RateLimitExceptionHandler.java)—returns HTTP 503 Service Unavailable
- Your custom error handler—implement custom logic
- Circuit breaker—detect repeated failures and short-circuit
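As a rough illustration of the second and third options, the sketch below catches the exception in a custom handler and adds a very small consecutive-failure circuit breaker. It is self-contained and hypothetical: the `RateLimiterBackendException` class here is a local stand-in so the example compiles without the library, and `HttpResult`, `FailClosedHandlerSketch`, and the threshold parameter are invented for the example. A production breaker would also need a cool-down before retrying and thread-safe state.

```java
import java.util.function.Supplier;

/** Local stand-in for the library's exception, so the sketch compiles on its own. */
class RateLimiterBackendException extends RuntimeException {
    RateLimiterBackendException(String message) {
        super(message);
    }
}

/** Hypothetical response shape used only in this example. */
record HttpResult(int status, String body) {}

final class FailClosedHandlerSketch {
    private final int failureThreshold;
    private int consecutiveFailures = 0; // not thread-safe; kept simple for illustration

    FailClosedHandlerSketch(int failureThreshold) {
        this.failureThreshold = failureThreshold;
    }

    HttpResult handle(Supplier<String> rateLimitedCall) {
        if (consecutiveFailures >= failureThreshold) {
            // Circuit open: skip the limiter (and Redis) entirely.
            return new HttpResult(503, "circuit open");
        }
        try {
            String body = rateLimitedCall.get();
            consecutiveFailures = 0; // any success closes the circuit again
            return new HttpResult(200, body);
        } catch (RateLimiterBackendException e) {
            consecutiveFailures++;
            return new HttpResult(503, "rate limiter backend unavailable");
        }
    }
}
```

Once the threshold is reached, later calls return 503 without invoking the protected call at all, which is what keeps repeated Redis errors from cascading.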
RateLimiterBackendException is a RuntimeException, so it will cause the method call to fail unless caught.
Decision tree: Which strategy to use?
Use case recommendations
Public API with abuse prevention
Recommendation: Fail-closed
If your rate limiter is primarily for preventing abuse (DDoS, scraping, brute-force attacks), failing closed ensures that a Redis outage doesn’t leave you vulnerable.
Mitigation: Run Redis in a highly available configuration (Redis Sentinel, Redis Cluster, or a managed service like ElastiCache/Azure Redis).
Internal API with fair-use limits
Recommendation: Fail-open
If your rate limiter is for fair-use enforcement among internal clients (not security-critical), failing open maintains availability.
Mitigation: Monitor Redis health and set up alerts for fail-open events.
Paid API with billing tiers
Recommendation: Fail-closed
If rate limits are tied to billing/quotas, failing open could result in over-consumption without payment.
Mitigation: Use multiple Redis instances, implement circuit breakers, and provide a status page.
High-availability requirement
Recommendation: Fail-open + Multi-region Redis
If your SLA requires 99.99% uptime, failing closed on Redis outages is not acceptable.
Mitigation: Use geo-replicated Redis, implement fallback rate limiting (in-memory), and monitor closely.
Best practices
Monitor fail-open events
Track when remainingTime == -1 in your metrics to detect Redis issues early.
Use high-availability Redis
Redis Sentinel, Redis Cluster, or managed services provide automatic failover.
Implement circuit breakers
Prevent cascading failures by short-circuiting after repeated Redis errors.
Set up alerts
Alert on-call engineers when the rate limiter throws RateLimiterBackendException.
Monitoring and observability
Detecting fail-open events
If you’re using Micrometer metrics (enabled by default), you can detect fail-open events, for example by watching failopen.count increase.
Alerting on fail-closed events
When failing closed, the RateLimiterBackendException should trigger alerts.
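Both signals discussed in this section, fail-open decisions (remaining time of -1) and fail-closed exceptions, can be counted at the call site. The sketch below is an assumption-laden illustration rather than the library's metrics integration: `FailureSignalsSketch` and its field names are hypothetical, a plain `LongAdder` stands in for a Micrometer counter to keep the example dependency-free, and any `RuntimeException` stands in for `RateLimiterBackendException`.

```java
import java.util.concurrent.atomic.LongAdder;
import java.util.function.LongSupplier;

final class FailureSignalsSketch {
    static final long REMAINING_TIME_UNKNOWN = -1L;

    // Stand-ins for metric counters; an alerting rule would watch these grow.
    final LongAdder failOpenCount = new LongAdder();
    final LongAdder failClosedCount = new LongAdder();

    /**
     * Runs a limiter check that returns the decision's remaining time,
     * recording a fail-open or fail-closed signal when one is observed.
     */
    long observe(LongSupplier limiterCheck) {
        try {
            long remainingTime = limiterCheck.getAsLong();
            if (remainingTime == REMAINING_TIME_UNKNOWN) {
                failOpenCount.increment(); // the limiter answered without reaching Redis
            }
            return remainingTime;
        } catch (RuntimeException e) {
            failClosedCount.increment(); // the limiter refused to answer
            throw e;                     // keep the original failure visible to the caller
        }
    }
}
```

Counting at the call site keeps the alerting logic independent of how the limiter itself reports metrics. The exception is rethrown so that existing error handling still runs.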
Custom error handler example
You can implement custom logic when Redis fails, for example by catching RateLimiterBackendException in your own handler.
Advanced: Hybrid approach
For maximum resilience, you can implement a hybrid approach:
- Primary: Redis rate limiter
- Fallback: In-memory rate limiter (e.g., Guava Cache, Caffeine)
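One way to picture the primary/fallback arrangement is the sketch below. It is a hypothetical illustration, not part of the library: `HybridLimiterSketch` and its parameters are invented, a `Predicate<String>` stands in for the Redis-backed limiter, and a `ConcurrentHashMap` holding a simple fixed window replaces Guava or Caffeine so that the example runs without extra dependencies.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

final class HybridLimiterSketch {
    private final Predicate<String> primary; // stands in for the Redis limiter; throws when Redis is down
    private final int fallbackLimit;
    private final long windowMillis;
    // key -> {windowStartMillis, countInWindow}; a fresh array is stored on every update
    private final ConcurrentHashMap<String, long[]> windows = new ConcurrentHashMap<>();

    HybridLimiterSketch(Predicate<String> primary, int fallbackLimit, long windowMillis) {
        this.primary = primary;
        this.fallbackLimit = fallbackLimit;
        this.windowMillis = windowMillis;
    }

    boolean tryAcquire(String key, long nowMillis) {
        try {
            return primary.test(key); // healthy path: Redis decides
        } catch (RuntimeException e) {
            return fallbackAllow(key, nowMillis); // degraded path: local fixed window
        }
    }

    private boolean fallbackAllow(String key, long nowMillis) {
        long[] state = windows.compute(key, (k, s) ->
                (s == null || nowMillis - s[0] >= windowMillis)
                        ? new long[] {nowMillis, 1L}       // start a new window
                        : new long[] {s[0], s[1] + 1L});   // count within the current window
        return state[1] <= fallbackLimit;
    }
}
```

The in-memory counts are local to each application instance, so during an outage the effective limit scales with the number of instances. A lower fallback limit may be appropriate for that reason.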
Comparison table
| Aspect | Fail-Open | Fail-Closed |
|---|---|---|
| Availability | High—requests continue flowing | Low—requests blocked during outage |
| Security | Low—temporarily unprotected | High—abuse prevented |
| User experience | Good—no service disruption | Poor—errors during outage |
| Redis dependency | Optional (graceful degradation) | Critical (hard dependency) |
| Suitable for | Internal APIs, fair-use limits | Public APIs, abuse prevention, billing |
| Monitoring needs | High—must detect fail-open events | Medium—fail-closed is obvious |
Testing failure scenarios
You should test your application’s behavior when Redis fails:
- Test 1: Redis connection timeout
- Test 2: Redis server crash
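Expected behavior
Fail-open
- Requests should succeed (HTTP 200)
- Response headers should not include rate limit info
- Logs should show warnings about Redis unavailability
- Metrics should show failopen.count increasing
Fail-closed
- Requests should fail with HTTP 503 Service Unavailable when the RateLimiterBackendException reaches Spring’s exception handler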
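For unit-level coverage of these scenarios, a test double that can be switched into a failure mode mid-test may be enough. The following is a hedged sketch: `FlakyRedisStub`, its modes, and `ExpectedBehaviorSketch.statusFor` are hypothetical helpers invented for this example. The 200 and 503 outcomes mirror the expectations above, and 429 for an exceeded limit is an assumption based on the standard HTTP status for rate limiting.

```java
import java.util.function.LongSupplier;

/** Test double for the Redis counter; can simulate a timeout or a crashed server. */
final class FlakyRedisStub implements LongSupplier {
    enum Mode { HEALTHY, TIMEOUT, CRASHED }

    private Mode mode = Mode.HEALTHY;
    private long count = 0;

    void setMode(Mode mode) {
        this.mode = mode;
    }

    @Override
    public long getAsLong() {
        switch (mode) {
            case TIMEOUT:
                throw new IllegalStateException("simulated: Redis command timed out");
            case CRASHED:
                throw new IllegalStateException("simulated: Redis connection refused");
            default:
                return ++count; // healthy: behave like an incrementing counter
        }
    }
}

final class ExpectedBehaviorSketch {
    /** Maps one request to the HTTP status expected under the given failure strategy. */
    static int statusFor(LongSupplier redis, long limit, boolean failOpen) {
        try {
            return redis.getAsLong() <= limit ? 200 : 429;
        } catch (RuntimeException e) {
            return failOpen ? 200 : 503;
        }
    }
}
```

A test can then flip the stub between modes and check that fail-open keeps returning 200 while fail-closed returns 503, without needing a real Redis instance.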
Next steps
- Overview: Learn about the overall architecture
- Rate limiting algorithm: Understand the fixed-window algorithm
- Key resolution: Customize how rate limit keys are generated