
General questions

This starter provides method-level rate limiting inside your Spring Boot services, while API gateways provide edge-level rate limiting at the network boundary.
Use cases for this starter:
  • Protect expensive internal operations (database queries, external API calls)
  • Enforce business-specific limits (per-tenant, per-feature)
  • Rate limit based on business context (user role, subscription tier)
  • Protect service-to-service calls in microservices
Use cases for API gateway:
  • Protect against DDoS attacks
  • Enforce global API quotas
  • Rate limit based on IP address or API key
  • Protect all backend services uniformly
Best practice: use both together. The gateway handles edge protection; this starter handles fine-grained business-logic protection.
Requirements:
  • Java 17 or higher
  • Spring Boot 3.x
  • Redis 2.6+ (tested with Redis 7)
Not supported:
  • Spring Boot 2.x (use Spring Boot 3.x)
  • Java 11 or earlier (upgrade to Java 17+)
Currently, the starter is designed for Spring MVC (servlet-based) applications. The HTTP 429 exception handler specifically targets servlet environments. For WebFlux applications, you would need to:
  • Use the core rate limiting functionality
  • Implement custom exception handling for reactive contexts
  • Consider contributing WebFlux support to the project
No, Redis is required. The starter uses Redis as the backing store for rate limit counters.
Why Redis:
  • Fast atomic operations (INCR)
  • Built-in TTL support for automatic key expiration
  • Shared state across multiple application instances
  • Production-proven for rate limiting use cases
Alternatives:
  • For in-memory rate limiting (single instance), consider Bucket4j or Guava RateLimiter
  • For other distributed backends, you would need to provide a custom RateLimiter implementation
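As a rough illustration of what such a custom or in-memory backend has to provide, here is a minimal fixed-window counter in plain Java. It mirrors the Redis INCR-plus-TTL pattern in memory and is a sketch only; the starter's actual RateLimiter interface may differ, and this is suitable only for single-instance deployments.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Minimal fixed-window counter, mirroring the Redis INCR + TTL pattern
// in memory. Illustrative only; not shared across instances.
public class InMemoryFixedWindowLimiter {

  private record Window(long windowStart, AtomicLong count) {}

  private final Map<String, Window> windows = new ConcurrentHashMap<>();
  private final long limit;
  private final long windowMillis;

  public InMemoryFixedWindowLimiter(long limit, long windowMillis) {
    this.limit = limit;
    this.windowMillis = windowMillis;
  }

  public boolean tryAcquire(String key, long nowMillis) {
    long windowStart = nowMillis - (nowMillis % windowMillis);
    // Atomically replace the window when a new window begins
    Window w = windows.compute(key, (k, old) ->
        (old == null || old.windowStart() != windowStart)
            ? new Window(windowStart, new AtomicLong())
            : old);
    return w.count().incrementAndGet() <= limit;
  }
}
```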

Configuration questions

The behavior depends on the fail-open configuration.
Fail-closed (default):
ratelimiter.fail-open=false
  • Throws RateLimiterBackendException
  • Request is rejected
  • Safe for critical rate limiting
Fail-open:
ratelimiter.fail-open=true
  • Logs error but allows request
  • Useful for non-critical rate limiting
  • Prevents Redis outage from breaking your service
Recommendation: Use fail-closed for production unless you have strong reasons to fail-open.
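The decision on a backend error amounts to the following sketch; the method shape and exception here are illustrative assumptions, not the starter's exact API:

```java
// Sketch of fail-open vs. fail-closed handling around a backend check.
// The check interface and exception names are illustrative only.
public class FailOpenSketch {

  @FunctionalInterface
  public interface RedisCheck {
    boolean allowed();
  }

  public static boolean evaluate(boolean failOpen, RedisCheck check) {
    try {
      return check.allowed();
    } catch (RuntimeException backendError) {
      if (failOpen) {
        // Fail-open: log and let the request through (availability first)
        return true;
      }
      // Fail-closed: surface the error so the request is rejected
      throw new IllegalStateException("rate limiter backend unavailable", backendError);
    }
  }
}
```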
Set the master toggle in your configuration:
ratelimiter.enabled=false
This disables the entire auto-configuration. No rate limiting will be enforced, even if methods have @RateLimit annotations.
Alternative: use Spring profiles to disable in specific environments:
# application-dev.yml
ratelimiter:
  enabled: false

# application-prod.yml
ratelimiter:
  enabled: true
Yes, use the enabled attribute on the annotation:
@RateLimit(
    name = "test-operation",
    scope = "GLOBAL",
    limit = 10,
    duration = 1,
    timeUnit = TimeUnit.MINUTES,
    enabled = false  // Disables this specific rate limit
)
public void someMethod() {
  // Not rate limited
}
This is useful for:
  • Temporarily disabling limits during testing
  • A/B testing rate limit configurations
  • Gradual rollout of rate limiting
When ratelimiter.include-http-headers=true (default), the following headers are added:
  • Retry-After: Seconds until the rate limit window resets
  • RateLimit-Limit: Maximum requests allowed in the window
  • RateLimit-Remaining: Requests remaining (0 when blocked)
  • RateLimit-Reset: Unix timestamp when the window resets
Example response:
HTTP/1.1 429 Too Many Requests
Retry-After: 45
RateLimit-Limit: 10
RateLimit-Remaining: 0
RateLimit-Reset: 1709473245
Content-Type: application/problem+json

{
  "type": "about:blank",
  "title": "Rate limit exceeded",
  "status": 429,
  "detail": "Rate limit exceeded: invoice-create"
}
To disable headers:
ratelimiter.include-http-headers=false
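The header values are related: Retry-After is the RateLimit-Reset timestamp minus the current time. A small sketch of that arithmetic (the method name is illustrative):

```java
// Retry-After carries the seconds remaining until the window resets;
// RateLimit-Reset carries the reset time as a Unix timestamp (seconds).
public class RetryAfterSketch {

  public static long retryAfterSeconds(long resetEpochSeconds, long nowEpochSeconds) {
    // Clamp at zero in case the window has already reset
    return Math.max(0, resetEpochSeconds - nowEpochSeconds);
  }
}
```

With the example response above, a reset timestamp of 1709473245 and a current time of 1709473200 yield the Retry-After value of 45.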

Implementation questions

Keys are resolved by RateLimitKeyResolver implementations.
Default resolver (DefaultRateLimitKeyResolver):
  • If key is set: scope:key
  • If key is not set: scope:fully.qualified.ClassName#methodName
Examples:
// key = "global:com.example.BillingService#createInvoice"
@RateLimit(scope = "GLOBAL", limit = 10, duration = 1, timeUnit = TimeUnit.MINUTES)
public void createInvoice() { }

// key = "user:invoice-api"
@RateLimit(scope = "USER", key = "invoice-api", limit = 10, duration = 1, timeUnit = TimeUnit.MINUTES)
public void createInvoice() { }
Custom resolver:
@Component
public class UserIdKeyResolver implements RateLimitKeyResolver {
  @Override
  public String resolveKey(RateLimitContext context) {
    String userId = (String) context.getArguments()[0];
    return "user:" + userId + ":" + context.getMethod().getName();
  }
}
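For reference, the default resolution rules above amount to something like the following sketch; the lowercased scope prefix is inferred from the examples, not from the starter's source code:

```java
// Sketch of the default key derivation: scope:key when a key is given,
// otherwise scope:fully.qualified.ClassName#methodName. The lowercasing
// of the scope prefix is an inference from the documented examples.
public class DefaultKeySketch {

  public static String resolveKey(String scope, String key,
                                  String className, String methodName) {
    String prefix = scope.toLowerCase();
    if (key != null && !key.isEmpty()) {
      return prefix + ":" + key;
    }
    return prefix + ":" + className + "#" + methodName;
  }
}
```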
Not directly with a single annotation, but you can layer limits using wrapper methods. Note that the inner call must go through the Spring proxy; a plain this-call would bypass the rate-limiting advice, so inject the bean into itself:
@Service
public class ApiService {

  // Self-injected proxy so the inner call is intercepted;
  // calling apiCallHourly() directly would skip its @RateLimit
  private final ApiService self;

  public ApiService(@Lazy ApiService self) {
    this.self = self;
  }

  // Public method with per-minute limit
  @RateLimit(
      name = "api-per-minute",
      scope = "USER",
      keyResolver = UserIdKeyResolver.class,
      limit = 10,
      duration = 1,
      timeUnit = TimeUnit.MINUTES
  )
  public String apiCall(String userId, String request) {
    return self.apiCallHourly(userId, request);
  }

  // Internal method with per-hour limit
  @RateLimit(
      name = "api-per-hour",
      scope = "USER",
      keyResolver = UserIdKeyResolver.class,
      limit = 100,
      duration = 1,
      timeUnit = TimeUnit.HOURS
  )
  public String apiCallHourly(String userId, String request) {
    return processRequest(request);
  }

  private String processRequest(String request) {
    // Actual logic
    return "result";
  }
}
Both limits must be satisfied for the request to proceed.
Create a key resolver that extracts the user from Spring Security context:
import io.github.v4runsharma.ratelimiter.core.RateLimitContext;
import io.github.v4runsharma.ratelimiter.key.RateLimitKeyResolver;
import org.springframework.security.authentication.AnonymousAuthenticationToken;
import org.springframework.security.core.Authentication;
import org.springframework.security.core.context.SecurityContextHolder;
import org.springframework.stereotype.Component;

@Component
public class SecurityUserKeyResolver implements RateLimitKeyResolver {
  @Override
  public String resolveKey(RateLimitContext context) {
    Authentication auth = SecurityContextHolder.getContext().getAuthentication();
    // Anonymous users carry an AnonymousAuthenticationToken that still
    // reports isAuthenticated() == true, so exclude it explicitly
    if (auth != null && auth.isAuthenticated()
        && !(auth instanceof AnonymousAuthenticationToken)) {
      return "user:" + auth.getName() + ":" + context.getMethod().getName();
    }
    return "anonymous:" + context.getMethod().getName();
  }
}
Apply in your service:
@RateLimit(
    name = "user-operation",
    scope = "USER",
    keyResolver = SecurityUserKeyResolver.class,
    limit = 50,
    duration = 1,
    timeUnit = TimeUnit.HOURS
)
public void securedOperation() {
  // Rate limited per authenticated user
}
Yes, you can override the default exception handler:
import io.github.v4runsharma.ratelimiter.exception.RateLimitExceededException;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.ExceptionHandler;
import org.springframework.web.bind.annotation.RestControllerAdvice;

@RestControllerAdvice
public class CustomRateLimitExceptionHandler {

  @ExceptionHandler(RateLimitExceededException.class)
  public ResponseEntity<CustomErrorResponse> handleRateLimit(
      RateLimitExceededException ex) {
    
    CustomErrorResponse response = new CustomErrorResponse(
        "RATE_LIMIT_EXCEEDED",
        "Too many requests. Please try again later.",
        ex.getRetryAfterSeconds()
    );

    return ResponseEntity
        .status(HttpStatus.TOO_MANY_REQUESTS)
        .header("Retry-After", String.valueOf(ex.getRetryAfterSeconds()))
        .body(response);
  }
}

record CustomErrorResponse(
    String errorCode,
    String message,
    long retryAfter
) {}

Performance and scalability questions

The overhead is minimal.
Per request:
  • 1-2 Redis commands (INCR, possibly EXPIRE)
  • Typical latency: 1-5ms for local Redis, 10-20ms for remote
  • No Lua scripts, just simple atomic operations
Metrics: Use the built-in ratelimiter.evaluate.latency timer to monitor performance:
ratelimiter.metrics-enabled=true
Optimization tips:
  • Use Redis in the same data center/region
  • Configure connection pooling appropriately
  • Monitor Redis memory and CPU usage
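For connection pooling with Spring Boot 3's default Lettuce driver, pooling is enabled through standard Spring properties and requires commons-pool2 on the classpath; the values below are illustrative, not recommendations:

```properties
# Lettuce connection pool (requires org.apache.commons:commons-pool2)
spring.data.redis.lettuce.pool.enabled=true
spring.data.redis.lettuce.pool.max-active=16
spring.data.redis.lettuce.pool.max-idle=8
spring.data.redis.lettuce.pool.min-idle=2
```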
The throughput is primarily limited by Redis performance.
Redis single instance:
  • 50,000-100,000+ ops/sec typical
  • INCR operations are very fast (O(1))
Bottlenecks:
  • Network latency to Redis
  • Redis server resources (CPU, memory, network)
  • Application server resources
Scaling strategies:
  • Use Redis Cluster for horizontal scaling
  • Use Redis Sentinel for high availability
  • Run rate limiting on a dedicated Redis instance, separate from cache workloads
  • Use multiple Redis instances with sharding
Yes, that’s the primary use case. Redis provides shared state across all application instances, so rate limits are enforced consistently regardless of which instance handles the request.
Architecture:
[App Instance 1] ──┐
[App Instance 2] ──┼──> [Redis] (shared rate limit state)
[App Instance 3] ──┘
No special configuration is needed; just ensure all instances use the same Redis.
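The consistency guarantee comes from the counter updates being atomic in one shared store. A toy simulation makes the point, with threads standing in for application instances and an AtomicInteger standing in for the Redis counter:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Threads play the role of app instances; the shared AtomicInteger plays
// the role of Redis. Exactly `limit` acquisitions succeed, no matter how
// the attempts are spread across "instances".
public class SharedCounterSimulation {

  public static int allowedCount(int instances, int attemptsEach, int limit)
      throws InterruptedException {
    AtomicInteger counter = new AtomicInteger();   // shared "Redis" counter
    AtomicInteger allowed = new AtomicInteger();
    List<Thread> threads = new ArrayList<>();
    for (int i = 0; i < instances; i++) {
      Thread t = new Thread(() -> {
        for (int j = 0; j < attemptsEach; j++) {
          // incrementAndGet is atomic, so each attempt sees a unique count
          if (counter.incrementAndGet() <= limit) {
            allowed.incrementAndGet();
          }
        }
      });
      threads.add(t);
      t.start();
    }
    for (Thread t : threads) t.join();
    return allowed.get();
  }
}
```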

Monitoring and metrics questions

When Micrometer is on the classpath and ratelimiter.metrics-enabled=true, the following meters are registered.
Counters:
  • ratelimiter.requests - Total requests evaluated
    • Tags: name, scope, outcome (allowed/blocked)
  • ratelimiter.errors - Backend errors (Redis failures)
    • Tags: exception
Timers:
  • ratelimiter.evaluate.latency - Time to evaluate rate limit
    • Tags: name, scope
Example Prometheus query:
# Rate of blocked requests
rate(ratelimiter_requests_total{outcome="blocked"}[5m])

# 95th percentile latency
histogram_quantile(0.95, ratelimiter_evaluate_latency_seconds_bucket)
Key metrics to track:
  1. Block rate:
    rate(ratelimiter_requests_total{outcome="blocked"}[5m])
    / rate(ratelimiter_requests_total[5m])
    
  2. Top blocked operations:
    topk(10, sum by (name) (ratelimiter_requests_total{outcome="blocked"}))
    
  3. Redis errors:
    rate(ratelimiter_errors_total[5m])
    
  4. Latency trends:
    histogram_quantile(0.99, ratelimiter_evaluate_latency_seconds_bucket)
    
Set up alerts for:
  • High block rate (may indicate attack or misconfigured limits)
  • Redis connection errors
  • High latency (Redis performance issues)