Introduction
API Gateway circuit breakers protect backend services from cascading failures. When the circuit opens, the gateway fails every request immediately without contacting the backend, so the service stays unavailable until the circuit closes again.
Symptoms
Circuit open error:
{
"error": "CircuitBreakerOpen",
"message": "Circuit breaker is open for service 'payment-service'",
"timestamp": "2024-04-15T10:00:00Z"
}
API Gateway response:
```bash
$ curl -i https://api.example.com/payments
HTTP/1.1 503 Service Unavailable

{ "error": "Service temporarily unavailable", "reason": "circuit_breaker_open" }
```
Monitoring alert:
Circuit breaker 'payment-service' is OPEN
Failure rate: 75% (threshold: 50%)
Last failure: Connection refused to 10.0.0.5:8080
Common Causes
1. Backend service down: the target service is not running or is unreachable
2. High error rate: the failure rate exceeds the configured threshold
3. Timeout issues: slow responses trigger timeouts that count as failures
4. Network connectivity: a network partition between the gateway and the backend
5. Resource exhaustion: the backend is overloaded or out of resources
6. Configuration error: wrong endpoint or health check URL
7. Inadequate thresholds: circuit breaker settings that are too sensitive
Step-by-Step Fix
Step 1: Check Circuit Breaker Status
```bash
# Check circuit breaker state (varies by gateway)
# Kong:
curl -s http://localhost:8001/plugins | jq '.data[] | select(.name=="circuit-breaker")'

# AWS API Gateway - check CloudWatch metrics:
aws cloudwatch get-metric-statistics \
  --namespace AWS/ApiGateway \
  --metric-name 5XXError \
  --dimensions Name=ApiName,Value=my-api \
  --start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ) \
  --end-time $(date -u +%Y-%m-%dT%H:%M:%SZ) \
  --period 60 \
  --statistics Sum

# Check application logs
kubectl logs -l app=api-gateway | grep -i "circuit"
```
Step 2: Verify Backend Service Health
```bash
# Check if backend is running
curl -I http://backend-service:8080/health

# Check service endpoints
nslookup backend-service
dig backend-service

# Test direct connection
nc -zv backend-service 8080

# Check Kubernetes pods
kubectl get pods -l app=backend-service
kubectl describe pod backend-service-xxx
```
Step 3: Check Backend Logs
```bash
# Check backend application logs
kubectl logs -l app=backend-service --tail=100

# Check for errors (-E enables the | alternation)
kubectl logs -l app=backend-service | grep -iE "error|exception|failed"

# Check resource usage
kubectl top pods -l app=backend-service
kubectl describe pod backend-service-xxx | grep -E -A5 "Limits:|Requests:"
```
Step 4: Check Error Thresholds
```yaml
# Common circuit breaker configurations:

# Kong:
plugins:
  - name: circuit-breaker
    config:
      error_threshold: 50        # 50% error rate triggers open
      volume_threshold: 10       # Minimum requests before evaluating
      half_open_timeout: 30000   # 30 seconds before retry

# Spring Cloud Gateway:
resilience4j:
  circuitbreaker:
    instances:
      payment-service:
        failureRateThreshold: 50
        slowCallDurationThreshold: 2s
        slowCallRateThreshold: 100
        permittedNumberOfCallsInHalfOpenState: 3
        slidingWindowSize: 10

# AWS App Mesh:
circuitBreaker:
  maxConnections: 100
  maxPendingRequests: 100
  maxRequests: 100
  maxRetries: 3
```
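To make the threshold settings more concrete, here is a minimal sketch of how a count-based sliding window could decide when to open the circuit. It borrows the option names `slidingWindowSize`, `minimumNumberOfCalls`, and `failureRateThreshold` from the Resilience4j configuration in this article; the `CountBasedWindow` class itself is hypothetical and is not the code of any real gateway.

```javascript
// Hypothetical illustration of a count-based sliding window.
// It keeps the outcome of the last `slidingWindowSize` calls only.
class CountBasedWindow {
  constructor({ slidingWindowSize, minimumNumberOfCalls, failureRateThreshold }) {
    this.size = slidingWindowSize;
    this.minCalls = minimumNumberOfCalls;
    this.threshold = failureRateThreshold; // percent, e.g. 50
    this.outcomes = []; // true = failed call, false = successful call
  }

  // Record one call outcome and drop the oldest one if the window is full.
  record(failed) {
    this.outcomes.push(Boolean(failed));
    if (this.outcomes.length > this.size) this.outcomes.shift();
  }

  // Failure rate in percent over the calls currently in the window.
  failureRate() {
    if (this.outcomes.length === 0) return 0;
    const failures = this.outcomes.filter(Boolean).length;
    return (failures * 100) / this.outcomes.length;
  }

  // The rate is only evaluated once enough calls have been recorded.
  shouldOpen() {
    if (this.outcomes.length < this.minCalls) return false;
    return this.failureRate() >= this.threshold;
  }
}

const demoWindow = new CountBasedWindow({
  slidingWindowSize: 10,
  minimumNumberOfCalls: 5,
  failureRateThreshold: 50,
});
[true, true, true, false].forEach((failed) => demoWindow.record(failed));
console.log(demoWindow.failureRate(), demoWindow.shouldOpen()); // 75 false (only 4 calls)
demoWindow.record(false);
console.log(demoWindow.failureRate(), demoWindow.shouldOpen()); // 60 true
```

The example shows why a low `minimumNumberOfCalls` makes a breaker sensitive: with only a handful of samples, a few failures are enough to cross the threshold.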
Step 5: Manually Close Circuit (Emergency)
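```bash
# Kong - disable plugin temporarily ("enabled" is a top-level plugin field, not part of config)
curl -X PATCH http://localhost:8001/plugins/{plugin_id} \
  -d "enabled=false"

# Spring Cloud - Resilience4j actuator endpoint (must be exposed; request shape may vary by version)
curl -X POST http://localhost:8080/actuator/circuitbreakers/payment-service \
  -H "Content-Type: application/json" \
  -d '{"updateState":"CLOSE"}'

# Custom circuit breaker reset
curl -X POST http://api-gateway:8080/admin/circuit-breaker/payment-service/reset
```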
Step 6: Fix Underlying Backend Issues
```bash
# If backend crashed, restart it
kubectl rollout restart deployment/backend-service

# If overloaded, scale up
kubectl scale deployment backend-service --replicas=5

# Check database connections (count of established connections in the pod)
kubectl exec backend-service-xxx -- netstat -an | grep ESTABLISHED | wc -l

# Check for memory issues (-E enables the | alternation)
kubectl logs backend-service-xxx | grep -iE "OutOfMemoryError|memory"
```
Step 7: Adjust Timeout Settings
```yaml
# Increase timeout to prevent premature circuit opening

# Kong:
plugins:
  - name: circuit-breaker
    config:
      timeout: 60000   # 60 second timeout

# Nginx:
location /api/ {
  proxy_pass http://backend;
  proxy_connect_timeout 60s;
  proxy_read_timeout 60s;
  proxy_send_timeout 60s;
}

# Envoy:
clusters:
  - name: backend_service
    connect_timeout: 60s
    per_connection_buffer_limit_bytes: 32768
```
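Rather than picking a timeout by guesswork, one possible approach is to derive it from observed backend latency. The sketch below is only a heuristic under stated assumptions: the function names `percentile` and `suggestTimeoutMs`, the choice of the 99th percentile, and the 1.5 headroom factor are all hypothetical and not taken from any gateway's documentation.

```javascript
// Nearest-rank percentile of latency samples given in milliseconds.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error("no latency samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length); // 1-based rank
  return sorted[Math.max(rank, 1) - 1];
}

// Hypothetical heuristic: timeout = p99 latency multiplied by a headroom factor.
function suggestTimeoutMs(samples, headroom = 1.5) {
  return Math.ceil(percentile(samples, 99) * headroom);
}

// Example latencies (made-up numbers for illustration only).
const latenciesMs = [100, 120, 150, 200, 1000];
console.log(suggestTimeoutMs(latenciesMs)); // 1500
```

A timeout sized this way stays above normal slow responses, so only genuinely stuck calls count as failures toward the circuit breaker.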
Step 8: Implement Graceful Degradation
// Return cached or fallback response when circuit open
// CircuitOpenError is assumed to be the error class thrown by the client
async function callPaymentService(order) {
  try {
    return await paymentService.process(order);
  } catch (err) {
    // Only handle the open-circuit case; rethrow anything else
    if (!(err instanceof CircuitOpenError)) throw err;
    // Fallback: queue for later processing
    await queueService.enqueue('payments', order);
    return {
      status: 'queued',
      message: 'Payment will be processed shortly'
    };
  }
}
Step 9: Configure Proper Recovery
```yaml
# Half-open state configuration for safe recovery
circuitBreaker:
  failureRateThreshold: 50
  slowCallDurationThreshold: 2s
  permittedNumberOfCallsInHalfOpenState: 5
  slidingWindowType: COUNT_BASED
  slidingWindowSize: 10
  minimumNumberOfCalls: 5
  waitDurationInOpenState: 30s   # Wait before trying half-open

# Automatic recovery monitoring
management:
  endpoints:
    web:
      exposure:
        include: circuitbreakers,health
  endpoint:
    health:
      show-details: always
```
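The recovery settings can be read as follows: after `waitDurationInOpenState` has elapsed, the breaker lets through at most `permittedNumberOfCallsInHalfOpenState` trial calls and then decides whether to close or reopen. The sketch below illustrates that flow under stated assumptions; `RecoveringBreaker` is a hypothetical class with an injectable clock, not the Resilience4j implementation, and it starts from an already open circuit for brevity.

```javascript
// Hypothetical sketch of half-open recovery with an injectable clock.
class RecoveringBreaker {
  constructor(
    { waitDurationInOpenStateMs, permittedNumberOfCallsInHalfOpenState, failureRateThreshold },
    now = Date.now
  ) {
    this.waitMs = waitDurationInOpenStateMs;
    this.permitted = permittedNumberOfCallsInHalfOpenState;
    this.threshold = failureRateThreshold; // percent
    this.now = now;
    this.state = "OPEN"; // the sketch starts with the circuit already open
    this.openedAt = now();
    this.trialsStarted = 0;
    this.trialResults = [];
  }

  // Returns true if a request may be sent to the backend right now.
  allowRequest() {
    if (this.state === "OPEN" && this.now() - this.openedAt >= this.waitMs) {
      this.state = "HALF_OPEN";
      this.trialsStarted = 0;
      this.trialResults = [];
    }
    if (this.state === "CLOSED") return true;
    if (this.state === "HALF_OPEN" && this.trialsStarted < this.permitted) {
      this.trialsStarted += 1;
      return true;
    }
    return false; // OPEN, or HALF_OPEN with all trial slots used
  }

  // Record the outcome of a trial call made while half-open.
  recordTrial(failed) {
    if (this.state !== "HALF_OPEN") return;
    this.trialResults.push(Boolean(failed));
    if (this.trialResults.length < this.permitted) return;
    const failures = this.trialResults.filter(Boolean).length;
    const rate = (failures * 100) / this.trialResults.length;
    if (rate >= this.threshold) {
      this.state = "OPEN"; // backend still unhealthy: wait again
      this.openedAt = this.now();
    } else {
      this.state = "CLOSED"; // backend recovered
    }
  }
}
```

Passing a fake clock function as the second constructor argument makes the wait period easy to exercise in a test without sleeping.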
Step 10: Set Up Monitoring and Alerts
```yaml
# Prometheus metrics for circuit breaker
# application.yml
management:
  metrics:
    export:
      prometheus:
        enabled: true
  endpoints:
    web:
      exposure:
        include: prometheus,health

# Prometheus alert rule
groups:
  - name: circuit_breaker
    rules:
      - alert: CircuitBreakerOpen
        expr: resilience4j_circuitbreaker_state{state="open"} == 1
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Circuit breaker {{ $labels.name }} is open"
```
```bash
# Grafana dashboard queries (PromQL)
resilience4j_circuitbreaker_state{state="open"}
resilience4j_circuitbreaker_calls_seconds_count{kind="failed"}

# Check circuit breaker metrics
curl http://localhost:8080/actuator/prometheus | grep circuitbreaker
```
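For a quick scripted check outside Prometheus, the scraped metrics text can be scanned for breakers whose `open` gauge is 1. This is a rough sketch: `findOpenBreakers` is a hypothetical helper, and the sample lines assume the usual Prometheus text format with `name` and `state` labels, which may differ in your setup.

```javascript
// Returns the names of circuit breakers whose state="open" gauge equals 1.
function findOpenBreakers(metricsText) {
  const open = [];
  for (const line of metricsText.split(/\r?\n/)) {
    if (!line.startsWith("resilience4j_circuitbreaker_state{")) continue;
    const labels = line.slice(line.indexOf("{") + 1, line.lastIndexOf("}"));
    // The value is the first token after the closing brace (a timestamp may follow).
    const value = Number(line.slice(line.lastIndexOf("}") + 1).trim().split(/\s+/)[0]);
    const name = /(?:^|,)name="([^"]*)"/.exec(labels);
    const state = /(?:^|,)state="([^"]*)"/.exec(labels);
    if (name && state && state[1] === "open" && value === 1) open.push(name[1]);
  }
  return open;
}

// Assumed sample of scraped output, for illustration only.
const sampleMetrics = [
  "# TYPE resilience4j_circuitbreaker_state gauge",
  'resilience4j_circuitbreaker_state{name="payment-service",state="open",} 1.0',
  'resilience4j_circuitbreaker_state{name="payment-service",state="closed",} 0.0',
  'resilience4j_circuitbreaker_state{name="inventory-service",state="open",} 0.0',
].join("\n");
console.log(findOpenBreakers(sampleMetrics)); // [ 'payment-service' ]
```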
Circuit Breaker States
| State | Behavior | Recovery |
|---|---|---|
| Closed | Requests pass through | Normal operation |
| Open | Requests fail immediately | Wait for timeout |
| Half-Open | Limited test requests | Success → Closed, Fail → Open |
| Forced-Open | Manual open | Manual reset required |
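The table can also be read as a small state machine. The sketch below encodes it as a lookup table; the event names (`failureThresholdExceeded`, `waitElapsed`, `trialsSucceeded`, `trialsFailed`, `forceOpen`, `manualReset`) are made up for illustration, and returning to Closed after a manual reset is an assumption, since the table only says that a manual reset is required.

```javascript
// Hypothetical transition table derived from the states listed above.
const TRANSITIONS = {
  CLOSED: { failureThresholdExceeded: "OPEN", forceOpen: "FORCED_OPEN" },
  OPEN: { waitElapsed: "HALF_OPEN", forceOpen: "FORCED_OPEN" },
  HALF_OPEN: { trialsSucceeded: "CLOSED", trialsFailed: "OPEN", forceOpen: "FORCED_OPEN" },
  FORCED_OPEN: { manualReset: "CLOSED" }, // assumption: reset returns to CLOSED
};

const hasOwn = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Returns the state that follows `state` when `event` occurs.
// Events that do not apply to the current state leave it unchanged.
function nextState(state, event) {
  if (!hasOwn(TRANSITIONS, state)) throw new Error(`unknown state: ${state}`);
  const row = TRANSITIONS[state];
  return hasOwn(row, event) ? row[event] : state;
}

console.log(nextState("HALF_OPEN", "trialsFailed")); // OPEN
console.log(nextState("FORCED_OPEN", "waitElapsed")); // FORCED_OPEN
```

Note that a forced-open breaker ignores the wait timer in this sketch, which matches the "Manual reset required" entry in the table.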
Verification
```bash
# After fixing backend and adjusting settings
# Check circuit breaker status
curl http://localhost:8080/actuator/circuitbreakers
# Should show:
# {"circuitBreakers":{"payment-service":{"state":"CLOSED"}}}

# Test API endpoint
curl https://api.example.com/payments
# Should return 200 OK

# Monitor failure rate
curl http://localhost:8080/actuator/prometheus | grep circuitbreaker_failure_rate
# Should be below threshold
```
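The status check can be automated, for example in a post-deploy script. The sketch below assumes the actuator response has the shape shown in the expected output above; `breakersNotClosed` is a hypothetical helper name.

```javascript
// Returns the names of circuit breakers that are not in the CLOSED state.
function breakersNotClosed(actuatorJson) {
  const body = JSON.parse(actuatorJson);
  const breakers = body.circuitBreakers || {};
  return Object.entries(breakers)
    .filter(([, info]) => info.state !== "CLOSED")
    .map(([name]) => name);
}

// Response shape taken from the expected output shown above.
const healthy = '{"circuitBreakers":{"payment-service":{"state":"CLOSED"}}}';
console.log(breakersNotClosed(healthy)); // []
```

An empty result means every reported breaker is closed; any returned name points at a service that still needs attention.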
Prevention
To keep the circuit breaker from opening unexpectedly again, put these measures in place:
1. Monitor Circuit Breaker State
groups:
  - name: api-gateway
    rules:
      - alert: CircuitBreakerOpen
        expr: |
          resilience4j_circuitbreaker_state{state="open"} == 1
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Circuit breaker {{ $labels.name }} is open"
2. Configure Appropriate Thresholds
# application.yml - Resilience4j configuration
resilience4j:
  circuitbreaker:
    configs:
      default:
        failureRateThreshold: 50
        slowCallRateThreshold: 100
        slowCallDurationThreshold: 2s
        minimumNumberOfCalls: 10
        waitDurationInOpenState: 30s
        permittedNumberOfCallsInHalfOpenState: 3
        slidingWindowType: COUNT_BASED
        slidingWindowSize: 100
3. Implement Health Endpoints
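```java
// Spring Boot health check for backend
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class HealthController {
    @GetMapping("/health")
    public ResponseEntity<String> health() {
        // Check database connectivity
        // Check downstream services
        return ResponseEntity.ok("OK");
    }
}

// Configure gateway to check health before routing
```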
Best Practices Checklist
- [ ] Monitor circuit breaker state
- [ ] Configure appropriate thresholds
- [ ] Implement health endpoints
- [ ] Test backend services regularly
- [ ] Use fallback strategies
- [ ] Document circuit breaker configuration
Related Issues
- [Fix API Gateway Timeout Backend Slow](/articles/fix-api-gateway-timeout-backend-slow)
- [Fix API Gateway Authentication Bypass](/articles/fix-api-gateway-authentication-bypass)
- [Fix API Gateway SSL Termination Failed](/articles/fix-api-gateway-ssl-termination-failed)