In a microservices world, calls to remote dependencies will eventually fail because of a network timeout or a remote service crash. Resilience patterns keep your application alive even when its dependencies are failing. We primarily use the Polly library in .NET to implement them.
If a network call fails, don't just crash. Wait briefly (for example, 1 second) and try again. This handles "transient" errors: short-lived network blips or momentary overloads that clear up on their own.
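using System;
using System.Net.Http;
using Polly;
// Retry up to 3 times, doubling the wait each time: 1s, 2s, 4s (exponential backoff).
var policy = Policy.Handle<HttpRequestException>() // retry only failures that are likely transient
    .WaitAndRetryAsync(3, attempt => TimeSpan.FromSeconds(Math.Pow(2, attempt - 1)));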
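To show how a retry policy like this is applied to a call, here is a minimal sketch. It assumes the Polly v7-style API and a C# project with top-level statements. The `attempts` counter and the simulated failure are hypothetical stand-ins for a real network call, and the delays are shortened to milliseconds so the demo finishes quickly.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;
using Polly;

// Retry up to 3 times with doubling delays: 100 ms, 200 ms, 400 ms.
var retryPolicy = Policy
    .Handle<HttpRequestException>()
    .WaitAndRetryAsync(3, attempt => TimeSpan.FromMilliseconds(100 * Math.Pow(2, attempt - 1)));

int attempts = 0;

// Hypothetical flaky operation: fails on the first two attempts, then succeeds.
string result = await retryPolicy.ExecuteAsync(() =>
{
    attempts++;
    if (attempts < 3)
        throw new HttpRequestException("simulated transient failure");
    return Task.FromResult("OK");
});

Console.WriteLine($"{result} after {attempts} attempts"); // prints "OK after 3 attempts"
```

The caller only ever sees the final outcome. If all retries are exhausted, the policy rethrows the last exception, so the calling code still needs its own error handling.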
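If a remote service is truly down, retrying 1,000 times just ties up your own server's resources and piles more load onto a service that is already struggling. A Circuit Breaker counts failures; after a configured number of consecutive failures it "trips" (opens) and rejects all calls immediately for a set period, for example 30 seconds. This gives the remote service time to recover and gives your users a fast "Service Unavailable" response instead of a long wait. When the period ends, the breaker lets a trial call through: success closes the circuit, and failure opens it again.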
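The behavior described above can be sketched with Polly's circuit breaker. This is a minimal illustration, assuming the Polly v7-style API and top-level statements. The threshold of 2 failures is an arbitrary choice for the demo, and `CallRemoteAsync` is a hypothetical stand-in for a service that is down.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;
using Polly;
using Polly.CircuitBreaker;

// Open the circuit after 2 consecutive failures and keep it open for 30 seconds.
var breaker = Policy
    .Handle<HttpRequestException>()
    .CircuitBreakerAsync(2, TimeSpan.FromSeconds(30));

int remoteCalls = 0;

// Hypothetical remote call that always fails, simulating an outage.
Task<string> CallRemoteAsync()
{
    remoteCalls++;
    return Task.FromException<string>(new HttpRequestException("simulated outage"));
}

for (int i = 1; i <= 4; i++)
{
    try
    {
        await breaker.ExecuteAsync(() => CallRemoteAsync());
    }
    catch (BrokenCircuitException)
    {
        // Circuit is open: the call was rejected without touching the remote service.
        Console.WriteLine($"Call {i}: rejected fast, circuit is {breaker.CircuitState}");
    }
    catch (HttpRequestException)
    {
        // The call reached the remote service and failed.
        Console.WriteLine($"Call {i}: remote call failed, circuit is {breaker.CircuitState}");
    }
}

Console.WriteLine($"Remote service was actually called {remoteCalls} times");
```

Only the first two calls reach the failing service. Once the circuit opens, the remaining calls fail immediately with `BrokenCircuitException`, which the application can translate into a "Service Unavailable" response.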
Q: "What is Exponential Backoff, and why is it better than a simple loop?"
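Architect Answer: "Exponential Backoff is the practice of increasing the wait time between retries, typically by doubling it (e.g., 1s, 2s, 4s, 8s). This is superior to a simple loop because a fixed, tight retry interval keeps constant pressure on a struggling resource. If 1,000 servers all retry a failed database every 1 second, they will essentially DDoS the database and prevent it from ever recovering. By 'backing off', we give the resource the breathing room it needs to heal. In practice we also add random jitter to each delay; otherwise all clients back off on the same schedule and still retry in synchronized waves, which is the 'Thundering Herd' problem."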
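The delay schedule from the answer can be computed in a few lines. The sketch below is dependency-free and assumes top-level statements. `BackoffWithJitter` is a hypothetical helper, not a Polly API. It also applies one common refinement, random "full jitter", so that many clients do not all retry at the same instant; the 30-second cap is an arbitrary illustrative value.

```csharp
using System;

// Hypothetical helper: delay to wait before the given retry attempt (1-based).
// The exponential ceiling grows as baseDelay * 2^(attempt - 1), capped at maxDelay.
// "Full jitter" then picks a random delay between zero and that ceiling.
static TimeSpan BackoffWithJitter(int attempt, TimeSpan baseDelay, TimeSpan maxDelay, Random rng)
{
    double exponentialMs = baseDelay.TotalMilliseconds * Math.Pow(2, attempt - 1);
    double cappedMs = Math.Min(exponentialMs, maxDelay.TotalMilliseconds);
    return TimeSpan.FromMilliseconds(rng.NextDouble() * cappedMs);
}

var rng = new Random();
for (int attempt = 1; attempt <= 5; attempt++)
{
    TimeSpan delay = BackoffWithJitter(attempt, TimeSpan.FromSeconds(1), TimeSpan.FromSeconds(30), rng);
    Console.WriteLine($"Attempt {attempt}: wait {delay.TotalSeconds:F2}s");
}
```

With a 1-second base, the ceilings are 1s, 2s, 4s, 8s, and 16s, and each client lands at a different random point under its ceiling. A function like this can be passed to `WaitAndRetryAsync` as the delay provider in place of a fixed formula.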