Rate Limiting & Throttling: Protecting your services
A single buggy script or a malicious attacker can overwhelm your entire cluster by making 10,000 requests per second. Rate limiting caps how much traffic each user or client may send in a given period, so no single caller can starve the others. It is the "Bodyguard" of your microservices.
1. Fixed Window vs Token Bucket
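- Fixed Window: "Allow 100 requests every 60 seconds." (Simple, but it allows bursts at window boundaries: a client can send 100 requests at the end of one window and 100 more at the start of the next, roughly 200 requests within a few seconds).
- Token Bucket: Each client's bucket is refilled with 'tokens' at a steady rate, up to a fixed capacity, and every request spends one token. Clients can save up tokens for a small burst, but once the bucket is empty, their requests are blocked until more tokens arrive. (Smooths traffic while still allowing short bursts).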
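To make the token bucket rule concrete, here is a minimal sketch in C#. The `TokenBucket` class, its parameter names, and the injected clock are hypothetical and written only for illustration; they are not part of .NET. The sketch assumes one bucket per client and a cost of one token per request.

```csharp
using System;

// Demo with a fake clock so the behaviour does not depend on real time.
DateTime t = new DateTime(2024, 1, 1, 0, 0, 0, DateTimeKind.Utc);
var bucket = new TokenBucket(capacity: 3, refillPerSecond: 1, now: () => t);

// The bucket starts full, so a burst of 3 passes and the 4th request is rejected.
for (int i = 1; i <= 4; i++)
    Console.WriteLine($"request {i}: allowed = {bucket.TryConsume()}");

// Two seconds later, two tokens have been refilled.
t = t.AddSeconds(2);
for (int i = 5; i <= 7; i++)
    Console.WriteLine($"request {i}: allowed = {bucket.TryConsume()}");

// Hypothetical illustration class, not a framework type.
public sealed class TokenBucket
{
    private readonly object _gate = new object();
    private readonly double _capacity;
    private readonly double _refillPerSecond;
    private readonly Func<DateTime> _now;
    private double _tokens;
    private DateTime _lastRefill;

    public TokenBucket(double capacity, double refillPerSecond, Func<DateTime> now)
    {
        _capacity = capacity;
        _refillPerSecond = refillPerSecond;
        _now = now;
        _tokens = capacity;      // start full so an idle client can burst
        _lastRefill = now();
    }

    // Returns true if the request may proceed, false if the bucket is empty.
    public bool TryConsume()
    {
        lock (_gate)
        {
            DateTime current = _now();
            double elapsedSeconds = (current - _lastRefill).TotalSeconds;

            // Refill in proportion to elapsed time, never above capacity.
            _tokens = Math.Min(_capacity, _tokens + elapsedSeconds * _refillPerSecond);
            _lastRefill = current;

            if (_tokens < 1) return false;
            _tokens -= 1;
            return true;
        }
    }
}
```

The capacity bounds the largest burst, while the refill rate bounds the long-run average. A fixed window has only a single number for both, which is why it cannot limit the burst at a window boundary.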
2. Implementing in .NET
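Starting with .NET 7, rate limiting is built into ASP.NET Core. You configure a global limiter and named policies in Program.cs with AddRateLimiter, then apply a named policy to specific endpoints with RequireRateLimiting or the [EnableRateLimiting] attribute.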
```csharp
app.UseRateLimiter(); // Native .NET rate limiting middleware
```
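For context, a fuller Program.cs might look like the sketch below. The policy names ("per-client", "search"), the routes, and all numeric limits are assumptions chosen for illustration, and keying clients by remote IP address is only one possible choice. It assumes an ASP.NET Core 7+ web project.

```csharp
using System;
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.RateLimiting;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    // The middleware rejects with 503 by default; 429 is the conventional status.
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    // Fixed window, partitioned so each client IP gets its own counter.
    // (Assumption: the remote IP is an acceptable client key for this service.)
    options.AddPolicy("per-client", httpContext =>
        RateLimitPartition.GetFixedWindowLimiter(
            partitionKey: httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown",
            factory: _ => new FixedWindowRateLimiterOptions
            {
                PermitLimit = 100,
                Window = TimeSpan.FromSeconds(60),
                QueueLimit = 0
            }));

    // Token bucket shared by every caller of the endpoints that use this policy.
    options.AddTokenBucketLimiter("search", o =>
    {
        o.TokenLimit = 20;                                  // maximum burst size
        o.TokensPerPeriod = 10;                             // tokens added per period
        o.ReplenishmentPeriod = TimeSpan.FromSeconds(1);
        o.QueueLimit = 0;                                   // reject instead of queueing
    });
});

var app = builder.Build();

app.UseRateLimiter();

// Hypothetical endpoints, each opting in to a named policy.
app.MapGet("/orders", () => "orders").RequireRateLimiting("per-client");
app.MapGet("/search", () => "search").RequireRateLimiting("search");

app.Run();
```

In controller-based apps, the same named policies can be applied with `[EnableRateLimiting("per-client")]` on a controller or action. Note that a policy registered with `AddTokenBucketLimiter` or `AddFixedWindowLimiter` is not partitioned, so all callers share one limit; per-client limits need a partitioned policy such as the `AddPolicy` call above.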
3. Interview Mastery
Q: "What is the difference between Rate Limiting and Throttling?"
Architect Answer: "The terms are often used interchangeably, but a useful distinction is how the system handles overflow. **Rate Limiting** is hard: once a client exceeds the limit, further requests are rejected immediately, typically with a '429 Too Many Requests' error. **Throttling** is soft: the system delays or queues that client's excess requests, so they are served more slowly instead of being rejected. Rate limiting is for protection; throttling is for traffic management."
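One way to see the hard versus soft distinction in code is through the `QueueLimit` setting of the limiters in `System.Threading.RateLimiting`. The sketch below is a small console program, assuming that package is referenced; the limits and window length are arbitrary illustration values. With `QueueLimit = 0` the overflow request is refused at once, while with a queue it waits for the next window.

```csharp
using System;
using System.Diagnostics;
using System.Threading.RateLimiting;

// Hard limit: no queue, so anything over the limit is rejected immediately.
using var hard = new FixedWindowRateLimiter(new FixedWindowRateLimiterOptions
{
    PermitLimit = 2,
    Window = TimeSpan.FromSeconds(1),
    QueueLimit = 0
});

for (int i = 1; i <= 3; i++)
{
    // AttemptAcquire never waits; the third lease should come back not acquired.
    using RateLimitLease lease = hard.AttemptAcquire();
    Console.WriteLine($"hard request {i}: acquired = {lease.IsAcquired}");
}

// Soft limit: overflow requests wait in a queue until the window is replenished.
using var soft = new FixedWindowRateLimiter(new FixedWindowRateLimiterOptions
{
    PermitLimit = 2,
    Window = TimeSpan.FromSeconds(1),
    QueueLimit = 10,
    QueueProcessingOrder = QueueProcessingOrder.OldestFirst
});

var stopwatch = Stopwatch.StartNew();
for (int i = 1; i <= 3; i++)
{
    // AcquireAsync waits when no permit is available and there is queue space.
    using RateLimitLease lease = await soft.AcquireAsync();
    Console.WriteLine(
        $"soft request {i}: acquired = {lease.IsAcquired} after {stopwatch.ElapsedMilliseconds} ms");
}
```

The third "soft" request should succeed, but only after roughly one window has passed, which is the slowdown the answer above calls throttling. In the ASP.NET Core middleware, a named policy with a non-zero `QueueLimit` gives the same queue-then-serve behaviour for HTTP requests.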