Rate Limiting & Throttling: Protecting your services

Updated 6/24/2026

Rate Limiting & Throttling

A single buggy script or a malicious client can overwhelm your entire cluster, for example by sending 10,000 requests per second. Rate limiting restricts each user or client to a fair share of traffic. It is the "bodyguard" of your microservices.

1. Fixed Window vs Token Bucket

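Fixed Window: "Allow 100 requests every 60 seconds." The counter resets at the start of each window. (Simple, but it allows bursts at the window boundary: a client can send 100 requests at the end of one minute and 100 more at the start of the next.)
Token Bucket: Each client has a bucket that refills with tokens at a steady rate, up to a fixed capacity. Every request spends one token, so saved-up tokens allow a small burst, but once the bucket is empty, requests are rejected until it refills. (Modern and fair.)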
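
To make the difference concrete, here is a minimal sketch of both algorithms. It is not the .NET framework's implementation: the class names FixedWindowLimiter and TokenBucketLimiter, the bucket capacity of 10, and passing the current time in as a number of seconds are all assumptions made to keep the example small and deterministic.

```csharp
using System;

// Demo: 100 requests at t=59s, then 100 more at t=60s (one second later).
var window = new FixedWindowLimiter(limit: 100, windowSeconds: 60);
var bucket = new TokenBucketLimiter(capacity: 10, refillPerSecond: 100.0 / 60.0);

int allowedFixed = 0, allowedBucket = 0;
foreach (double now in new[] { 59.0, 60.0 })
{
    for (int i = 0; i < 100; i++)
    {
        if (window.TryAcquire(now)) allowedFixed++;
        if (bucket.TryAcquire(now)) allowedBucket++;
    }
}

Console.WriteLine(allowedFixed);  // 200: the boundary burst doubles the intended limit
Console.WriteLine(allowedBucket); // 11: a burst of 10, plus 1 token refilled after a second

// Fixed window: one counter that resets whenever a new window starts.
public sealed class FixedWindowLimiter
{
    private readonly int _limit;
    private readonly double _windowSeconds;
    private long _windowIndex = -1;
    private int _count;

    public FixedWindowLimiter(int limit, double windowSeconds)
    {
        _limit = limit;
        _windowSeconds = windowSeconds;
    }

    // nowSeconds is supplied by the caller (hypothetical clock) to keep the sketch testable.
    public bool TryAcquire(double nowSeconds)
    {
        long index = (long)Math.Floor(nowSeconds / _windowSeconds);
        if (index != _windowIndex)
        {
            _windowIndex = index; // a new window began: reset the counter
            _count = 0;
        }
        if (_count >= _limit) return false;
        _count++;
        return true;
    }
}

// Token bucket: tokens accumulate at a steady rate up to a fixed capacity.
public sealed class TokenBucketLimiter
{
    private readonly double _capacity;
    private readonly double _refillPerSecond;
    private double _tokens;
    private double _lastRefill;

    public TokenBucketLimiter(double capacity, double refillPerSecond)
    {
        _capacity = capacity;
        _refillPerSecond = refillPerSecond;
        _tokens = capacity; // the bucket starts full
    }

    public bool TryAcquire(double nowSeconds)
    {
        // Add the tokens earned since the last call, capped at capacity.
        double elapsed = Math.Max(0, nowSeconds - _lastRefill);
        _tokens = Math.Min(_capacity, _tokens + elapsed * _refillPerSecond);
        _lastRefill = Math.Max(_lastRefill, nowSeconds);

        if (_tokens < 1) return false;
        _tokens -= 1; // each request spends one token
        return true;
    }
}
```

Both limiters target roughly 100 requests per minute, but the fixed window lets 200 through around the boundary, while the token bucket caps the burst at its capacity. Neither class is thread-safe; a real limiter would need locking or atomic operations.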

2. Implementing in .NET

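Starting with .NET 7, rate limiting middleware is built into ASP.NET Core (the Microsoft.AspNetCore.RateLimiting namespace). You register policies in Program.cs with AddRateLimiter, then apply them globally or per endpoint, for example with the [EnableRateLimiting] attribute or the RequireRateLimiting extension method.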

app.UseRateLimiter(); // Adds the built-in rate limiting middleware to the request pipeline
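
For context, a fuller Program.cs might look roughly like the sketch below. The policy names "fixed" and "token", the /orders and /search endpoints, and all limit values are assumptions chosen for illustration; the calls themselves come from Microsoft.AspNetCore.RateLimiting and System.Threading.RateLimiting.

```csharp
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    // The middleware rejects with 503 by default; 429 is the conventional status.
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    // Hypothetical policy "fixed": 100 requests per 60-second window, no queueing.
    options.AddFixedWindowLimiter("fixed", o =>
    {
        o.PermitLimit = 100;
        o.Window = TimeSpan.FromSeconds(60);
        o.QueueLimit = 0;
    });

    // Hypothetical policy "token": bursts of up to 20, refilled at 10 tokens per second.
    options.AddTokenBucketLimiter("token", o =>
    {
        o.TokenLimit = 20;
        o.TokensPerPeriod = 10;
        o.ReplenishmentPeriod = TimeSpan.FromSeconds(1);
        o.QueueLimit = 0;
        o.AutoReplenishment = true;
    });
});

var app = builder.Build();

app.UseRateLimiter();

// Per-endpoint policies on minimal API endpoints (example routes, not from the article).
app.MapGet("/orders", () => "orders").RequireRateLimiting("fixed");
app.MapGet("/search", () => "search").RequireRateLimiting("token");

app.Run();
```

Note that named policies as written here share one limiter across all callers. Limiting per user or per IP address would need a partitioned policy, which this sketch leaves out.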

4. Interview Mastery

Q: "What is the difference between Rate Limiting and Throttling?"

Architect Answer: "The difference is how the system handles the overflow. **Rate Limiting** is hard: once a client hits the limit, further requests are rejected immediately with a '429 Too Many Requests' error. **Throttling** is soft: instead of rejecting, the system slows that client down, typically by queueing or delaying its requests, which discourages it from sending more. Rate limiting is for protection; throttling is for traffic management."
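
One way to see the hard versus soft distinction in code is through the QueueLimit option of the System.Threading.RateLimiting limiters. The sketch below is an illustration under assumptions, not a canonical definition of either term: the Create helper and the limit values are hypothetical, and it assumes the System.Threading.RateLimiting package is referenced.

```csharp
using System;
using System.Threading.RateLimiting;

// Hypothetical helper: a bucket holding 1 token, refilled with 1 token every 200 ms.
static TokenBucketRateLimiter Create(int queueLimit) => new(new TokenBucketRateLimiterOptions
{
    TokenLimit = 1,
    TokensPerPeriod = 1,
    ReplenishmentPeriod = TimeSpan.FromMilliseconds(200),
    QueueLimit = queueLimit,
    QueueProcessingOrder = QueueProcessingOrder.OldestFirst,
    AutoReplenishment = true
});

using var hard = Create(queueLimit: 0);  // no queue: overflow is rejected
using var soft = Create(queueLimit: 10); // queue: overflow waits for a token

// "Rate limiting": the second request finds the bucket empty and fails at once.
using RateLimitLease h1 = hard.AttemptAcquire();
using RateLimitLease h2 = hard.AttemptAcquire();
Console.WriteLine($"hard: first={h1.IsAcquired}, second={h2.IsAcquired}");

// "Throttling": the second request is queued and should complete
// once the next token is replenished, roughly 200 ms later.
using RateLimitLease s1 = await soft.AcquireAsync();
using RateLimitLease s2 = await soft.AcquireAsync();
Console.WriteLine($"soft: first={s1.IsAcquired}, second={s2.IsAcquired}");
```

In the hard case the caller would translate the failed lease into a 429 response; in the soft case the caller simply experiences a slower response.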
