An unprotected API is a target for Denial of Service (DoS) attacks. You must implement Rate Limiting to ensure that a single user cannot crash your server by sending 10,000 requests a second.
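The token bucket algorithm is one of the most flexible approaches. You give each user a 'Bucket' that holds a fixed maximum number of tokens. Every request consumes a token, and a request that finds the bucket empty is rejected. The bucket 'Refills' at a steady rate, up to its capacity. This allows users to burst occasionally but prevents sustained high-volume abuse.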
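The bucket mechanics described above can be sketched in a few lines. This is a minimal single-process illustration, not a production implementation: the class name `TokenBucket`, the parameters `capacity` and `refill_rate`, and the injectable `clock` are all assumptions made for the example, and the code is not thread-safe.

```python
import time


class TokenBucket:
    """Holds up to `capacity` tokens, refilled at `refill_rate` tokens per second.

    All names here are hypothetical and chosen for illustration.
    """

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.clock = clock  # injectable so the example can control time
        self.tokens = float(capacity)  # start full so a new user can burst
        self.last_refill = clock()

    def allow(self, cost=1):
        """Return True and consume `cost` tokens if enough are available."""
        now = self.clock()
        elapsed = now - self.last_refill
        # Refill lazily based on elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Usage with a fake clock so the behaviour is deterministic.
now = [0.0]
bucket = TokenBucket(capacity=5, refill_rate=1.0, clock=lambda: now[0])
burst = [bucket.allow() for _ in range(6)]       # burst of 5 passes, 6th is rejected
now[0] += 2.0                                    # two seconds pass: two tokens refilled
after_wait = [bucket.allow() for _ in range(3)]  # 2 pass, 3rd is rejected
```

The bucket size controls how large a burst is tolerated, while the refill rate sets the sustained request rate; the two can be tuned independently.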
Always implement **Global Throttling** to protect your database, and **User-Specific Throttling** (based on API Key or IP) to ensure fair usage among your customers.
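One way to layer the two kinds of throttling is to check a per-user limit first and a shared global limit second. The sketch below uses a simple fixed-window counter rather than a token bucket to keep it short; `FixedWindowLimiter`, `LayeredThrottle`, and every parameter name are hypothetical, and the per-user dictionary grows without bound, which a real service would need to address.

```python
import time
from collections import defaultdict


class FixedWindowLimiter:
    """Allows at most `limit` calls per `window` seconds (fixed-window counter)."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.window_start = clock()
        self.count = 0

    def allow(self):
        now = self.clock()
        if now - self.window_start >= self.window:
            # A new window has begun: reset the counter.
            self.window_start = now
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


class LayeredThrottle:
    """Per-user limit (keyed by API key or IP) plus one global limit."""

    def __init__(self, global_limit, per_user_limit, window=1.0, clock=time.monotonic):
        self.global_limiter = FixedWindowLimiter(global_limit, window, clock)
        self.user_limiters = defaultdict(
            lambda: FixedWindowLimiter(per_user_limit, window, clock)
        )

    def allow(self, user_key):
        # Check the user first, so requests rejected for one abusive user
        # do not eat into the global budget shared by everyone else.
        if not self.user_limiters[user_key].allow():
            return False
        # Note: in this sketch a request rejected by the global limit
        # has already been counted against the user's own quota.
        return self.global_limiter.allow()


now = [0.0]
throttle = LayeredThrottle(global_limit=5, per_user_limit=3, clock=lambda: now[0])
alice = [throttle.allow("alice") for _ in range(4)]  # 4th exceeds her per-user limit
bob = [throttle.allow("bob") for _ in range(3)]      # 3rd exceeds the global limit
now[0] += 1.0                                        # next window: counters reset
alice_later = throttle.allow("alice")
```

Here Alice is stopped by her own quota, while Bob is stopped by the global cap even though he is within his per-user allowance.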
Q: "What is 'Distributed Rate Limiting'?"
Architect Answer: "In-memory rate limiting only works if you have a single server. Suppose your limit is 100 requests per user. If you have 10 servers behind a Load Balancer, each keeping its own counters, a user could send 100 requests to EACH server, 1,000 in total, bypassing your limit. We solve this by using **Redis** to store the rate limit counters centrally. All 10 servers check the same Redis key, ensuring the user is capped at 100 total requests regardless of which server they hit."
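The shared-counter idea in the answer can be sketched as a fixed-window counter built on an atomic increment plus a key expiry. The function `is_allowed`, the key format `ratelimit:<user>:<window>`, and the default limit and window are assumptions for illustration. So that the example runs without a Redis server, it uses `FakeRedis`, a hypothetical in-memory stand-in that mimics only the `incr` and `expire` methods of a redis-py client and does not simulate expiry.

```python
import time


class FakeRedis:
    """In-memory stand-in for the two redis-py client methods used below."""

    def __init__(self):
        self.store = {}

    def incr(self, name, amount=1):
        self.store[name] = self.store.get(name, 0) + amount
        return self.store[name]

    def expire(self, name, seconds):
        return True  # TTL is not simulated in this stand-in


def is_allowed(client, user_key, limit=100, window_seconds=60, now=None):
    """Fixed-window counter: every server increments the same shared key."""
    now = time.time() if now is None else now
    window_id = int(now // window_seconds)
    key = f"ratelimit:{user_key}:{window_id}"  # hypothetical key format
    count = client.incr(key)  # the increment is atomic in Redis
    if count == 1:
        # First request in this window: let the key clean itself up.
        # If the process dies between incr and expire, the key is left
        # without a TTL; the window id in the key keeps that harmless here.
        client.expire(key, window_seconds)
    return count <= limit


# Ten "servers" share one store; each forwards 15 requests from the same user.
shared = FakeRedis()
allowed = sum(
    is_allowed(shared, "user-42", now=0.0)
    for _server in range(10)
    for _request in range(15)
)
```

Of the 150 requests, only 100 are allowed, no matter how they are spread across servers. With a real deployment, a redis-py client would presumably be passed in place of `FakeRedis`. A fixed window can admit up to twice the limit across a window boundary, so stricter setups often use a sliding window or a token bucket stored in Redis instead.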