Latency vs Throughput: Optimizing for the right metric
Latency & Throughput
Architects rarely talk about "performance" in the abstract. They talk about Latency and Throughput. These two metrics often pull against each other: optimizing for one can cost you the other.
1. Latency (The Delay)
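The time it takes for a single request to complete, from the moment it is sent until the response arrives. Usually measured in milliseconds (ms) and often reported as percentiles (p50, p99) rather than a single average. Goal: Make it feel instant for a single user (e.g., searching for a movie).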
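To make the definition concrete, here is a minimal sketch of timing individual requests and summarizing them as percentiles. The `fake_search` function is a hypothetical stand-in for a real request, and the nearest-rank percentile is just one common convention, not the only one.

```python
import math
import time


def measure_latencies_ms(request_fn, runs=100):
    """Call request_fn `runs` times and return each call's duration in ms."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        request_fn()
        samples.append((time.perf_counter() - start) * 1000)
    return samples


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def fake_search():
    # Hypothetical stand-in for a real request (e.g., a movie search call).
    sum(range(10_000))


samples = measure_latencies_ms(fake_search, runs=200)
print(f"p50 = {percentile(samples, 50):.3f} ms, p99 = {percentile(samples, 99):.3f} ms")
```

The p99 figure is usually the more interesting one: it describes the slow requests that an average would hide.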
2. Throughput (The Volume)
The number of requests the system can handle per unit of time, typically measured in requests per second (RPS). Goal: Serve the combined load of many users at once (e.g., streaming movies to 1 million users simultaneously).
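As a rough illustration, throughput can be computed from a measurement window, or estimated from concurrency and latency using Little's Law (in a steady state, requests in flight = throughput × average latency). The function names and the numbers below are illustrative assumptions, not figures from a real system.

```python
def throughput_rps(completed_requests, elapsed_seconds):
    """Observed throughput: requests completed per second over a measurement window."""
    return completed_requests / elapsed_seconds


def littles_law_rps(concurrent_requests, avg_latency_ms):
    """Steady-state throughput estimate from Little's Law: concurrency / latency."""
    return concurrent_requests * 1000 / avg_latency_ms


# Assumed example: 6,000 requests completed during a 60-second window.
print(throughput_rps(6_000, 60))

# Assumed example: 100 requests in flight, each taking 50 ms on average.
print(littles_law_rps(100, 50))
```

The second function shows why the two metrics are linked: at fixed concurrency, halving latency doubles the throughput the system can sustain.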
3. The Trade-off
If you add heavy compression to a file, you increase **Latency** (each request now pays the CPU time to compress and decompress), but you also increase **Throughput** when the network is the bottleneck (smaller payloads mean more files fit through the same bandwidth). A great architect knows which of the two matters more for the current business case.
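The compression trade-off can be sketched with Python's standard `zlib` module. The payload and the 1 Gbit/s link speed are assumptions chosen for illustration. Real payloads compress far less than this repetitive sample, and the measured CPU time will vary by machine.

```python
import time
import zlib


def compress_and_time(payload, level=9):
    """Compress payload and return (compressed_bytes, cpu_time_ms)."""
    start = time.perf_counter()
    compressed = zlib.compress(payload, level)
    cpu_ms = (time.perf_counter() - start) * 1000
    return compressed, cpu_ms


def files_per_second(link_bytes_per_s, file_size_bytes):
    """How many files of this size fit through the link each second."""
    return link_bytes_per_s / file_size_bytes


# Assumed sample payload: highly repetitive, so it compresses unusually well.
payload = b"movie metadata record; " * 50_000
compressed, cpu_ms = compress_and_time(payload)

LINK_BYTES_PER_S = 125_000_000  # assumed 1 Gbit/s link

print(f"added latency per file: {cpu_ms:.2f} ms of compression time")
print(f"raw:        {files_per_second(LINK_BYTES_PER_S, len(payload)):.1f} files/s")
print(f"compressed: {files_per_second(LINK_BYTES_PER_S, len(compressed)):.1f} files/s")
```

Each file pays the compression time once (a latency cost), while every transfer afterwards benefits from the smaller size (a throughput gain). If the CPU rather than the network is the bottleneck, the same change can reduce throughput instead.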
4. Interview Mastery
Q: "How do you improve Latency on a Global scale?"
Architect Answer: "We use an **Edge Strategy**. By deploying **CDNs** (Content Delivery Networks) and **Edge Functions**, we move the data and logic physically closer to the user. Signals in a network cannot travel faster than light, so distance sets a hard floor on latency. If the request only has to travel 50 miles to an edge server instead of 5,000 miles to a central data center, propagation delay alone drops from tens of milliseconds per round trip to under one millisecond, before any server-side optimization."
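A back-of-the-envelope calculation shows the size of this effect. The sketch below assumes signals travel through optical fiber at roughly 200,000 km/s (about two thirds of the speed of light in a vacuum) and that the cable follows the straight-line distance. Real routes are longer and add routing and processing delays, so treat the results as lower bounds.

```python
FIBER_KM_PER_S = 200_000  # assumed signal speed in optical fiber
KM_PER_MILE = 1.609344


def round_trip_ms(distance_miles):
    """Minimum round-trip propagation delay in ms for a given one-way distance."""
    distance_km = distance_miles * KM_PER_MILE
    return 2 * distance_km / FIBER_KM_PER_S * 1000


central = round_trip_ms(5_000)  # central data center
edge = round_trip_ms(50)        # nearby edge server

print(f"central data center: {central:.1f} ms per round trip")
print(f"edge server:         {edge:.2f} ms per round trip")
```

Under these assumptions, the distant data center costs roughly 80 ms per round trip against under 1 ms for the edge server. A page load that needs several sequential round trips multiplies that difference.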