Latency vs Throughput: Optimizing for the right metric

Updated 6/27/2026

Latency & Throughput

Architects rarely talk about "performance" in the abstract. They talk about Latency and Throughput, two metrics that are often at odds with each other.

1. Latency (The Delay)

Latency is the time it takes to complete a single request, usually measured in milliseconds (ms). Goal: Make the system feel instant for a single user (e.g., searching for a movie).
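As a rough illustration of measuring latency, the sketch below times each request individually and reports the median and a tail value. The names `measure_latency_ms`, `summarize`, and `fake_search` are hypothetical, and `fake_search` only simulates work with a short sleep; a real measurement would call the actual service.

```python
import statistics
import time


def measure_latency_ms(handler, n_requests=200):
    """Call handler n_requests times and return each call's latency in ms."""
    latencies = []
    for _ in range(n_requests):
        start = time.perf_counter()
        handler()
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def summarize(latencies_ms):
    """Return median, approximate 99th percentile, and worst-case latency."""
    ordered = sorted(latencies_ms)
    # Simple nearest-rank style index, clamped to the last element.
    p99_index = min(len(ordered) - 1, int(0.99 * len(ordered)))
    return {
        "p50_ms": statistics.median(ordered),
        "p99_ms": ordered[p99_index],
        "max_ms": ordered[-1],
    }


def fake_search():
    """Hypothetical stand-in for a movie search; sleeps about 1 ms."""
    time.sleep(0.001)


if __name__ == "__main__":
    print(summarize(measure_latency_ms(fake_search)))
```

Reporting a tail value alongside the median is a common choice because a handful of slow requests can hurt the "feels instant" goal even when the typical request is fast.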

2. Throughput (The Volume)

Throughput is the number of requests the system can handle per unit of time, usually expressed as requests per second (RPS). Goal: Handle the load of many users at once (e.g., serving movie streams to 1 million simultaneous viewers).
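A minimal sketch of measuring throughput, assuming a hypothetical handler `fake_stream_chunk` that simulates 10 ms of I/O: it counts completed requests and divides by wall-clock time. The function name `measure_throughput_rps` and the worker counts are illustrative assumptions, not part of any particular framework.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def measure_throughput_rps(handler, n_requests, workers):
    """Run n_requests calls across a thread pool and return requests/second."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(handler) for _ in range(n_requests)]
        for future in futures:
            future.result()  # wait for completion and surface any exception
    elapsed = time.perf_counter() - start
    return n_requests / elapsed


def fake_stream_chunk():
    """Hypothetical stand-in for serving one chunk; sleeps about 10 ms."""
    time.sleep(0.01)


if __name__ == "__main__":
    for workers in (1, 10):
        rps = measure_throughput_rps(fake_stream_chunk, n_requests=100, workers=workers)
        print(f"{workers} worker(s): {rps:.0f} requests/second")
```

In this sketch each request still takes about 10 ms no matter how many workers run, so adding workers raises throughput without lowering the latency of any single request.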

3. The Trade-off

If you add heavy compression to a file, you increase **Latency** for each request (CPU time to compress before sending and decompress on arrival) but you also increase **Throughput** (smaller payloads mean you can send more files over the same network bandwidth). A great architect knows which one is more important for the current business case.
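The compression trade-off can be sketched with back-of-the-envelope arithmetic. Every number below is an assumption chosen for illustration (a 10 MB file, a link carrying 1 GB/s, a 4:1 compression ratio, 50 ms of CPU time per file), and the model assumes the network link, not the CPU, is the bottleneck for throughput.

```python
def delivery_seconds(size_bytes, bandwidth_bytes_per_s, cpu_seconds=0.0):
    """Per-file latency: CPU time spent on compression plus time on the wire."""
    return cpu_seconds + size_bytes / bandwidth_bytes_per_s


def files_per_second(size_bytes, bandwidth_bytes_per_s):
    """Link-limited throughput: how many files fit through the link per second."""
    return bandwidth_bytes_per_s / size_bytes


if __name__ == "__main__":
    # All values are illustrative assumptions, not measurements.
    raw_size = 10_000_000          # 10 MB file
    bandwidth = 1_000_000_000      # link carries 1 GB per second
    compressed_size = raw_size // 4  # assumed 4:1 compression ratio
    cpu_cost = 0.05                # assumed 50 ms of CPU time per file

    raw_latency = delivery_seconds(raw_size, bandwidth)
    zip_latency = delivery_seconds(compressed_size, bandwidth, cpu_cost)
    print(f"latency raw:        {raw_latency * 1000:.1f} ms")   # 10.0 ms
    print(f"latency compressed: {zip_latency * 1000:.1f} ms")   # 52.5 ms

    print(f"throughput raw:        {files_per_second(raw_size, bandwidth):.0f} files/s")         # 100
    print(f"throughput compressed: {files_per_second(compressed_size, bandwidth):.0f} files/s")  # 400
```

Under these assumptions each file takes longer to arrive, yet the same link carries four times as many files. On a much slower link the transfer time dominates, and compression can lower latency as well, which is why the right choice depends on the business case.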

4. Interview Mastery

Q: "How do you improve Latency on a Global scale?"

Architect Answer: "We use an **Edge Strategy**. By deploying **CDNs** (Content Delivery Networks) and **Edge Functions**, we move the data and logic physically closer to the user. No signal can travel faster than the speed of light, so if the request only has to travel 50 miles to an edge server instead of 5,000 miles to a central data center, the propagation delay alone drops by roughly a factor of 100."
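The distance argument in the answer can be checked with simple arithmetic. The sketch below assumes signals travel through optical fiber at roughly 200,000 km/s (about two thirds of the speed of light in a vacuum) and ignores routing detours, queuing, and server processing time, so real round trips will be longer. The function name `round_trip_ms` is hypothetical.

```python
KM_PER_MILE = 1.609344
# Assumption: signal speed in optical fiber, roughly 2/3 of c in a vacuum.
FIBER_SPEED_KM_PER_S = 200_000.0


def round_trip_ms(distance_miles):
    """Best-case round-trip propagation delay in ms for a one-way distance."""
    distance_km = distance_miles * KM_PER_MILE
    return 2 * distance_km / FIBER_SPEED_KM_PER_S * 1000.0


if __name__ == "__main__":
    print(f"edge server, 50 miles:      {round_trip_ms(50):.1f} ms")    # 0.8 ms
    print(f"central data center, 5,000: {round_trip_ms(5000):.1f} ms")  # 80.5 ms
```

Under these assumptions the edge server saves about 80 ms per round trip, and a page that needs several sequential round trips saves that amount each time.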
