Latency vs Throughput: Optimizing for the right metric

Latency & Throughput

Architects rarely talk about "performance" in the abstract. They talk about Latency and Throughput, two metrics that are often at odds with each other.

1. Latency (The Delay)

The time it takes for a single request to complete, usually measured in milliseconds (ms). Goal: make the system feel instant for a single user (e.g., searching for a movie).
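As a rough illustration of measuring latency, the sketch below times repeated calls to a handler and reports percentiles. The function names and the `fake_search` workload are hypothetical stand-ins, not part of any real system; the percentile uses the simple nearest-rank method.

```python
import math
import time


def measure_latencies_ms(handler, n_requests):
    """Call `handler` n_requests times and return each call's latency in ms."""
    latencies = []
    for _ in range(n_requests):
        start = time.perf_counter()
        handler()
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def percentile(values, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def fake_search():
    # Hypothetical stand-in for a real request such as a movie search.
    sum(range(10_000))


if __name__ == "__main__":
    samples = measure_latencies_ms(fake_search, 200)
    print(f"p50 = {percentile(samples, 50):.3f} ms")
    print(f"p99 = {percentile(samples, 99):.3f} ms")
```

Reporting a high percentile such as p99 alongside the median is a common choice, because a handful of slow requests can be invisible in an average.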

2. Throughput (The Volume)

The amount of work the system completes per unit of time, commonly measured in requests per second (RPS). Goal: handle the combined load of many users at once, such as 1 million simultaneous viewers (e.g., serving movie streams).
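One way to relate the two metrics is Little's Law: in a steady state, throughput equals the number of requests in flight divided by the average latency. The sketch below applies it as a back-of-the-envelope estimate; the function name and the example numbers are illustrative assumptions.

```python
def max_throughput_rps(concurrent_requests, avg_latency_ms):
    """Estimate steady-state throughput (requests/second) via Little's Law.

    throughput = requests in flight / average latency
    """
    if avg_latency_ms <= 0:
        raise ValueError("avg_latency_ms must be positive")
    return concurrent_requests * 1000.0 / avg_latency_ms


if __name__ == "__main__":
    # Assumed numbers: 100 requests in flight, 50 ms average latency.
    print(max_throughput_rps(100, 50))  # 2000.0
```

Under this model, halving latency at the same concurrency doubles the estimated throughput, and so does doubling concurrency at the same latency.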

3. The Trade-off

If you add heavy compression to a file, you increase **Latency** (CPU time to compress, and later to decompress) but, when network bandwidth is the bottleneck, you also increase **Throughput** (smaller data means you can send more files over the same network bandwidth). A great architect knows which one matters more for the current business case.
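The compression trade-off can be sketched with Python's standard `zlib` module. The payload, the compression level, and the 1 MB/s link speed below are made-up assumptions chosen only to make the effect visible; real results depend heavily on the data and hardware.

```python
import time
import zlib


def compress_and_time(payload, level):
    """Compress `payload` with zlib and return (compressed_bytes, cpu_seconds)."""
    start = time.perf_counter()
    compressed = zlib.compress(payload, level)
    return compressed, time.perf_counter() - start


def transfer_seconds(num_bytes, bandwidth_bytes_per_s):
    """Time to push num_bytes through a link of the given bandwidth."""
    return num_bytes / bandwidth_bytes_per_s


if __name__ == "__main__":
    # Hypothetical, highly repetitive payload (compresses very well).
    payload = b"movie-metadata-record;" * 50_000
    bandwidth = 1_000_000  # assumed link speed: 1 MB/s

    compressed, cpu_s = compress_and_time(payload, 9)
    print(f"raw:        {len(payload)} bytes, "
          f"transfer {transfer_seconds(len(payload), bandwidth):.3f} s")
    print(f"compressed: {len(compressed)} bytes, "
          f"cpu {cpu_s:.3f} s + "
          f"transfer {transfer_seconds(len(compressed), bandwidth):.3f} s")
```

The extra CPU time is paid on every request, while the bandwidth saved is what lets the same link carry more files per second.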

4. Interview Mastery

Q: "How do you improve Latency on a Global scale?"

Architect Answer: "We use an **Edge Strategy**. By deploying **CDNs** (Content Delivery Networks) and **Edge Functions**, we move the data and logic physically closer to the user. If the request only has to travel 50 miles to an edge server instead of 5,000 miles to a central data center, the shorter propagation delay alone, which is bounded by the speed of light, provides a massive reduction in latency."
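The distance argument in this answer can be checked with simple arithmetic. The sketch below assumes signals travel through optical fiber at roughly 200,000 km/s (about two thirds of the speed of light in a vacuum) and ignores routing, queuing, and processing time, so it gives a lower bound rather than a realistic measurement.

```python
KM_PER_MILE = 1.609344
# Assumption: signal speed in optical fiber, roughly 2/3 of c in a vacuum.
FIBER_SPEED_KM_PER_S = 200_000.0


def round_trip_ms(distance_miles):
    """Lower-bound round-trip propagation delay in ms for a one-way distance in miles."""
    distance_km = distance_miles * KM_PER_MILE
    one_way_s = distance_km / FIBER_SPEED_KM_PER_S
    return 2 * one_way_s * 1000.0


if __name__ == "__main__":
    print(f"central data center (5,000 miles): {round_trip_ms(5000):.2f} ms")  # about 80.47 ms
    print(f"edge server (50 miles):            {round_trip_ms(50):.2f} ms")    # about 0.80 ms
```

Under these assumptions the edge server saves roughly 80 ms per round trip before any server work is done, and a page that needs several round trips pays that cost each time.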
