Zero Downtime Deployments: Blue-Green vs Canary
Zero Downtime Strategies
Users expect the site to be available around the clock. You can't take it down for "maintenance." You must be able to deploy new code while users are actively buying products.
1. Blue-Green Deployment
You have two identical environments. **Blue** is live. You deploy the new code to **Green**, which receives no public traffic. You test it. If it passes, you flip the switch on the Load Balancer and all traffic goes to Green. If a bug is found, you flip back to Blue within seconds, because Blue is still running the old version. Rollback is simple, but you pay for roughly 2x the infrastructure while both environments are up.
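The switch and rollback described above can be sketched in a few lines of Python. This is a toy model, not a real load balancer: `LoadBalancer`, `blue_app`, and `green_app` are hypothetical names invented for illustration, and a production setup would change a router, DNS record, or target group instead of a dictionary key.

```python
from dataclasses import dataclass


@dataclass
class LoadBalancer:
    """Toy load balancer that sends all traffic to one environment (hypothetical)."""
    environments: dict  # environment name -> request handler
    live: str = "blue"

    def handle(self, request: str) -> str:
        # Every request goes to whichever environment is currently live.
        return self.environments[self.live](request)

    def switch_to(self, name: str) -> str:
        """Point traffic at `name` and return the previous live name for rollback."""
        if name not in self.environments:
            raise KeyError(f"unknown environment: {name}")
        previous, self.live = self.live, name
        return previous


def blue_app(request: str) -> str:
    return f"v1 handled {request}"  # old version


def green_app(request: str) -> str:
    return f"v2 handled {request}"  # new version


lb = LoadBalancer({"blue": blue_app, "green": green_app})

# Test Green privately before it receives any user traffic.
assert green_app("healthcheck") == "v2 handled healthcheck"

previous = lb.switch_to("green")         # flip the switch
after_switch = lb.handle("GET /cart")    # served by v2

lb.switch_to(previous)                   # a bug was found: flip back
after_rollback = lb.handle("GET /cart")  # served by v1 again
```

The point of the sketch is that Blue keeps running untouched, so rollback is the same cheap operation as the cutover.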
2. Canary Deployment
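You roll out the new code to only a small slice of users, for example 1%. You monitor error rates and latency for that group. If the canary stays healthy, you go to 10%, then 50%, then 100%. If it does not, you route those users back to the old version. This is how companies such as **Facebook and Netflix** test risky new features while limiting the impact of a bad release.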
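One common way to pick "1% of users" is to hash each user ID into a stable bucket, so the same user keeps seeing the same version as the percentage grows. The sketch below assumes that approach. The function names, the 1/10/50/100 stage list, and the 1% error threshold are illustrative assumptions, not values from any particular company's rollout tooling.

```python
import hashlib


def bucket(user_id: str) -> int:
    """Map a user ID to a stable bucket in the range 0..9999."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 10_000


def routes_to_canary(user_id: str, canary_percent: float) -> bool:
    """True if this user should be served by the new version."""
    # 10,000 buckets, so each percent corresponds to 100 buckets.
    return bucket(user_id) < canary_percent * 100


def next_stage(current_percent, error_rate, max_error_rate=0.01,
               stages=(1, 10, 50, 100)):
    """Return the next rollout percentage, or 0 to roll back.

    `max_error_rate` is an assumed threshold; pick one that fits your service.
    """
    if error_rate > max_error_rate:
        return 0  # canary is unhealthy: send everyone back to the old version
    for stage in stages:
        if stage > current_percent:
            return stage
    return current_percent  # already fully rolled out
```

Because the bucket depends only on the user ID, anyone in the 1% group is also in the 10% group, which avoids users flipping between versions as the rollout widens.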
3. Rolling Update
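Replace servers a few at a time, waiting for each new instance to pass its health checks, until the whole cluster is updated. No second environment is needed, but old and new versions serve traffic side by side during the rollout. This is the default strategy for **Kubernetes** Deployments.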
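As a rough illustration of the loop behind a rolling update, the sketch below models a cluster as a list of version strings. It is a simulation only: `rolling_update` and `is_healthy` are hypothetical names, and an orchestrator such as Kubernetes does far more (draining connections, readiness probes, surge capacity).

```python
def rolling_update(servers, new_version, is_healthy, batch_size=1):
    """Replace servers batch by batch, halting at the first unhealthy batch.

    `servers` is a list of version strings and is mutated in place.
    Returns True if the whole cluster was updated, False if the rollout halted.
    """
    for start in range(0, len(servers), batch_size):
        batch = range(start, min(start + batch_size, len(servers)))
        for i in batch:
            # In a real system: drain the server, replace it, wait until ready.
            servers[i] = new_version
        if not all(is_healthy(servers[i]) for i in batch):
            return False  # stop here; the remaining servers keep the old version
    return True


cluster = ["v1", "v1", "v1", "v1"]
finished = rolling_update(cluster, "v2", is_healthy=lambda version: True)

# A bad release is stopped after the first batch fails its health check.
cluster_bad = ["v1", "v1", "v1", "v1"]
halted = not rolling_update(cluster_bad, "v2-broken",
                            is_healthy=lambda version: version != "v2-broken")
```

In the failing case only the first server runs the broken version, which is why the batch size controls how much capacity a bad release can take out.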
4. Interview Mastery
Q: "How do you handle Database migrations during a zero-downtime deployment?"
Architect Answer: "Database migrations must be **additive and non-destructive**. You never rename or drop a column in a single step. First you add the new column (DB migration), then deploy code that writes to BOTH columns, then backfill the new column for existing rows, then deploy code that reads from the NEW column, and finally (once no running version touches the old column, often weeks later) you stop writing to the old column and drop it. This is called the **Expand and Contract** pattern. It avoids the 'Missing Column' error when two versions of your app are running simultaneously during a deployment rollout."
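The phases in that answer can be walked through against an in-memory SQLite database, including a backfill of rows written before the new column existed. This is a minimal sketch under assumed names: the `users` table, the rename from `name` to `full_name`, and the helper functions are all invented for illustration. In production each phase would be a separate migration or deployment, not consecutive lines in one script.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('Ada')")  # written by old code

# Phase 1 (expand): add the new column. Old code is unaffected by it.
conn.execute("ALTER TABLE users ADD COLUMN full_name TEXT")


# Phase 2: new code writes to BOTH columns, so old readers still work.
def create_user_dual_write(conn, value):
    conn.execute(
        "INSERT INTO users (name, full_name) VALUES (?, ?)", (value, value)
    )


create_user_dual_write(conn, "Grace")

# Phase 3: backfill rows that were written before dual writes began.
conn.execute("UPDATE users SET full_name = name WHERE full_name IS NULL")


# Phase 4: new code reads only from the NEW column.
def read_names(conn):
    rows = conn.execute("SELECT full_name FROM users ORDER BY id")
    return [row[0] for row in rows]


names_before_contract = read_names(conn)  # ['Ada', 'Grace']

# Phase 5 (contract): once nothing reads or writes `name`, drop it.
# ALTER TABLE ... DROP COLUMN needs SQLite 3.35 or newer, hence the guard.
if sqlite3.sqlite_version_info >= (3, 35, 0):
    conn.execute("ALTER TABLE users DROP COLUMN name")

names_after_contract = read_names(conn)
```

At every phase boundary, both the previous and the next version of the application can run against the same schema, which is what makes the rollout safe.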