Logs & Metrics: Setting up ELK and Prometheus in the cloud
Cloud Observability
You can't manage what you can't measure. Prometheus for metrics and the ELK stack (Elasticsearch, Logstash, Kibana) for logs are two of the most widely used tools for seeing inside your cluster.
1. Prometheus & Grafana
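Prometheus pulls ("scrapes") metrics from your apps over HTTP at a configurable interval, typically every few seconds to a minute, and stores them as time-series data. **Grafana** then queries Prometheus and renders dashboards that show the health of your CPU, RAM, and latency. Prometheus's query language, PromQL, can also extrapolate a trend with its `predict_linear` function, so you can estimate when a disk will run out of space.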
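To make the disk-space prediction concrete, here is a minimal sketch of the underlying idea: fit a straight line through recent samples and extrapolate it. It is only an illustration in plain Python, not how Prometheus itself is implemented; the function names (`fit_line`, `predict_value`, `seconds_until_empty`) and the sample numbers are made up for this example.

```python
from typing import Optional, Sequence, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, free space in MB)


def fit_line(samples: Sequence[Sample]) -> Tuple[float, float]:
    """Return the least-squares (slope, intercept) for the samples."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("timestamps must not all be equal")
    cov = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    slope = cov / var_t
    return slope, mean_v - slope * mean_t


def predict_value(samples: Sequence[Sample], seconds_ahead: float) -> float:
    """Extrapolate the fitted line seconds_ahead past the last sample."""
    slope, intercept = fit_line(samples)
    return slope * (samples[-1][0] + seconds_ahead) + intercept


def seconds_until_empty(samples: Sequence[Sample]) -> Optional[float]:
    """Seconds after the last sample until the line hits zero, or None if not shrinking."""
    slope, intercept = fit_line(samples)
    if slope >= 0:
        return None
    return -intercept / slope - samples[-1][0]


# Hypothetical samples: free disk space drops 100 MB per minute.
free_mb = [(0.0, 1000.0), (60.0, 900.0), (120.0, 800.0), (180.0, 700.0)]
print(round(predict_value(free_mb, 120)))   # estimated free MB two minutes from now
print(round(seconds_until_empty(free_mb)))  # estimated seconds until the disk is full
```

In a real setup you would not write this yourself: an alert rule would compare the result of a `predict_linear` query against zero, and Grafana would chart the same query.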
2. Centralized Logging (ELK)
In a cluster of 100 servers, you can't SSH into each one to read logs. With ELK, Logstash collects and parses logs from every server, Elasticsearch stores and indexes them centrally, and Kibana lets you search, filter, and set up alerts. If one user says "My checkout failed," you can find their exact error in seconds across all 100 servers.
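The "find one user's error across all servers" workflow can be sketched in a few lines. This is a toy, in-memory stand-in for the central store, not the Elasticsearch or Logstash API; it assumes each app writes one JSON object per log line, and the field names (`ts`, `level`, `user_id`, `msg`) and host names are hypothetical.

```python
import json
from typing import Dict, Iterable, List

Event = Dict[str, object]


def centralize(streams: Dict[str, Iterable[str]]) -> List[Event]:
    """Parse JSON log lines from every host into one time-ordered list."""
    events: List[Event] = []
    for host, lines in streams.items():
        for line in lines:
            event = json.loads(line)
            event["host"] = host  # remember which server produced the line
            events.append(event)
    # ISO-8601 timestamps in the same timezone sort correctly as strings.
    events.sort(key=lambda e: str(e["ts"]))
    return events


def search(events: Iterable[Event], **filters: object) -> List[Event]:
    """Return the events whose fields equal every given filter value."""
    return [e for e in events if all(e.get(k) == v for k, v in filters.items())]


# Hypothetical log lines from two of the 100 servers.
streams = {
    "web-17": [
        '{"ts": "2024-05-01T10:00:02Z", "level": "INFO", "user_id": "u42", "msg": "cart loaded"}',
        '{"ts": "2024-05-01T10:00:09Z", "level": "ERROR", "user_id": "u42", "msg": "payment gateway timeout"}',
    ],
    "web-63": [
        '{"ts": "2024-05-01T10:00:05Z", "level": "ERROR", "user_id": "u7", "msg": "invalid coupon"}',
    ],
}

central = centralize(streams)
for hit in search(central, user_id="u42", level="ERROR"):
    print(hit["host"], hit["ts"], hit["msg"])
```

Structured (JSON) log lines are what make this kind of field-level filtering cheap; with free-text logs the central store would have to parse each message before it could filter by user.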
4. Interview Mastery
Q: "What is 'White-box' vs 'Black-box' monitoring?"
Architect Answer: "**White-box** comes from INSIDE the app (e.g., Prometheus metrics, logs). It shows you WHY it is slow (e.g., 'DB query taking 2s'). **Black-box** comes from OUTSIDE (e.g., an external ping or HTTP health check). It tells you IF the app is up and responding, but not why it fails. A professional architect uses both: Black-box to alert on downtime, and White-box to diagnose the cause."
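The difference is easy to see in code. The sketch below is an illustration under simple assumptions, using only the Python standard library: `timed` records durations from inside the app (white-box), while `black_box_probe` only checks from outside whether a URL answers with a 2xx status. The step names and sleep times are invented for the example.

```python
import http.client
import time
import urllib.request
from collections import defaultdict
from contextlib import contextmanager
from typing import DefaultDict, Iterator, List, Optional

# White-box: the app measures its own internal steps.
durations: DefaultDict[str, List[float]] = defaultdict(list)


@contextmanager
def timed(step: str) -> Iterator[None]:
    """Record how long the wrapped block took under the given step name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        durations[step].append(time.perf_counter() - start)


def slowest_step() -> Optional[str]:
    """Name of the step with the largest recorded duration, or None if empty."""
    if not durations:
        return None
    return max(durations, key=lambda step: max(durations[step]))


# Black-box: an outside observer only sees whether the app responds.
def black_box_probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error status, or malformed URL.
        return False


# Simulated request handling with invented timings.
with timed("db_query"):
    time.sleep(0.02)
with timed("render_page"):
    time.sleep(0.001)

print(slowest_step())  # the white-box view points at the slow step
```

Calling `black_box_probe` against your service's public URL answers only "up or down"; the `durations` data is what tells you which step to fix.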