In a previous blog post, we discussed how to set up container and host metrics monitoring using cAdvisor, Node Exporter, and Prometheus. Now, let’s take it a step further by implementing an alerting system with Prometheus Alertmanager. The goal is to receive timely notifications about the system’s health and performance.
Understanding Alertmanager
What is Alertmanager?
Alertmanager is a component of the Prometheus ecosystem responsible for handling alerts generated by Prometheus servers. It takes care of grouping, deduplicating, and routing alerts to different receivers, making the alerting process more manageable and configurable.
Key Features of Alertmanager:
- Alert Grouping: Alertmanager groups similar alerts, preventing a flood of notifications for related issues. This ensures that you receive a concise and meaningful overview of the problem.
- Deduplication: Duplicate alerts are filtered out to avoid redundancy, providing a clear and focused notification stream.
- Silence Notifications: Alertmanager allows you to silence specific alerts for a defined period. This is useful during maintenance periods or when you're aware of a temporary issue (see the `amtool` sketch after this list).
- Rich Integrations: It supports various integrations, allowing you to send alerts to popular communication platforms like Slack, email, PagerDuty, and more.
- Templating: Alertmanager supports templates, enabling customization of alert notifications according to your preferences.
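As a quick, hedged sketch of silencing in practice: `amtool` is the CLI that ships with Alertmanager, and (assuming the Alertmanager instance we set up later in this post is reachable at localhost:9093) you could silence the HostOutOfMemory rule defined below for a maintenance window like this:

```sh
# Silence the HostOutOfMemory alert for 2 hours; the comment and author
# are stored with the silence so teammates can see why it exists.
amtool silence add alertname="HostOutOfMemory" \
  --duration="2h" \
  --author="ops" \
  --comment="planned maintenance" \
  --alertmanager.url=http://localhost:9093
```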
Configuring Alertmanager
To integrate Alertmanager into your Prometheus setup, you need to create an `alertmanager.yml` file. This configuration file specifies how Alertmanager should handle alerts and where to send them.
Here's a sample `alertmanager.yml` file:
```yaml
global:
  resolve_timeout: 1m
  slack_api_url: 'add slack webhook url'

route:
  receiver: 'slack-notifications'

receivers:
  - name: 'slack-notifications'
    slack_configs:
      - channel: '#testing-webhook'
        send_resolved: true
        title: |-
          [{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}] {{ .CommonLabels.alertname }} for {{ .CommonLabels.job }}
          {{- if gt (len .CommonLabels) (len .GroupLabels) -}}
            {{" "}}(
            {{- with .CommonLabels.Remove .GroupLabels.Names }}
              {{- range $index, $label := .SortedPairs -}}
                {{ if $index }}, {{ end }}
                {{- $label.Name }}="{{ $label.Value -}}"
              {{- end }}
            {{- end -}}
            )
          {{- end }}
        text: >-
          {{ range .Alerts -}}
          *Alert:* {{ .Annotations.summary }}{{ if .Labels.severity }} - `{{ .Labels.severity }}`{{ end }}

          *Description:* {{ .Annotations.description }}

          *Details:*
          {{ range .Labels.SortedPairs }} • *{{ .Name }}:* `{{ .Value }}`
          {{ end }}
          {{ end }}
```
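The route block above is deliberately minimal. Alertmanager's route also accepts grouping and timing options; the following is an illustrative sketch (the option names are standard Alertmanager route settings, but the values are arbitrary choices, not part of the original setup):

```yaml
route:
  receiver: 'slack-notifications'
  group_by: ['alertname', 'instance']  # batch alerts sharing these labels into one notification
  group_wait: 30s                      # wait this long before sending a group's first notification
  group_interval: 5m                   # minimum delay before notifying about new alerts in a group
  repeat_interval: 4h                  # resend notifications for still-firing alerts at this interval
```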
This configuration directs Alertmanager to send alerts to a Slack channel named `#testing-webhook`. Adjust the configuration based on your preferred channels and integrations.
You can learn more about creating a Slack incoming webhook using this link.
Prometheus Configuration
Update your `prometheus.yml` file to include the Alertmanager configuration and link the alert rules file:
```yaml
global:
  scrape_interval: 15s
  scrape_timeout: 10s
  evaluation_interval: 15s

rule_files:
  - 'alert.rules'

alerting:
  alertmanagers:
    - scheme: http
      static_configs:
        - targets:
            - "alertmanager:9093"

scrape_configs:
  - job_name: cadvisor
    honor_timestamps: true
    scheme: http
    static_configs:
      - targets:
          - cadvisor:8080
    relabel_configs:
      - source_labels: [__address__]
        regex: '.*'
        target_label: instance
        replacement: 'cadv'
  - job_name: nodeexporter
    honor_timestamps: true
    scheme: http
    static_configs:
      - targets:
          - nodeexporter:9100
    relabel_configs:
      - source_labels: [__address__]
        regex: '.*'
        target_label: instance
        replacement: 'node'
```
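Before reloading Prometheus, it's worth validating the file. A quick check, assuming `promtool` (bundled with Prometheus) is on your PATH and you run it from the directory containing `prometheus.yml`:

```sh
# Validates prometheus.yml, including the rule_files it references
promtool check config prometheus.yml
```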
Alert Rules
Include the alert rules in a file named `alert.rules`:
```yaml
groups:
  - name: example
    rules:
      - alert: HostOutOfMemory
        expr: (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes * 100 < 20) * on(instance) group_left (nodename) node_uname_info{nodename=~".+"}
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: Host out of memory (instance {{ $labels.instance }})
          description: "Node memory is filling up (< 20% left)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
      - alert: ContainerHighMemoryUsage
        expr: (sum(container_memory_working_set_bytes{name!=""}) BY (instance, name) / sum(container_spec_memory_limit_bytes > 0) BY (instance, name) * 100) > 80
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: Container High Memory usage (instance {{ $labels.instance }})
          description: "Container Memory usage is above 80%\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
```
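The rule file can be linted on its own as well, again assuming `promtool` is available:

```sh
# Parses alert.rules and reports syntax or template errors
promtool check rules alert.rules
```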
For more alerting ideas and configurations, check out the Awesome Prometheus Alerts repository.
Bringing It All Together
Integrate the Alertmanager service into your Docker Compose setup by adding the following to your `docker-compose.yml` file. This builds on the same docker compose file from the previous blog post.
```yaml
services:
  alertmanager:
    image: prom/alertmanager:v0.26.0
    ports:
      - 9093:9093
    volumes:
      - ./config/alertmanager.yml:/config/alertmanager.yml
      - alertmanager-data:/data
    restart: unless-stopped
    command:
      - '--config.file=/config/alertmanager.yml'
      - '--storage.path=/data'  # point Alertmanager's state at the mounted volume

volumes:
  alertmanager-data:  # named volume referenced above; declare it if your compose file doesn't already
```
With these configurations in place, your Prometheus setup can now notify you whenever the conditions in your alert rules fire.
Testing the Setup
Run your Docker Compose setup and monitor the system. Alerts configured in the `alert.rules` file should trigger notifications in your specified Slack channel through Alertmanager.
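For example, assuming Docker Compose v2 and the file layout described above:

```sh
# Start the whole monitoring stack in the background
docker compose up -d

# Tail Alertmanager's logs to confirm the configuration loaded cleanly
docker compose logs -f alertmanager
```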
You can access the Alertmanager web UI at http://localhost:9093.
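If you'd rather not wait for a real rule to fire, you can push a synthetic alert directly to Alertmanager's v2 API and watch it arrive in Slack. This is only a test sketch; the alert name and labels below are made up:

```sh
# POST a fake firing alert to Alertmanager's v2 API
curl -XPOST http://localhost:9093/api/v2/alerts \
  -H 'Content-Type: application/json' \
  -d '[{
        "labels": {
          "alertname": "TestAlert",
          "severity": "warning",
          "instance": "manual-test"
        },
        "annotations": {
          "summary": "Manual test alert",
          "description": "Fired by hand to verify the Slack integration"
        }
      }]'
```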
Congratulations! You’ve successfully implemented Alertmanager for Prometheus alerting. For more advanced configurations and options, refer to the official Alertmanager documentation.
If you have any questions or encounter issues, feel free to reach out. Happy monitoring!