In a previous blog post, we discussed how to set up container and host metrics monitoring using cAdvisor, Node Exporter, and Prometheus. Now, let’s take it a step further by implementing an alerting system with Prometheus Alertmanager. The goal is to receive timely notifications about the system’s health and performance.
Understanding Alertmanager
What is Alertmanager?
Alertmanager is a component of the Prometheus ecosystem responsible for handling alerts generated by Prometheus servers. It takes care of grouping, deduplicating, and routing alerts to different receivers, making the alerting process more manageable and configurable.
Key Features of Alertmanager:
- Alert Grouping: Alertmanager groups similar alerts, preventing a flood of notifications for related issues. This ensures that you receive a concise and meaningful overview of the problem.
- Deduplication: Duplicate alerts are filtered out to avoid redundancy, providing a clear and focused notification stream.
- Silence Notifications: Alertmanager allows you to silence specific alerts for a defined period. This is useful during maintenance periods or when you're aware of a temporary issue (see the `amtool` sketch after this list).
- Rich Integrations: It supports various integrations, allowing you to send alerts to popular communication platforms like Slack, email, PagerDuty, and more.
- Templating: Alertmanager supports templates, enabling customization of alert notifications according to your preferences.
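As a quick, hedged sketch of silencing in practice: `amtool` is the CLI that ships with Alertmanager, and (assuming the Alertmanager instance we set up later in this post is reachable at localhost:9093) you could silence the HostOutOfMemory rule defined below for a maintenance window like this:

```sh
# Silence the HostOutOfMemory alert for 2 hours; the comment and author
# are stored with the silence so teammates can see why it exists.
amtool silence add alertname="HostOutOfMemory" \
  --duration="2h" \
  --author="ops" \
  --comment="planned maintenance" \
  --alertmanager.url=http://localhost:9093
```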
Configuring Alertmanager
To integrate Alertmanager into your Prometheus setup, you need to create an `alertmanager.yml` file. This configuration file specifies how Alertmanager should handle alerts and where to send them.
Here's a sample `alertmanager.yml` file:
```yaml
global:
  resolve_timeout: 1m
  slack_api_url: 'add slack webhook url'

route:
  receiver: 'slack-notifications'

receivers:
  - name: 'slack-notifications'
    slack_configs:
      - channel: '#testing-webhook'
        send_resolved: true
        title: |-
          [{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}] {{ .CommonLabels.alertname }} for {{ .CommonLabels.job }}
          {{- if gt (len .CommonLabels) (len .GroupLabels) -}}
            {{" "}}(
            {{- with .CommonLabels.Remove .GroupLabels.Names }}
              {{- range $index, $label := .SortedPairs -}}
                {{ if $index }}, {{ end }}
                {{- $label.Name }}="{{ $label.Value -}}"
              {{- end }}
            {{- end -}}
            )
          {{- end }}
        text: >-
          {{ range .Alerts -}}
          *Alert:* {{ .Annotations.summary }}{{ if .Labels.severity }} - `{{ .Labels.severity }}`{{ end }}

          *Description:* {{ .Annotations.description }}

          *Details:*
          {{ range .Labels.SortedPairs }} • *{{ .Name }}:* `{{ .Value }}`
          {{ end }}
          {{ end }}
```
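The route block above is deliberately minimal. Alertmanager's route also accepts grouping and timing options; the following is an illustrative sketch (the option names are standard Alertmanager route settings, but the values are arbitrary choices, not part of the original setup):

```yaml
route:
  receiver: 'slack-notifications'
  group_by: ['alertname', 'instance']  # batch alerts sharing these labels into one notification
  group_wait: 30s                      # wait this long before sending a group's first notification
  group_interval: 5m                   # minimum delay before notifying about new alerts in a group
  repeat_interval: 4h                  # resend notifications for still-firing alerts at this interval
```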
This configuration directs Alertmanager to send alerts to a Slack channel named `#testing-webhook`. Adjust the configuration based on your preferred channels and integrations.
You can learn more about creating a Slack incoming webhook using this link.
Prometheus Configuration
Update your `prometheus.yml` file to include the Alertmanager configuration and link the alert rules file:
```yaml
global:
  scrape_interval: 15s
  scrape_timeout: 10s
  evaluation_interval: 15s

rule_files:
  - 'alert.rules'

alerting:
  alertmanagers:
    - scheme: http
      static_configs:
        - targets:
            - "alertmanager:9093"

scrape_configs:
  - job_name: cadvisor
    honor_timestamps: true
    scheme: http
    static_configs:
      - targets:
          - cadvisor:8080
    relabel_configs:
      - source_labels: [__address__]
        regex: '.*'
        target_label: instance
        replacement: 'cadv'
  - job_name: nodeexporter
    honor_timestamps: true
    scheme: http
    static_configs:
      - targets:
          - nodeexporter:9100
    relabel_configs:
      - source_labels: [__address__]
        regex: '.*'
        target_label: instance
        replacement: 'node'
```
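Before reloading Prometheus, it's worth validating the file. A quick check, assuming `promtool` (bundled with Prometheus) is on your PATH and you run it from the directory containing `prometheus.yml`:

```sh
# Validates prometheus.yml, including the rule_files it references
promtool check config prometheus.yml
```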
Alert Rules
Include the alert rules in a file named `alert.rules`:
```yaml
groups:
  - name: example
    rules:
      - alert: HostOutOfMemory
        expr: (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes * 100 < 20) * on(instance) group_left (nodename) node_uname_info{nodename=~".+"}
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: Host out of memory (instance {{ $labels.instance }})
          description: "Node memory is filling up (< 20% left)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
      - alert: ContainerHighMemoryUsage
        expr: (sum(container_memory_working_set_bytes{name!=""}) BY (instance, name) / sum(container_spec_memory_limit_bytes > 0) BY (instance, name) * 100) > 80
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: Container High Memory usage (instance {{ $labels.instance }})
          description: "Container Memory usage is above 80%\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
```
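The rule file can be linted on its own as well, again assuming `promtool` is available:

```sh
# Parses alert.rules and reports syntax or template errors
promtool check rules alert.rules
```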
For more alerting ideas and configurations, check out the Awesome Prometheus Alerts repository.
Bringing It All Together
Integrate the Alertmanager service into your Docker Compose setup by adding the following to your `docker-compose.yml` file. This builds on the same docker compose file from the previous blog post.
```yaml
services:
  alertmanager:
    image: prom/alertmanager:v0.26.0
    ports:
      - 9093:9093
    volumes:
      - ./config/alertmanager.yml:/config/alertmanager.yml
      - alertmanager-data:/data
    restart: unless-stopped
    command:
      - '--config.file=/config/alertmanager.yml'
      - '--storage.path=/data'  # point Alertmanager's state at the mounted volume

volumes:
  alertmanager-data:  # named volume referenced above; declare it if your compose file doesn't already
```
With these configurations in place, your Prometheus setup can now notify you whenever the conditions in your alert rules fire.
Testing the Setup
Run your Docker Compose setup and monitor the system. Alerts configured in the `alert.rules` file should trigger notifications in your specified Slack channel through Alertmanager.
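For example, assuming Docker Compose v2 and the file layout described above:

```sh
# Start the whole monitoring stack in the background
docker compose up -d

# Tail Alertmanager's logs to confirm the configuration loaded cleanly
docker compose logs -f alertmanager
```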
You can access the Alertmanager web UI at http://localhost:9093.
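If you'd rather not wait for a real rule to fire, you can push a synthetic alert directly to Alertmanager's v2 API and watch it arrive in Slack. This is only a test sketch; the alert name and labels below are made up:

```sh
# POST a fake firing alert to Alertmanager's v2 API
curl -XPOST http://localhost:9093/api/v2/alerts \
  -H 'Content-Type: application/json' \
  -d '[{
        "labels": {
          "alertname": "TestAlert",
          "severity": "warning",
          "instance": "manual-test"
        },
        "annotations": {
          "summary": "Manual test alert",
          "description": "Fired by hand to verify the Slack integration"
        }
      }]'
```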
Congratulations! You’ve successfully implemented Alertmanager for Prometheus alerting. For more advanced configurations and options, refer to the official Alertmanager documentation.
If you have any questions or encounter issues, feel free to reach out. Happy monitoring!