The Guide To Prometheus Alerting

Alerting is an essential aspect of infrastructure and application monitoring. With alerts, you can quickly know when something is amiss and respond immediately -- often even before the performance issue impacts your user experience in a production environment. In this article, we’ll take a look at how to configure Prometheus alerts for a Kubernetes environment. 

Prometheus Alerting

About Alertmanager

Prometheus alerting is powered by Alertmanager. Prometheus forwards its alerts to Alertmanager for handling any silencing, inhibition, aggregation, or sending of notifications across your platforms or event management systems of choice. Alertmanager also takes care of deduplicating and grouping, which we’ll go over in the following sections. 

You can run Alertmanager either inside or outside of your cluster. For high availability, you can also run multiple Alertmanagers, which is strongly recommended for production systems. 

Alerting Concepts 

There are three important concepts to familiarize yourself with when using Alertmanager to configure alerts: 

  • Grouping: You can group alerts into categories  (e.g., node alerts, pod alerts)
  • Inhibition: You can dedupe alerts when similar alerts are firing to avoid spam 
  • Silencing: You can mute alerts based on labels or regular expressions

Rule File Components

A rule file uses the following basic template:

groups:
  - name:
  rules:
  - alert:
  expr:
  for:
  labels:
  annotations:
  • Groups: A collection of rules that are run sequentially at a regular interval
  • Name:  Name of group
  • Rules: The rules in the group
  • Alert: A valid label name
  • Expr: The condition required for the alert to trigger
  • For: The minimum duration for an alert’s expression to be true (active) before updating to a firing status  
  • Labels: Any additional labels attached to the alert 
  • Annotations: A way to communicate actionable or contextual information such as a description or runbook link

Recording Intervals

Recording intervals is a setting that defines how long a threshold must be met before an alert enters a firing state. If the recording interval is too low, you might get notified for small blips in metric changes (known as false positives or noise); if the recording interval is too long, then you may not be able to solve the performance issue in time to minimize damage.

Let’s look at an example comparing two alerting rules, where the recording interval can be found as a value associated to the “for” input: 

groups: 
  - name: ElasticSearch Alerts 
    rules: 
    - alert: ElasticSearch Status RED 
      annotations: 
        summary: "ElasticSearch Status is RED (cluster {{ $labels.cluster }})" 
        description: "ES Health is Critical\nCluster Name: {{$externalLabels.cluster}}"  
      expr: | 
         elasticsearch_cluster_health_status{color="red"}==1 
      for: 3m 
      labels: 
        team: devops 

    - alert: ElasticSearch Status YELLOW 
      annotations: 
        summary: "ElasticSearch Status is YELLOW (cluster {{ $labels.cluster }})" 
        description: "ES Health is Critical\nCluster Name: {{$externalLabels.cluster}}"  
      expr: | 
         elasticsearch_cluster_health_status{color="yellow"}==1 
      for: 10m 
      labels: 
        team: slack 
  • Elasticsearch Status RED:  3 minutes
  • Elasticsearch Status YELLOW:  10 minutes 

In this example, you can see that a more severe alert has a lower threshold requirement than a warning alert. Later in this article, we show the screenshot of such an alert notification displayed in a Slack channel.

Alert States

There are three main alert states to know: 

  • Pending:  The time elapsed since threshold breach is less than the recording interval
  • Firing: The time elapsed since threshold breach is more than the recording interval and alertmanager is delivering notifications
  • Resolved: The alert is no longer firing because the threshold is no longer exceeded; you can optionally enable resolution notifications using `[ send_resolved: <boolean> | default = true ]` in your config file

11 Kubernetes Alerts to Know 

The following sections are a single configuration file broken down into individual alerts that we recommend setting up. This is not an exhaustive list, and you can always add or remove alerts to best suit your needs. 

1. Deployment at 0 Replicas 

This alert triggers when your deployment has no replica pods running. 

groups: 
  - name: Deployment groups: 
    rules: 
    - alert: Deployment at 0 Replicas 
      annotations: 
        summary: Deployment {{$labels.deployment}} is currently having no pods running 
        description: "\nCluster Name: {{$externalLabels.cluster}}\nNamespace: 
 {{$labels.namespace}}\nDeployment name: {{$labels.deployment}}\n" 
      expr: | 
        sum(kube_deployment_status_replicas{pod_template_hash=""}) by (deployment,namespace)  < 1 
      for: 1m 
      labels: 
        team: devops 
    rules: 
    - alert: Deployment at 0 Replicas 
      annotations: 
        summary: Deployment {{$labels.deployment}} is currently having no pods running 
        description: "\nCluster Name: {{$externalLabels.cluster}}\nNamespace: 
 {{$labels.namespace}}\nDeployment name: {{$labels.deployment}}\n"  
      expr: | 
        sum(kube_deployment_status_replicas{pod_template_hash=""}) by (deployment,namespace)  < 1 
      for: 1m 
      labels: 
        team: devops
2. HPA Scaling Limited

This alert triggers when the horizontal pod autoscaler is unable to scale your deployment.  

- alert: HPA Scaling Limited 
        annotations: 
        summary: HPA named {{$labels.hpa}} in {{$labels.namespace}} namespace has reached scaling limited state 
        description: "\nCluster Name: {{$externalLabels.cluster}}\nNamespace: 
{{$labels.namespace}}\nHPA name: {{$labels.hpa}}\n"  
      expr: | 
        (sum(kube_hpa_status_condition{condition="ScalingLimited",status="true"}) by  
(hpa,namespace)) == 1 
      for: 1m 
      labels:  
        team: devops
3. HPA at Max Capacity 

This alert triggers when a horizontal pod autoscaler is running at max capacity.

 - alert: HPA at MaxCapacity 
      annotations: 
        summary: HPA named {{$labels.hpa}} in {{$labels.namespace}} namespace is running at Max Capacity 
        description: "\nCluster Name: {{$externalLabels.cluster}}\nNamespace: 
{{$labels.namespace}}\nHPA name: {{$labels.hpa}}\n" 
      expr: | 
        ((sum(kube_hpa_spec_max_replicas) by (hpa,namespace)) - 
(sum(kube_hpa_status_current_replicas) by (hpa,namespace))) == 0 
      for: 1m 
      labels: 
        team: devops
4. Container Restarted 

This alert triggers when a container inside a pod restarted due to either an Error, an OOMKill, or a process completion.

 - name: Pods 
    rules: 
    - alert: Container restarted 
      annotations: 
        summary: Container named {{$labels.container}} in {{$labels.pod}} in {{$labels.namespace}} was restarted 
        description: "\nCluster Name: {{$externalLabels.cluster}}\nNamespace: 
{{$labels.namespace}}\nPod name: {{$labels.pod}}\nContainer name: {{$labels.container}}\n" 
      expr: | 
        sum(increase(kube_pod_container_status_restarts_total{namespace!="kube-system",pod_template_hash=""}[1m])) by (pod,namespace,container) > 0 
      for: 0m 
      labels: 
        team: slack
5. Too Many Container Restarts 

This alert triggers when a pod has gotten into a crashloop state.

   - alert: Too many Container restarts 
      annotations: 
        summary: Container named {{$labels.container}} in {{$labels.pod}} in {{$labels.namespace}} was restarted 
        description: "\nCluster Name: {{$externalLabels.cluster}}\nNamespace: 
{{$labels.namespace}}\nPod name: {{$labels.pod}}\nContainer name: {{$labels.container}}\n" 
       expr: | 
 
        sum(increase(kube_pod_container_status_restarts_total{namespace!="kube-system",pod_template_hash=""}[15m])) by (pod,namespace,container) > 5 
      for: 0m 
      labels: 
        team: dev
6. High Memory Usage of Container 

This alert triggers when a pod’s current memory usage is close to the memory Limit assigned to it.

   - alert: High Memory Usage of Container 
      annotations:  
        summary: Container named {{$labels.container}} in {{$labels.pod}} in {{$labels.namespace}} is using more than 80% of Memory Limit  
        description: "\nCluster Name: {{$externalLabels.cluster}}\nNamespace:  
{{$labels.namespace}}\nPod name: {{$labels.pod}}\nContainer name: {{$labels.container}}\n"       expr: |  
        ((( sum(container_memory_working_set_bytes{image!="",container!="POD", namespace!="kube-system"}) by (namespace,container,pod)  /  
sum(container_spec_memory_limit_bytes{image!="",container!="POD",namespace!="kube-system"}) by (namespace,container,pod) ) * 100 ) < +Inf ) > 80  
      for: 5m  
      labels:  
        team: dev
7. High CPU Usage of Container 

This alert triggers when a pod’s current CPU usage is close to the memory limit assigned to it.

   - alert: High CPU Usage of Container 
      annotations: 
        summary: Container named {{$labels.container}} in {{$labels.pod}} in {{$labels.namespace}} is using more than 80% of CPU Limit 
        description: "\nCluster Name: {{$externalLabels.cluster}}\nNamespace: 
{{$labels.namespace}}\nPod name: {{$labels.pod}}\nContainer name: {{$labels.container}}\n"       expr: | 
        ((sum(irate(container_cpu_usage_seconds_total{image!="",container!="POD", namespace!="kube-system"}[30s])) by (namespace,container,pod) / 
sum(container_spec_cpu_quota{image!="",container!="POD", namespace!="kube-system"} / 
container_spec_cpu_period{image!="",container!="POD", namespace!="kube-system"}) by (namespace,container,pod) ) * 100)  > 80 
      for: 5m 
      labels:  
        team: dev
8. High Persistent Volume Usage

This alert triggers when the persistent volume attached to a pod is nearly filled. 

 - alert: High Persistent Volume Usage 
      annotations: 
        summary: Persistent Volume named {{$labels.persistentvolumeclaim}} in {{$labels.namespace}} is using more than 60% used. 
        description: "\nCluster Name: {{$externalLabels.cluster}}\nNamespace: 
  {{$labels.namespace}}\nPVC name: {{$labels.persistentvolumeclaim}}\n" 
      expr: | 
        ((((sum(kubelet_volume_stats_used_bytes{})  by 
 (namespace,persistentvolumeclaim))  / 
(sum(kubelet_volume_stats_capacity_bytes{}) by 
 (namespace,persistentvolumeclaim)))*100) < +Inf ) > 60 
      for: 5m 
      labels: 
        team: devops
9. High Node Volume Usage

This alert triggers when a Kubernetes node is reporting high memory usage.

- name: Nodes 
    rules: 
    - alert: High Node Memory Usage 
      annotations: 
        summary: Node {{$labels.kubernetes_io_hostname}} has more than 80% memory used. Plan Capacity 
        description: "\nCluster Name: {{$externalLabels.cluster}}\nNode: 
{{$labels.kubernetes_io_hostname}}\n" 
      expr: | 
        (sum (container_memory_working_set_bytes{id="/",container!="POD"}) by  
(kubernetes_io_hostname) / sum (machine_memory_bytes{}) by (kubernetes_io_hostname) * 100) > 80 
      for: 5m 
      labels: 
        team: devops
10. High Node CPU Usage

This alert triggers when a Kubernetes node is reporting high CPU usage. 

- alert: High Node CPU Usage 
      annotations: 
        summary: Node {{$labels.kubernetes_io_hostname}} has more than 80% allocatable cpu used. Plan Capacity. 
        description: "\nCluster Name: {{$externalLabels.cluster}}\nNode: 
{{$labels.kubernetes_io_hostname}}\n" 
      expr: |
        (sum(rate(container_cpu_usage_seconds_total{id="/",
 container!="POD"}[1m])) by (kubernetes_io_hostname) / sum(machine_cpu_cores) by (kubernetes_io_hostname)  * 100) > 80 
      for: 5m 
      labels: 
        team: devops
11. High Node Disk Usage

This alert triggers when a Kubernetes node is reporting high disk usage.

  - alert: High Node Disk Usage 
      annotations: 
        summary: Node {{$labels.kubernetes_io_hostname}} has more than 85% disk used. Plan Capacity. 
        description: "\nCluster Name: {{$externalLabels.cluster}}\nNode: 
 {{$labels.kubernetes_io_hostname}}\n" 
      expr: | 

        (sum(container_fs_usage_bytes{device=~"^/dev/[sv]d[a-z][1-9]$",id="/",container!="POD"}) by (kubernetes_io_hostname) / 
 sum(container_fs_limit_bytes{container!="POD",device=~"^/dev/[sv]d[a-z][1-9]$",id="/"}) by (kubernetes_io_hostname)) * 100 > 85 
      for: 5m 
      labels: 
        team: devops

How to Configure Alertmanager Endpoints

Prometheus can be made aware of Alertmanager by adding Alertmanager endpoints to the Prometheus configuration file. If you have multiple instances of Alertmanager running for high availability, make sure all Alertmanager endpoints are provided. Note: Do not load balance traffic between Prometheus and multiple Alertmanager endpoints. The Alertmanager implementation expects all alerts to be sent to all Alertmanagers to ensure high availability.

A Prometheus configuration for Alertmanager looks something like this:

alerting: 

   # We want our alerts to be deduplicated 
   # from different replicas. 
   alert_relabel_configs: 
   - regex: prometheus_replica 
     action: labeldrop 

   alertmanagers: 
     - scheme: http 
       path_prefix: / 
       static_configs: 
         - targets: ['alertmanager:9093']

Notification Channels

Alertmanager integrates with a ton of notification providers such as Slack, VictorOps, and Pagerduty. It also has a webhook delivery system which you can integrate with any event management system. The complete list of supported notification channels can be found here

A typical Alertmanager configuration file may look something like this:

In this configuration file, we have defined two receivers named “devops” and “slack.”  Alerts are matched to each receiver based on the team label in the alert metric. In this particular case, the devops receiver delivers alerts to a Slack channel and Opsgenie on-call personnel. The slack receiver only delivers the alert to a Slack channel

global: 
   resolve_timeout: 5m 
   slack_api_url: "https://hooks.slack.com/services/" 

 templates: 
 - '/etc/alertmanager-templates/*.tmpl' 
 route: 
   group_by: ['alertname', 'cluster', 'service'] 
   group_wait: 10s 
   group_interval: 1m 
   repeat_interval: 5m   
   receiver: default  
   routes: 
   - match: 
       team: devops 
     receiver: devops 
     continue: true  
   - match: 
       team: slack 
     receiver: slack 
     continue: true 
 
 receivers:
 - name: 'default' 

 - name: 'devops' 
   opsgenie_configs: 
   - api_key:  
     priority: P1 
   slack_configs: 
   - channel: '#infra-alerts' 
     send_resolved: true 
     icon_url: https://avatars3.githubusercontent.com/u/3380462 
     text: " \nSummary: {{ .CommonAnnotations.summary }}\nDescription: {{  
.CommonAnnotations.description }} " 

 - name: 'slack' 
   slack_configs: 
   - channel: '#infra-alerts' 
     send_resolved: true 
     icon_url: https://avatars3.githubusercontent.com/u/3380462 
      text: " \nSummary: {{ .CommonAnnotations.summary }}\nDescription: {{ 
.CommonAnnotations.description }} " 

.

A typical Prometheus Alert Slack Notification

Best Practices

Use Labels for Severity and Paging

Alert notification channels are matched using labels. This ensures that you still get notified for each alert (for example, in a Slack channel as a public log) but that only the most important alerts prompt direct paging. 

Tune Alert Thresholds

Keeping alert thresholds too low or setting low recording intervals can lead to false positives. These settings should be fine-tuned based on your performance and usage patterns; expect an iterative process to discover what values work best for your infrastructure and application stack. 

Mute Alerts During Debugging

Avoid the noise (and panic) alerts can cause while you are debugging by temporarily muting your alerts. 

Assign Alerts Directly to Teams

Map your alerts to areas of responsibility definable by roles and teams. This makes ownership more clear and enables your teams to promptly handle any issues that arise. Alert mapping also allows your teams to trust when an alert is relevant to them, reducing the risk of an important alert going ignored. 

For more best practices, check out the official Alertmanager documentation

Limitations

1. Event Correlation 

Alertmanager provides one view of your environment and needs to be combined with other monitoring tools separately deployed to watch your full application stack. This can be accomplished using a centralized event correlation engine. For example, you may have a microservice hosted in a Kubernetes container that stops responding. The root cause may reside in your application code, a third-party API, public cloud services, or a database hosted in a private cloud with its own dedicated network and storage systems. A centralized event correlation engine would process events from all of your data sources and help isolate the root cause based on the sequence of event occurrences and the system dependencies.

2. Machine Learning

Alertmanager is a set of robust alerting rules and recording rules; however, it does not support machine learning for learning trends, accounting for seasonality, or detecting anomalies.  With hundreds of thousands of ephemeral monitoring endpoints, the use of machine learning (also known as AIOps) should be a required ingredient of any modern monitoring strategy.

3. Alert Management Workflow 

Alertmanager provides no way for a user to interact with an alert to acknowledge, prioritize, claim ownership, troubleshoot, and finally resolve the issue. Advanced incident management tools provide the ability to integrate Alertmanager via its API and offer an operational workflow management functionality designed for teamwork.

4. Automation

Alertmanager is most powerful when combined with an automation tool that would accept Prometheus events as input and use them to trigger run-books that would automate the action-taking required to resolve the underlying cause of the problem. For example, all of the simple and repetitive tasks (such as rebooting a system or increasing computing resources) raised from recurring problems (such as a server running out of memory or a hung process) would be taken care of without human interaction so that the operations team can focus on troubleshooting the more complex issues.  

Closing words

Prometheus alerting is a powerful tool that is free and cloud-native. Alertmanager makes it easy to organize and define your alerts; however, it is important to integrate it with other tools used to monitor your application stack by feeding its events into specialized tools that offer event correlation, machine learning, and automation functionality. 

Try OpsRamp for free