Introduction
Prometheus is a widely used, cloud-native, open source monitoring solution. Its alerting component, Alertmanager, can send alerts to email, Slack and other destinations. Both Prometheus and Alertmanager fit well into the infrastructure as code model. Twilio SMS is an SMS sending service with consumption-based pricing and an extensive API. With some effort you can integrate Twilio SMS with Prometheus, as described in this article.
Our particular challenge was that we did not notice emails or instant messages quickly enough: we work in pomodoros and keep a tight focus on development work. As a result we can't react to every beep, 99% of which are not urgent. In addition, for most of the day we keep our mobiles silent and disconnected from the Internet, except for SMS messages, (important) phone calls and the breaks between the 25-minute pomodoros.
This is where Twilio SMS and Prometheus Alertmanager integration comes in: it allows us to get critical alerts as SMS without having to pay attention to the constant flood of low priority email and IM messages.
Promtotwilio: the glue between Twilio SMS and Prometheus
We were lucky in that someone (Gael Gillard from Belgium, to be exact) had already written a simple Go web application called promtotwilio that forwards Alertmanager alerts to Twilio SMS. Promtotwilio acts as an Alertmanager webhook receiver: it parses the alert, converts it to a simplified SMS format and sends the SMS messages via the Twilio SMS API.
Promtotwilio had a few limitations that we needed to fix (custom listen port, multiple receivers), but otherwise it was a solid piece of work. Promtotwilio takes all its parameters via environment variables. We saved these into an environment variable file for systemd (/etc/promtotwilio.conf) with appropriately locked down permissions:
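To illustrate the "parse and simplify" step, here is a minimal Python sketch of turning an Alertmanager webhook payload into short SMS texts. It is not promtotwilio's actual code or message format: the function name alerts_to_sms, the 160-character cut-off and the "shop" application name are assumptions made for this example. Only the payload fields used (alerts, status, labels, annotations) come from Alertmanager's webhook format.

```python
MAX_SMS_CHARS = 160  # assumption: keep each text within a single SMS segment


def alerts_to_sms(payload: dict) -> list[str]:
    """Return one short text per firing alert in an Alertmanager webhook payload."""
    messages = []
    for alert in payload.get("alerts", []):
        if alert.get("status") != "firing":
            continue  # this sketch ignores resolved alerts
        labels = alert.get("labels", {})
        annotations = alert.get("annotations", {})
        name = labels.get("alertname", "unknown alert")
        severity = labels.get("severity", "none")
        summary = annotations.get("summary", "")
        text = f"[{severity}] {name}: {summary}".strip()
        messages.append(text[:MAX_SMS_CHARS])
    return messages


# A trimmed-down example of what Alertmanager POSTs to a webhook receiver.
example_payload = {
    "version": "4",
    "status": "firing",
    "alerts": [
        {
            "status": "firing",
            "labels": {"alertname": "RobotTestFailure", "severity": "critical"},
            "annotations": {"summary": 'Robot - "shop" has failures'},
        }
    ],
}

for message in alerts_to_sms(example_payload):
    print(message)
```

Keeping only the alert name, severity and summary is a deliberate choice here: SMS messages are short and billed per message, so the longer description annotation is left out.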
SID="8a31e06d74cd8bb08e7fc7b71ad918832f"
TOKEN="6f87636daece98c2f9a953d5bda3df2f"
SENDER="+12345678901"
RECEIVER="+358409898981,+358406060601"
PORT=9191
You can obtain the SID and TOKEN from Twilio. The SENDER must match your "Twilio phone number", which is basically a virtual phone number the SMS messages appear to come from.
NOTE 1: If your Twilio SMS account is in trial mode you need to separately add and verify each phone number listed in RECEIVER.
NOTE 2: Customizing the PORT and using multiple numbers in RECEIVER requires patches to promtotwilio (PRs here and here). We use our own podman-builder to create patched promtotwilio builds.
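To show how the four credentials and numbers fit together, the following Python sketch builds one Twilio Messages API request per receiver without sending anything. It is an illustration only and does not reproduce how the patched promtotwilio handles multiple receivers: the function name build_twilio_requests and the placeholder credentials are assumptions. The endpoint, the To/From/Body form fields and HTTP basic authentication with SID and TOKEN follow Twilio's public REST API.

```python
import base64
import urllib.parse
import urllib.request


def build_twilio_requests(sid, token, sender, receivers, body):
    """Build one Twilio Messages API request per comma-separated receiver."""
    url = f"https://api.twilio.com/2010-04-01/Accounts/{sid}/Messages.json"
    credentials = base64.b64encode(f"{sid}:{token}".encode()).decode()
    requests = []
    for receiver in (part.strip() for part in receivers.split(",")):
        if not receiver:
            continue  # tolerate stray commas in the RECEIVER value
        form = urllib.parse.urlencode({"To": receiver, "From": sender, "Body": body})
        request = urllib.request.Request(url, data=form.encode(), method="POST")
        request.add_header("Authorization", f"Basic {credentials}")
        requests.append(request)
    return requests


# Placeholder credentials; the phone numbers mirror the example environment file.
prepared = build_twilio_requests(
    sid="SID_PLACEHOLDER",
    token="TOKEN_PLACEHOLDER",
    sender="+12345678901",
    receivers="+358409898981,+358406060601",
    body="[critical] RobotTestFailure",
)
print(len(prepared), "requests prepared")
```

Passing each prepared request to urllib.request.urlopen() would actually send the message, and therefore cost money, which is why the sketch stops at building the requests.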
As we don't run promtotwilio as a container we created a simple systemd service unit ("/etc/systemd/system/promtotwilio.service") for it:
[Unit]
Description=Send Prometheus alerts as SMS via Twilio SMS API
After=network.target
[Service]
Type=simple
User=root
EnvironmentFile=/etc/promtotwilio.conf
ExecStart=/usr/local/bin/promtotwilio
[Install]
WantedBy=multi-user.target
To enable and start the unit we did:
systemctl daemon-reload
systemctl enable promtotwilio
systemctl start promtotwilio
If your environment file does not have any errors then promtotwilio should now be running:
root@prometheus:~# systemctl status promtotwilio
● promtotwilio.service - Send Prometheus alerts as SMS via Twilio SMS API
     Loaded: loaded (/etc/systemd/system/promtotwilio.service; enabled; vendor preset: enabled)
     Active: active (running) since Thu 2023-02-16 10:11:37 UTC; 2h 49min ago
   Main PID: 2227451 (promtotwilio)
      Tasks: 6 (limit: 4556)
     Memory: 3.7M
     CGroup: /system.slice/promtotwilio.service
             └─2227451 /usr/local/bin/promtotwilio

Feb 16 10:11:37 prometheus.puppeteers.in systemd[1]: Started Send Prometheus alerts as SMS via Twilio SMS API.
Configuring Alertmanager routing
Using Twilio SMS and Prometheus together can quickly become costly if you have lots of alert noise (one SMS costs about $0.08). Therefore you should send only critical alerts as SMS. You can accomplish this by setting appropriate alert labels in /etc/alertmanager/alert.rules:
- name: robot.rules
  rules:
  - alert: RobotTestFailure
    expr: robot_failed_total > 0
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: Robot - "{{ $labels.test_app }}" has failures
      description: 'Test "{{ $labels.test_name }}", ID: "{{ $labels.test_id }}"'
Here we set the label "severity" to the value "critical" for this particular Robot Framework-based test.
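To get a feel for why alert noise matters, here is a rough back-of-the-envelope estimate in Python. It assumes a price of about $0.08 per SMS, two receivers, and that Alertmanager re-sends a notification for a still-firing alert group every three hours (the repeat_interval used in this article's routing configuration). The function names are made up for this sketch, and real Twilio pricing varies by destination and message length.

```python
from datetime import timedelta

PRICE_PER_SMS_USD = 0.08  # approximate figure; check your own Twilio pricing


def notifications_per_day(repeat_interval: timedelta) -> int:
    """How many times a continuously firing alert group is re-notified in 24 hours."""
    return timedelta(days=1) // repeat_interval


def daily_sms_cost(firing_groups: int, receivers: int, repeat_interval: timedelta) -> float:
    """Rough daily cost in USD if every notification goes to every receiver."""
    sms_count = firing_groups * receivers * notifications_per_day(repeat_interval)
    return sms_count * PRICE_PER_SMS_USD


# One alert group firing all day, two receivers, re-notified every 3 hours.
print(f"${daily_sms_cost(1, 2, timedelta(hours=3)):.2f} per day")
```

A single stuck alert is cheap, but the cost scales linearly with the number of firing alert groups, which is the reason to restrict SMS to critical alerts.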
Once you have labeled your alert rules in Prometheus you can route alerts based on those labels:
route:
  group_by:
  - alertname
  - cluster
  - service
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 3h
  receiver: email
  routes:
  - match_re:
      severity: ".*"
    receiver: email
    continue: true
  - match:
      severity: critical
    receiver: sms
    continue: true
receivers:
- name: email
  email_configs:
  - to: [email protected]
    from: [email protected]
    smarthost: mail.example.org:25
    auth_username: [email protected]
    auth_identity: [email protected]
    auth_password: supersecret
- name: sms
  webhook_configs:
  - url: http://127.0.0.1:9191/send
The receivers section defines the potential alerting targets. In this case we have two, email and sms. The webhook URL for sms should point to a running promtotwilio instance.
The routes section defines the rules by which those receivers are selected. With the above config, alerts of any severity are routed to email. The continue: true setting ensures that Alertmanager does not stop after the first matching route and instead goes on to evaluate the next one. Then, if the severity is critical, it additionally sends an SMS.
To make these changes stick you need to restart the prometheus and alertmanager services:
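The effect of continue: true can be illustrated with a small Python model of the two child routes above. This is a simplified sketch, not Alertmanager's implementation: it ignores grouping, nested routes and timing, the function name select_receivers is made up, and Python's regular expressions stand in for the Go ones Alertmanager uses. It assumes, as Alertmanager does, that a missing label is treated as an empty string and that regular expressions must match the whole value.

```python
import re

# The two child routes from the configuration above, in order.
ROUTES = [
    {"match_re": {"severity": ".*"}, "receiver": "email", "continue": True},
    {"match": {"severity": "critical"}, "receiver": "sms", "continue": True},
]
DEFAULT_RECEIVER = "email"  # the top-level route's receiver


def select_receivers(labels, routes=ROUTES, default=DEFAULT_RECEIVER):
    """Return the receivers an alert with the given labels would be sent to."""
    selected = []
    for route in routes:
        exact_ok = all(
            labels.get(name, "") == value
            for name, value in route.get("match", {}).items()
        )
        regex_ok = all(
            re.fullmatch(pattern, labels.get(name, "")) is not None
            for name, pattern in route.get("match_re", {}).items()
        )
        if exact_ok and regex_ok:
            selected.append(route["receiver"])
            if not route.get("continue", False):
                break  # without continue, the first matching route wins
    # If no child route matched, the alert falls back to the parent's receiver.
    return selected or [default]


print(select_receivers({"severity": "critical"}))
print(select_receivers({"severity": "warning"}))
```

Setting "continue" to False on the first route in this model makes critical alerts stop at email and never reach sms, which is the situation the continue: true setting avoids.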
systemctl restart prometheus alertmanager
If all goes well you should be able to receive alerts as SMS now.
Resources for Twilio SMS and Prometheus integration
- Glue between Twilio SMS and Prometheus: promtotwilio
- Tool for building promtotwilio: podman-builder
- Puppet module for managing promtotwilio: puppet-promtotwilio
- Testing AlertManager webhooks with curl