Tools to monitor your validator
Special thanks to @p1xel32 for important parts of this guide
Hello friends! I think it is worth giving a general overview of the software you will need to keep your validator running at all times.
Here I will show how monitoring is set up using 3 utilities:
1st Part - Prometheus
2nd Part - Grafana Cloud
3rd Part - Node exporter
4th Part - Dashboard setup
5th Part - Conclusion
Update dependencies
sudo apt update && sudo apt upgrade -y
sudo apt install nano
Setup Prometheus
1.1 Create a dedicated user and group for Prometheus on your server
groupadd --system prometheus
useradd -s /sbin/nologin --system -g prometheus prometheus
1.2 Download Prometheus (v2.51.2 at the time of writing; newer releases are listed at https://github.com/prometheus/prometheus/releases)
wget https://github.com/prometheus/prometheus/releases/download/v2.51.2/prometheus-2.51.2.linux-amd64.tar.gz
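Before extracting, it can be worth checking the archive against the checksum published with the release. This is a minimal sketch, assuming sha256sum is available and that a sha256sums.txt file from the same release has been downloaded next to the tarball. The helper name verify_sha256 is hypothetical.

```shell
# Hypothetical helper: succeed only if FILE's SHA-256 digest equals EXPECTED.
verify_sha256() {
  file="$1"
  expected="$2"
  actual="$(sha256sum "$file" 2>/dev/null | awk '{print $1}')"
  # Reject an empty expected digest so a failed lookup never counts as a match.
  [ -n "$expected" ] && [ "$actual" = "$expected" ]
}

# Example usage (assumes sha256sums.txt comes from the same release page):
# expected="$(awk '$2 == "prometheus-2.51.2.linux-amd64.tar.gz" {print $1}' sha256sums.txt)"
# verify_sha256 prometheus-2.51.2.linux-amd64.tar.gz "$expected" && echo "checksum OK"
```

If the check fails, download the archive again rather than continuing with the install.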
1.3 Extract Prometheus
tar -xvf prometheus*.tar.gz
1.4 Change the directory to the extracted directory
cd prometheus-2.51.2.linux-amd64
1.5 Create some required directories
mkdir /etc/prometheus
mkdir /var/lib/prometheus
1.6 Move the required files
mv prometheus.yml /etc/prometheus/prometheus.yml
mv consoles/ console_libraries/ /etc/prometheus/
mv prometheus promtool /usr/local/bin/
1.7 Create a systemd service file
sudo tee /etc/systemd/system/prometheus.service > /dev/null <<'EOF'
[Unit]
Description=Prometheus
Documentation=https://prometheus.io/docs/introduction/overview/
Wants=network-online.target
After=network-online.target
[Service]
Type=simple
User=prometheus
Group=prometheus
ExecReload=/bin/kill -HUP $MAINPID
ExecStart=/usr/local/bin/prometheus \
--config.file=/etc/prometheus/prometheus.yml \
--storage.tsdb.path=/var/lib/prometheus \
--web.console.templates=/etc/prometheus/consoles \
--web.console.libraries=/etc/prometheus/console_libraries \
--web.listen-address=0.0.0.0:9090 \
--web.external-url=
SyslogIdentifier=prometheus
Restart=always
[Install]
WantedBy=multi-user.target
EOF
1.8 Set proper ownership and permissions on the Prometheus directories
sudo chown -R prometheus:prometheus /etc/prometheus/
sudo chmod -R 775 /etc/prometheus/
sudo chown -R prometheus:prometheus /var/lib/prometheus/
Setup Grafana Cloud
2 Create an account and an API key on the free Grafana Cloud service
https://grafana.com/auth/sign-up/
2.1 Head over to your Grafana Cloud Portal and select Send Metrics on Prometheus. If you scroll up, you should see the API Key section.
Click Generate now and create an API key with the role MetricsPublisher. Copy the Prometheus config and save it locally. The url and username are unique to each user. The password in both snippets should be your API key.
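Since the API key is a secret, one option is to keep it in a separate file that only its owner can read, instead of writing it inline in the config. Prometheus basic_auth accepts password_file in place of password. The sketch below is only an illustration: the helper name write_secret_file and the path /etc/prometheus/grafana_api_key are assumptions, not part of the original setup.

```shell
# Hypothetical helper: write SECRET to TARGET with owner-only permissions.
write_secret_file() {
  target="$1"
  secret="$2"
  # The subshell keeps the restrictive umask from leaking into the caller.
  ( umask 077 && printf '%s' "$secret" > "$target" ) && chmod 600 "$target"
}

# Example usage (run as root; the path is an assumption):
# write_secret_file /etc/prometheus/grafana_api_key "<your API key>"
# chown prometheus:prometheus /etc/prometheus/grafana_api_key
# Then, under basic_auth in prometheus.yml, replace the password line with:
#   password_file: /etc/prometheus/grafana_api_key
```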


2.2 Edit the Prometheus config and set your own url, username and password


nano /etc/prometheus/prometheus.yml
Replace these 5 values with your own: origin_prometheus, url, username, password, and the job_name and targets of your exporter
# Sample config for Prometheus.
global:
  scrape_interval: 60s # Scrape targets every 60 seconds. The default is every 1 minute.
  evaluation_interval: 60s # Evaluate rules every 60 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'example'
    origin_prometheus: <AnyName>
remote_write:
  - url: https://prometheus-prod-12-prod-us-central-4.grafana.net/api/prom/push
    basic_auth:
      username: 77777
      password: AOHSDJASHDKASDUhkasjdhauKSADHausdhaskj
# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['localhost:9093']
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  #- "first_rules.yml"
  #- "second_rules.yml"
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label 'job=<job_name>' to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # Scrape targets from this job every 60 seconds.
    scrape_interval: 60s
    scrape_timeout: 60s
    #metrics_path defaults to '/metrics'
    #scheme defaults to 'http'.
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'exporter'
    # If prometheus-node-exporter is installed, grab stats about the local
    # machine by default.
    static_configs:
      - targets: ['localhost:9100']
  - job_name: <AnyName>
    static_configs:
      - targets: ['localhost:9101']
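A typo or wrong indentation in this file will stop Prometheus from starting, so it may help to validate the file first with promtool, which was moved to /usr/local/bin in step 1.6. A small sketch follows; the wrapper name check_prometheus_config and its return codes are hypothetical.

```shell
# Hypothetical wrapper around `promtool check config`.
# Returns 2 if the file is missing, 3 if promtool is not installed,
# otherwise the exit status of promtool itself.
check_prometheus_config() {
  config="${1:-/etc/prometheus/prometheus.yml}"
  if [ ! -f "$config" ]; then
    echo "config not found: $config" >&2
    return 2
  fi
  if ! command -v promtool >/dev/null 2>&1; then
    echo "promtool not found in PATH" >&2
    return 3
  fi
  promtool check config "$config"
}

# Example usage:
# check_prometheus_config && echo "config looks valid"
```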
2.3 Run prometheus
systemctl daemon-reload
systemctl start prometheus
systemctl enable prometheus
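To confirm that Prometheus actually came up, its readiness endpoint can be polled. This is a sketch that assumes curl is installed and that Prometheus listens on port 9090 as set in the service file. The helper name wait_for_url and its default values are hypothetical.

```shell
# Hypothetical helper: poll URL until it returns an HTTP success status.
# TRIES is the number of attempts, DELAY the pause in seconds between them.
wait_for_url() {
  url="$1"
  tries="${2:-10}"
  delay="${3:-2}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    if curl -fsS -o /dev/null "$url" 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example usage:
# wait_for_url http://localhost:9090/-/ready && echo "Prometheus is ready"
```

If the endpoint never answers, systemctl status prometheus is the first place to look.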
Setup Node Exporter
3.1 Download node_exporter
wget https://github.com/prometheus/node_exporter/releases/download/v1.8.0/node_exporter-1.8.0.linux-amd64.tar.gz
3.2 Extract Node Exporter
tar -xvzf node_exporter-1.8.0.linux-amd64.tar.gz
3.3 Move the extracted directory to /etc/prometheus/
mv node_exporter-1.8.0.linux-amd64 /etc/prometheus/node_exporter
3.4 Set proper ownership
sudo chown -R prometheus:prometheus /etc/prometheus/node_exporter
3.5 Create a systemd service file
sudo tee /etc/systemd/system/node_exporter.service > /dev/null <<EOF
[Unit]
Description=Node Exporter
Wants=network-online.target
After=network-online.target
[Service]
User=prometheus
ExecStart=/etc/prometheus/node_exporter/node_exporter
[Install]
WantedBy=default.target
EOF
3.6 Run Node exporter
systemctl daemon-reload
systemctl start node_exporter
systemctl enable node_exporter
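Node exporter serves its metrics as plain text on port 9100, so a quick check is to fetch them and read one value. The sketch below assumes curl and awk are available; the helper name metric_value is hypothetical, and it only handles metrics without labels, such as node_load1.

```shell
# Hypothetical helper: read Prometheus text-format metrics on stdin and print
# the value of the unlabeled metric NAME. Exits non-zero if it is not found.
metric_value() {
  awk -v name="$1" '$1 == name { print $2; found = 1; exit } END { exit !found }'
}

# Example usage:
# curl -s http://localhost:9100/metrics | metric_value node_load1
```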
Dashboard setup

Now go to grafana.net → dashboard → import dashboard and import the dashboard you want. For example, you can import a node exporter dashboard with detailed server information, such as ID 11074.
In that dashboard you can also add any statistic about your node that Prometheus has collected.
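To see what data is available for dashboard panels, the local Prometheus HTTP API can be queried directly. This is a sketch that assumes curl is installed and Prometheus listens on localhost:9090. The function name prom_query and the PROM_URL variable are hypothetical.

```shell
# Base URL of the local Prometheus server (an assumption; override if needed).
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Hypothetical helper: run an instant PromQL query and print the JSON response.
prom_query() {
  curl -fsS -G "$PROM_URL/api/v1/query" --data-urlencode "query=$1"
}

# Example usage: 'up' is 1 for every target that Prometheus scrapes successfully.
# prom_query 'up'
# prom_query 'node_load1'
```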
Useful commands:
Check status
systemctl status prometheus
systemctl status node_exporter
Stop and disable Prometheus and the exporter
systemctl stop prometheus && systemctl disable prometheus
systemctl stop node_exporter && systemctl disable node_exporter
That’s all you need to monitor your node. Please remember that alerts are an important part as well, since you need to react instantly to what is happening in the logs. I hope this guide helped you understand which tools you need to stay aware of your validator's health. Enjoy your day!