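Prometheus is an open-source monitoring system that originated at SoundCloud. As the number of services SoundCloud ran kept growing, traditional monitoring could no longer keep up, so in 2012 the company set out to build a new monitoring system, which became Prometheus.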
Prometheus Node Exporter
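In the Prometheus architecture, Prometheus Server does not monitor specific targets directly. Its main job is to collect and store data and to serve queries against it. To monitor something concrete, such as a host's CPU usage, we therefore need an exporter.
Download the binary package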
https://github.com/prometheus/node_exporter/releases
amd64
wget "https://github.com/prometheus/node_exporter/releases/download/v1.6.1/node_exporter-1.6.1.linux-amd64.tar.gz" -P /usr/local/node_exporter && cd /usr/local/node_exporter
arm64
wget "https://github.com/prometheus/node_exporter/releases/download/v1.6.1/node_exporter-1.6.1.linux-arm64.tar.gz" -P /usr/local/node_exporter && cd /usr/local/node_exporter
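The two download commands differ only in the architecture suffix. As a minimal sketch, the suffix could be derived from the output of `uname -m`; the function names node_exporter_arch and node_exporter_url are made up for this illustration and are not part of any tool.

```shell
# Hypothetical helper: map a `uname -m` value to the suffix used in release file names.
node_exporter_arch() {
  case "$1" in
    x86_64|amd64)  echo "amd64" ;;
    aarch64|arm64) echo "arm64" ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: build the release URL.
# $1 = version without the leading "v" (for example 1.6.1), $2 = architecture suffix
node_exporter_url() {
  echo "https://github.com/prometheus/node_exporter/releases/download/v$1/node_exporter-$1.linux-$2.tar.gz"
}

# Example:
#   arch="$(node_exporter_arch "$(uname -m)")"
#   wget "$(node_exporter_url 1.6.1 "$arch")" -P /usr/local/node_exporter
```

Note that the extraction command and the systemd unit in this guide use the amd64 directory name, so on arm64 those paths need the same substitution.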
Extract
tar -xzvf node_exporter-1.6.1.linux-amd64.tar.gz && cd /usr/local/node_exporter/node_exporter-1.6.1.linux-amd64
Create a bcrypt-hashed password for authentication
sudo apt install apache2-utils
The command below prompts for the password twice and then writes the bcrypt hash of that password to hashed_password.txt.
htpasswd -nBC 12 '' | tr -d ':\n' > hashed_password.txt
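Before pasting the hash into a config file, it can help to confirm that the file really contains a bcrypt hash and not an error message. The check below is only a sketch: is_bcrypt_hash is a hypothetical helper, and it tests the shape of the string (prefix, two-digit cost, 53 characters of salt and digest), not whether it matches your password.

```shell
# Hypothetical helper: succeed only if the argument looks like a bcrypt hash,
# i.e. a $2a$/$2b$/$2y$ prefix, a two-digit cost, and 53 salt+digest characters.
is_bcrypt_hash() {
  printf '%s\n' "$1" | grep -Eq '^\$2[aby]\$[0-9]{2}\$[./A-Za-z0-9]{53}$'
}

# Example:
#   is_bcrypt_hash "$(cat hashed_password.txt)" && echo "hash looks valid"
```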
Create the web.yml authentication file
touch web.yml
vim web.yml
Fill in the configuration in the format below. admin is the username; the colon is followed by a space and then the bcrypt hash of the password.
basic_auth_users:
  admin: $2y$12$BvvKU3H/nM9e9NSK3GFjaOZT3KGXWWTVbCCF05ZNcFsf9xt3PtLb.
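If you prefer not to edit the file by hand, the same two-line file can be generated from the shell. This is a sketch under the assumption that a single user is enough; write_web_config is a hypothetical helper name.

```shell
# Hypothetical helper: write a basic-auth web config for one user.
# $1 = username, $2 = bcrypt hash, $3 = output file
write_web_config() {
  printf 'basic_auth_users:\n  %s: %s\n' "$1" "$2" > "$3"
}

# Example:
#   write_web_config admin "$(cat hashed_password.txt)" web.yml
```

Passing the hash as a printf argument, not inside the format string, keeps the `$` characters in the hash from being interpreted.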
Create a systemd service
touch /usr/lib/systemd/system/node_exporter.service
vim /usr/lib/systemd/system/node_exporter.service
Write the following configuration
[Unit]
Description=node_exporter
After=network.target
[Service]
Type=simple
User=root
ExecStart=/usr/local/node_exporter/node_exporter-1.6.1.linux-amd64/node_exporter --web.config.file=/usr/local/node_exporter/node_exporter-1.6.1.linux-amd64/web.yml
Restart=on-failure
[Install]
WantedBy=multi-user.target
Start
systemctl daemon-reload
systemctl enable node_exporter.service
systemctl start node_exporter.service
systemctl status node_exporter.service
Open the address below. If the page loads and asks for credentials, the exporter started successfully.
http://ip:9100
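The same check can be done from the command line: a request without credentials should be rejected with 401, and one with credentials should return 200. The helpers below are a sketch with hypothetical names; ip and password are placeholders for your own values.

```shell
# Hypothetical helper: print only the HTTP status code; all arguments go to curl.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' "$@"
}

# Hypothetical helper: $1 = status without credentials, $2 = status with credentials.
auth_is_enforced() {
  [ "$1" = "401" ] && [ "$2" = "200" ]
}

# Example (replace ip and password with your own values):
#   anon="$(http_status http://ip:9100/metrics)"
#   authed="$(http_status -u admin:password http://ip:9100/metrics)"
#   auth_is_enforced "$anon" "$authed" && echo "node_exporter is up and protected"
```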
Prometheus
Download the binary package
wget "https://github.com/prometheus/prometheus/releases/download/v2.47.0/prometheus-2.47.0.linux-amd64.tar.gz" -P /usr/local/prometheuscd && cd /usr/local/prometheuscd
Extract
tar -xzvf prometheus-2.47.0.linux-amd64.tar.gz && cd /usr/local/prometheuscd/prometheus-2.47.0.linux-amd64
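Edit prometheus.yml
Append the following job at the end of the file, under scrape_configs. Keep the indentation aligned, otherwise Prometheus reports an error. password is the plain-text (unhashed) password, and admin is the username, which you should change to your own. Replace localhost with the target's IP; if Node Exporter runs on the same machine, localhost is fine.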
vim /usr/local/prometheuscd/prometheus-2.47.0.linux-amd64/prometheus.yml
  - job_name: 'nodes'
    basic_auth:
      username: admin
      password: password
    static_configs:
      - targets: ['localhost:9100']
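Because indentation mistakes are the most likely failure here, it is worth validating the file before starting the service. The Prometheus tarball ships a promtool binary with a `check config` subcommand; the wrapper below is only a sketch, and check_prometheus_config is a hypothetical name.

```shell
# Hypothetical wrapper: $1 = path to promtool, $2 = path to prometheus.yml
check_prometheus_config() {
  if [ ! -x "$1" ]; then
    echo "promtool not found at $1" >&2
    return 2
  fi
  "$1" check config "$2"
}

# Example:
#   dir=/usr/local/prometheuscd/prometheus-2.47.0.linux-amd64
#   check_prometheus_config "$dir/promtool" "$dir/prometheus.yml"
```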
Set up authentication for Prometheus
You can reuse the hashed password created for Node Exporter, or generate a new one.
touch web.yml
vim web.yml
Fill in the configuration in the format below. admin is the username; the colon is followed by a space and then the bcrypt hash of the password.
basic_auth_users:
  admin: $2y$12$BvvKU3H/nM9e9NSK3GFjaOZT3KGXWWTVbCCF05ZNcFsf9xt3PtLb.
Create a systemd service
touch /usr/lib/systemd/system/prometheus.service
vim /usr/lib/systemd/system/prometheus.service
[Unit]
Description=Prometheus
Documentation=https://prometheus.io/
After=network.target
[Service]
Type=simple
User=root
ExecStart=/usr/local/prometheuscd/prometheus-2.47.0.linux-amd64/prometheus --config.file=/usr/local/prometheuscd/prometheus-2.47.0.linux-amd64/prometheus.yml --web.enable-lifecycle --storage.tsdb.path=/usr/local/prometheuscd/prometheus-2.47.0.linux-amd64/data --storage.tsdb.retention.time=60d --web.config.file=/usr/local/prometheuscd/prometheus-2.47.0.linux-amd64/web.yml
Restart=on-failure
[Install]
WantedBy=multi-user.target
systemctl daemon-reload
systemctl enable prometheus.service
systemctl start prometheus.service
systemctl status prometheus.service
Access
If the page loads and asks for credentials, Prometheus started successfully.
http://ip:9090
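The unit file above starts Prometheus with --web.enable-lifecycle, which exposes an HTTP endpoint for reloading the configuration without restarting the service. A sketch of how that could be used after later edits to prometheus.yml follows; reload_prometheus and the DRY_RUN switch are hypothetical, and admin:password stands in for your own credentials.

```shell
# Hypothetical helper: ask Prometheus to reload its configuration.
# $1 = base URL, $2 = user:password
# Set DRY_RUN=1 to print the curl command instead of running it.
reload_prometheus() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "curl -fsS -X POST -u $2 $1/-/reload"
  else
    curl -fsS -X POST -u "$2" "$1/-/reload"
  fi
}

# Example:
#   reload_prometheus http://localhost:9090 admin:password
```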
Grafana data visualization
Install
Debian or Ubuntu
sudo apt-get install -y apt-transport-https software-properties-common wget
sudo mkdir -p /etc/apt/keyrings/
wget -q -O - https://apt.grafana.com/gpg.key | gpg --dearmor | sudo tee /etc/apt/keyrings/grafana.gpg > /dev/null
echo "deb [signed-by=/etc/apt/keyrings/grafana.gpg] https://apt.grafana.com stable main" | sudo tee -a /etc/apt/sources.list.d/grafana.list
sudo apt-get update
sudo apt-get install grafana
systemctl enable grafana-server
systemctl start grafana-server
systemctl status grafana-server
Access
http://ip:3000
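Log in to the dashboard and configure Prometheus
Add the Prometheus data source
The default username and password are both admin. In the left-hand menu, go to Home -- Connections -- Data sources -- Prometheus and enter the server address as IP:port (for a local deployment, localhost:9090 works). Under Auth, enable Basic auth and enter the Prometheus username and password, then scroll to the bottom and click Save & Test.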
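As an alternative to clicking through the UI, Grafana can also load data sources from provisioning files at startup. The sketch below writes such a file; write_grafana_datasource is a hypothetical helper, and the data source name, the http://localhost:9090 URL and the credentials are assumptions to adapt to your setup.

```shell
# Hypothetical helper: write a Grafana data source provisioning file.
# $1 = Prometheus URL, $2 = basic-auth user, $3 = basic-auth password, $4 = output file
write_grafana_datasource() {
  cat > "$4" <<EOF
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: $1
    basicAuth: true
    basicAuthUser: $2
    secureJsonData:
      basicAuthPassword: "$3"
EOF
}

# Example (restart grafana-server afterwards):
#   write_grafana_datasource http://localhost:9090 admin password \
#     /etc/grafana/provisioning/datasources/prometheus.yaml
```

The password here is the plain-text one, as in the prometheus.yml scrape job, because Grafana has to send it to Prometheus.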
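Create a visualization dashboard
Click the + in the top-left corner and choose Import dashboard. Enter the dashboard ID and click Load (the dashboard linked below has ID 15172), or go to https://grafana.com/grafana/dashboards/15172-node-exporter-for-prometheus-dashboard-based-on-11074/ to download the JSON and import it.
Done. (Result screenshot.)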
Prometheus unlock monitor
This is a Prometheus exporter that checks streaming-media unlock status on a VPS.
Project page
Install
bash <(curl -Ls unlock.moe/monitor) -service
Usage
Usage of unlock-monitor:
  -listen string
        listen address (default ":9101")
  -interval int
        check interval (s) (default 60)
  -service
        setup systemd service
  -hk
        Hong Kong
  -jp
        Japan
  -mul
        Multination (default true)
  -na
        North America
  -sa
        South America
  -tw
        Taiwan
  -u    check update
  -v    show version
Prometheus job
Add the job:
  - job_name: checkmedia
    scrape_interval: 30s
    static_configs:
      - targets:
        - <your ip/domain>:9101
        - ...
Command reference
Run unlock-monitor with the listen port changed to 9102 and the check region set to hk
bash <(curl -Ls unlock.moe/monitor) -service -listen ":9102" -hk
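Grafana settings
Create a new dashboard from the top-right corner, add a visualization, and then configure it as shown in the screenshot.
Result screenshot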
Prometheus blackbox_exporter
Download
wget "https://github.com/prometheus/blackbox_exporter/releases/download/v0.24.0/blackbox_exporter-0.24.0.linux-amd64.tar.gz" -P /usr/local/blackbox_exporter && cd /usr/local/blackbox_exporter
Extract
tar -xzvf blackbox_exporter-0.24.0.linux-amd64.tar.gz && cd /usr/local/blackbox_exporter/blackbox_exporter-0.24.0.linux-amd64
Create a systemd service
touch /usr/lib/systemd/system/blackbox_exporter.service
vim /usr/lib/systemd/system/blackbox_exporter.service
[Unit]
Description=Blackbox Exporter Server
After=network-online.target
[Service]
User=root
Group=root
ExecStart=/usr/local/blackbox_exporter/blackbox_exporter-0.24.0.linux-amd64/blackbox_exporter --config.file=/usr/local/blackbox_exporter/blackbox_exporter-0.24.0.linux-amd64/blackbox.yml
Restart=on-abort
[Install]
WantedBy=multi-user.target
Start
systemctl daemon-reload
systemctl start blackbox_exporter
systemctl enable blackbox_exporter
Access
127.0.0.1:9115
Prometheus job
ICMP latency monitoring
  - job_name: 'ping_all'
    scrape_interval: 5s
    metrics_path: /probe
    params:
      module: [icmp]  # ping
    static_configs:
      - targets: ['120.232.148.1']
        labels:
          group: '广州移动'  # Guangzhou, China Mobile
      - targets: ['221.5.88.88']
        labels:
          group: '广州联通'  # Guangzhou, China Unicom
      - targets: ['61.140.140.1']
        labels:
          group: '广州电信'  # Guangzhou, China Telecom
      - targets: ['1.1.1.1']
        labels:
          group: 'cloudflare'
      - targets: ['8.8.8.8']
        labels:
          group: 'Google'
    relabel_configs:
      - source_labels: [__address__]
        regex: (.*)(:80)?
        target_label: __param_target
        replacement: ${1}
      - source_labels: [__param_target]
        regex: (.*)
        target_label: ping
        replacement: ${1}
      - source_labels: []
        regex: .*
        target_label: __address__
        replacement: 127.0.0.1:9115  # Blackbox exporter.
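The relabeling above moves each listed address into the target URL parameter and points the scrape at the Blackbox exporter. Prometheus therefore requests the exporter's /probe endpoint with target and module parameters. You can issue an equivalent request by hand to test a probe before wiring it into Prometheus; probe_url below is a hypothetical helper used only for illustration.

```shell
# Hypothetical helper: build a probe URL equivalent to what the job above requests.
# $1 = blackbox exporter address, $2 = probe target, $3 = module name
probe_url() {
  echo "http://$1/probe?target=$2&module=$3"
}

# Example:
#   curl -s "$(probe_url 127.0.0.1:9115 8.8.8.8 icmp)" | grep probe_success
```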
Grafana
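Click to create a new visualization panel
Add data labels
Adjust the expressions to match your actual parameters
- Select the data source
- ICMP latency expression
probe_icmp_duration_seconds{group="Google", ping="8.8.8.8", phase="rtt", job="ping_all", instance="127.0.0.1:9115"} > 0
- Packet loss expression
1-avg_over_time(probe_success{instance=~"127.0.0.1:9115", group="广州移动"}[$__interval])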
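The packet loss expression works because probe_success is 1 for a successful probe and 0 for a failed one, so its average over a window is the success ratio and 1 minus that average is the loss ratio. The arithmetic can be sketched outside Prometheus as follows; packet_loss is a hypothetical helper and the sample values are invented.

```shell
# Hypothetical helper: packet loss as 1 minus the mean of probe_success samples.
# Each argument is one sample, either 0 (probe failed) or 1 (probe succeeded).
packet_loss() {
  [ "$#" -gt 0 ] || return 1
  printf '%s\n' "$@" | awk '{ sum += $1; n++ } END { printf "%.2f\n", 1 - sum / n }'
}

# Example: three successes and one failure give a loss ratio of 0.25.
#   packet_loss 1 1 0 1
```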
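Change the panel title
Enable the legend
Change the Y-axis unit
Scroll down and select Time -- seconds (s)
Add a unit for the right-hand Y axis
Create one for each packet-loss group
Click Save
Result screenshot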