Installing and Configuring a Grafana High-Availability Cluster

Official documentation: https://grafana.com/docs/grafana/latest/administration/set-up-for-high-availability/

1. Add the Yum repository for the Grafana community (OSS) edition
vi /etc/yum.repos.d/grafana.repo

[grafana]
name=grafana
baseurl=https://packages.grafana.com/oss/rpm
repo_gpgcheck=1
enabled=1
gpgcheck=1
gpgkey=https://packages.grafana.com/gpg.key
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
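The repository file can also be written without an editor, which helps when the same step has to be repeated on every node. The following is a minimal sketch under that assumption; the function name `write_grafana_repo` and its path argument are illustrative and not part of the original procedure.

```shell
#!/bin/sh
# Write the Grafana OSS repo definition to the path given as $1.
# write_grafana_repo is a hypothetical helper; the file content matches the listing above.
write_grafana_repo() {
    cat > "$1" <<'EOF'
[grafana]
name=grafana
baseurl=https://packages.grafana.com/oss/rpm
repo_gpgcheck=1
enabled=1
gpgcheck=1
gpgkey=https://packages.grafana.com/gpg.key
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
EOF
}

# On a real node (as root):
# write_grafana_repo /etc/yum.repos.d/grafana.repo
```

Passing the path as an argument makes it possible to write to a scratch file first and inspect the result before touching /etc.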
2. Install Grafana
yum install grafana
3. Start Grafana
systemctl daemon-reload
systemctl start grafana-server
systemctl status grafana-server

# Enable start on boot
systemctl enable grafana-server
4. Create a MySQL database, shared by all nodes, to store the cluster configuration
mysql -uroot -p

# Create the grafana database
create database grafana;

# xxxxxxxx stands for the database password; grant access to the relevant IP addresses and hostnames
grant all on grafana.* to 'grafana'@'10.255.200.%' identified by 'xxxxxxxx';
grant all on grafana.* to 'grafana'@'localhost' identified by 'xxxxxxxx';
grant all on grafana.* to 'grafana'@'devops01' identified by 'xxxxxxxx';
grant all on grafana.* to 'grafana'@'devops02' identified by 'xxxxxxxx';
grant all on grafana.* to 'grafana'@'devops03' identified by 'xxxxxxxx';

# Apply the privilege changes
flush privileges;
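The five grant statements differ only in the client host, so they can be generated from a list. The sketch below shows one way to do that; `gen_grants` is a hypothetical helper, and it reproduces the `identified by` form used above, which MySQL 5.7 and earlier accept but MySQL 8.0 does not (there the user has to be created with a separate `create user` statement).

```shell
#!/bin/sh
# Print one GRANT statement per allowed client host, then a FLUSH.
# gen_grants is a hypothetical helper: gen_grants PASSWORD HOST...
gen_grants() {
    password=$1
    shift
    for host in "$@"; do
        printf "grant all on grafana.* to 'grafana'@'%s' identified by '%s';\n" "$host" "$password"
    done
    printf 'flush privileges;\n'
}

# Example: review the output first, then pipe it into the client.
# gen_grants 'xxxxxxxx' '10.255.200.%' localhost devops01 devops02 devops03
# gen_grants 'xxxxxxxx' '10.255.200.%' localhost devops01 devops02 devops03 | mysql -uroot -p
```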
5. Edit the Grafana configuration to use the MySQL database
vi /etc/grafana/grafana.ini

#################################### Database ####################################
[database]
# You can configure the database connection by specifying type, host, name, user and password
# as separate properties or as one string using the url property.

# Either "mysql", "postgres" or "sqlite3", it's your choice
type = mysql
host = 10.255.200.1:3306
name = grafana
user = grafana
# If the password contains # or ; you have to wrap it with triple quotes. Ex """#password;"""
password = xxxxxxxx
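In grafana.ini a leading `;` or `#` comments a line out, so it is easy to edit a value and leave it inactive. A small check such as the following can confirm which values are actually in effect; `ini_get` is a hypothetical helper written for this sketch, not a Grafana tool.

```shell
#!/bin/sh
# Print the active (uncommented) value of KEY in SECTION of an INI file.
# ini_get is a hypothetical helper: ini_get FILE SECTION KEY
ini_get() {
    awk -v section="[$2]" -v key="$3" '
        /^\[/ { in_section = ($0 == section); next }   # track the current section
        in_section && /^[ \t]*[#;]/ { next }            # skip commented-out lines
        in_section {
            split($0, kv, "=")
            k = kv[1]; gsub(/^[ \t]+|[ \t]+$/, "", k)
            if (k == key) {
                v = substr($0, index($0, "=") + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", v)
                print v
            }
        }
    ' "$1"
}

# Example:
# ini_get /etc/grafana/grafana.ini database type
# ini_get /etc/grafana/grafana.ini database host
```

With the settings shown above, the two example calls would be expected to print `mysql` and `10.255.200.1:3306`.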
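6. Change Grafana's port to 3001 so that tengine can listen on the default port 3000, and disable usage reporting, which can produce errors when the server cannot reach the Internet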
vi /etc/grafana/grafana.ini

#################################### Server ####################################
[server]
# Protocol (http, https, h2, socket)
;protocol = http

# The ip address to bind to, empty will bind to all interfaces
;http_addr =

# The http port  to use
http_port = 3001

#################################### Analytics ####################################
[analytics]
# Server reporting, sends usage counters to stats.grafana.org every 24 hours.
# No ip addresses are being tracked, only simple counters to track
# running instances, dashboard and error counts. It is very helpful to us.
# Change this option to false to disable reporting.
reporting_enabled = false
7. Restart the Grafana service
systemctl restart grafana-server
systemctl status grafana-server
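Besides `systemctl status`, Grafana's `/api/health` endpoint reports whether the instance can reach its database, which is the part that matters after switching to MySQL. The sketch below wraps that check; `grafana_healthy` is a hypothetical helper and assumes `curl` is installed.

```shell
#!/bin/sh
# Succeed if a Grafana instance answers on /api/health with database "ok".
# grafana_healthy is a hypothetical helper: grafana_healthy HOST PORT
grafana_healthy() {
    curl -s --max-time 5 "http://$1:$2/api/health" | grep -q '"database": *"ok"'
}

# Example, run on the node itself after the restart:
# grafana_healthy 127.0.0.1 3001 && echo "grafana ok" || echo "grafana FAILED"
```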
8. Configure tengine
vi /sas/tengine/conf/conf.d/grafana.conf

upstream grafana {
    server 10.255.200.1:3001;
    server 10.255.200.2:3001;
    server 10.255.200.3:3001;

    session_sticky;
}

server {
    listen 3000 ssl backlog=32768;
    server_name grafana.hbrtv.org;
    ssl_certificate   /sas/tengine/sslkey/devops.crt;
    ssl_certificate_key  /sas/tengine/sslkey/devops.key;
    ssl_session_timeout 5m;
    ssl_ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE:ECDH:AES:HIGH:!NULL:!aNULL:!MD5:!ADH:!RC4;
    ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
    ssl_prefer_server_ciphers on;

    location / {
        proxy_pass http://grafana;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
9. Configure the firewall
firewall-cmd --zone=public --add-port=3000/tcp --permanent
# If a rule was added by mistake, remove the port with a command of this form
# firewall-cmd --zone=public --remove-port=3000/tcp --permanent

firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="10.255.200.1/30" port protocol="tcp" port="3001" accept'
# If a rule was added by mistake, remove the rich rule with a command of this form
# firewall-cmd --permanent --remove-rich-rule='rule family="ipv4" source address="10.255.200.1/30" port protocol="tcp" port="3001" accept'

# Reload the firewall configuration so the changes take effect
firewall-cmd --reload
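The same rich rule is needed again later for other ports, and its nested quoting is easy to get wrong. One option is to build the rule string in a small function, as sketched here; `rich_rule` is a hypothetical helper, and the source range and port are simply passed through.

```shell
#!/bin/sh
# Build a firewalld rich rule that accepts TCP traffic to PORT from SOURCE.
# rich_rule is a hypothetical helper: rich_rule SOURCE PORT
rich_rule() {
    printf 'rule family="ipv4" source address="%s" port protocol="tcp" port="%s" accept' "$1" "$2"
}

# Example: the command substitution is passed as one argument, inner quotes intact.
# firewall-cmd --permanent --add-rich-rule="$(rich_rule 10.255.200.1/30 3001)"
# firewall-cmd --reload
# firewall-cmd --zone=public --list-all    # confirm the port and the rich rule are listed
```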
10. Restart the tengine service

systemctl restart tengine
systemctl status tengine

Installing Prometheus

Official reference documentation: https://prometheus.io/docs/prometheus/latest/getting_started/

1. Configure the firewall
firewall-cmd --zone=public --add-port=9090/tcp --permanent

firewall-cmd --reload
2. Download Prometheus

mkdir /ssd/prometheus/

mkdir /sas/prometheus/
cd /sas/prometheus/

wget https://github.com/prometheus/prometheus/releases/download/v2.22.2/prometheus-2.22.2.linux-amd64.tar.gz

tar -xzvf prometheus-2.22.2.linux-amd64.tar.gz
mv prometheus-2.22.2.linux-amd64 prometheus
cd prometheus
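Before unpacking a downloaded tarball it is worth comparing its SHA-256 digest with the one published for the release. The sketch below assumes the expected digest has been obtained separately (the Prometheus release page usually lists a sha256sums.txt file next to the tarballs); `verify_sha256` is a hypothetical helper.

```shell
#!/bin/sh
# Succeed only if FILE has the expected SHA-256 digest.
# verify_sha256 is a hypothetical helper: verify_sha256 FILE EXPECTED_HEX
verify_sha256() {
    actual=$(sha256sum "$1" | awk '{ print $1 }')
    [ "$actual" = "$2" ]
}

# Example, assuming sha256sums.txt was downloaded from the same release:
# expected=$(grep 'prometheus-2.22.2.linux-amd64.tar.gz$' sha256sums.txt | awk '{ print $1 }')
# verify_sha256 prometheus-2.22.2.linux-amd64.tar.gz "$expected" || echo "checksum mismatch" >&2
```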
3. Edit the Prometheus configuration so that it monitors itself

vi /sas/prometheus/prometheus/prometheus.yml

# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'devops-monitor01'

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s

    static_configs:
    - targets: ['10.255.200.1:9090','10.255.200.2:9090','10.255.200.3:9090']
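YAML indentation mistakes in prometheus.yml only show up when the service fails to start, so it can help to validate the file first. The release tarball ships a `promtool` binary with a `check config` subcommand; the sketch below wraps it. The `PROMTOOL` variable and the `check_prom_config` name are assumptions of this example, and the default path follows the directory layout used in the steps above.

```shell
#!/bin/sh
# Validate a Prometheus config with promtool before (re)starting the service.
# PROMTOOL defaults to the binary unpacked next to prometheus in the steps above.
PROMTOOL=${PROMTOOL:-/sas/prometheus/prometheus/promtool}

# check_prom_config is a hypothetical helper: check_prom_config CONFIG_FILE
check_prom_config() {
    "$PROMTOOL" check config "$1"
}

# Example:
# check_prom_config /sas/prometheus/prometheus/prometheus.yml
```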
4. Write the systemd unit file
vi /usr/lib/systemd/system/prometheus.service

[Unit]
Description=Prometheus Services
After=network.target remote-fs.target

[Service]
Type=simple
ExecStart=/sas/prometheus/prometheus/prometheus --config.file=/sas/prometheus/prometheus/prometheus.yml --storage.tsdb.path=/ssd/prometheus/
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
5. Start the Prometheus service
systemctl daemon-reload

systemctl start prometheus
systemctl status prometheus
systemctl enable prometheus
6. View the monitoring data

Open http://10.255.200.1:9090/targets in a browser. All three nodes should be listed with state UP.

Open http://10.255.200.1:9090/metrics in a browser to see the corresponding metrics.
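The target state can also be checked from the command line through Prometheus's HTTP API at `/api/v1/targets`. The sketch below counts the targets reported as healthy; `count_up_targets` is a hypothetical helper that does plain text matching on the JSON response rather than real JSON parsing.

```shell
#!/bin/sh
# Count targets reported as healthy in a /api/v1/targets JSON response read from stdin.
# count_up_targets is a hypothetical helper.
count_up_targets() {
    grep -o '"health":"up"' | wc -l | tr -d ' '
}

# Example:
# curl -s http://10.255.200.1:9090/api/v1/targets | count_up_targets
```

At this point in the setup, with only the prometheus job and its three targets configured, a count of 3 would be the expected result.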

 

7. Add the Prometheus data source in Grafana

Open https://10.255.200.5:3000 in a browser and add the Prometheus data source under "Configuration" -> "Data Sources". Under "Time series databases", choose "Prometheus", fill in a name and the URL, then click "Save & Test".
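As an alternative to the web UI, Grafana can load data sources from provisioning files under /etc/grafana/provisioning/datasources/ at startup. This is not the method used in this walkthrough; the sketch below only illustrates the file format. The function name, the data source name "Prometheus", and the choice of http://10.255.200.1:9090 as the URL are assumptions.

```shell
#!/bin/sh
# Write a Grafana data source provisioning file to the path given as $1.
# $2 is the Prometheus URL. write_prometheus_datasource is a hypothetical helper.
write_prometheus_datasource() {
    cat > "$1" <<EOF
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: $2
    isDefault: true
EOF
}

# On a Grafana node (as root), followed by a restart of grafana-server:
# write_prometheus_datasource /etc/grafana/provisioning/datasources/prometheus.yaml http://10.255.200.1:9090
```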

 

Installing node_exporter

1. Configure the firewall
firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="10.255.200.1/30" port protocol="tcp" port="9100" accept'

# Reload the firewall configuration so the changes take effect
firewall-cmd --reload
2. Download node_exporter
Latest release downloads: https://github.com/prometheus/node_exporter/releases
cd /sas/prometheus/
wget https://github.com/prometheus/node_exporter/releases/download/v1.0.1/node_exporter-1.0.1.linux-amd64.tar.gz

tar -xzvf node_exporter-1.0.1.linux-amd64.tar.gz
mv node_exporter-1.0.1.linux-amd64 node_exporter
3. Write the systemd unit file
vi /usr/lib/systemd/system/node_exporter.service

[Unit]
Description=Prometheus Node Exporter Services
After=network.target remote-fs.target

[Service]
Type=simple
ExecStart=/sas/prometheus/node_exporter/node_exporter
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
node_exporter can be started with the following flags to filter out the matching mount points and filesystem types
--collector.filesystem.ignored-mount-points=^/(dev|proc|sys|mnt/.+|var/lib/docker/.+)($|/) --collector.filesystem.ignored-fs-types=^(autofs|binfmt_misc|cgroup|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|mqueue|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|sysfs|tracefs)$
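To see which mount points the first pattern would exclude, it can be tried against sample paths. The sketch below uses `grep -E` as a stand-in for node_exporter's own regular expression engine; the two should agree on a pattern this simple, but that is an assumption. `is_ignored_mount` is a hypothetical helper.

```shell
#!/bin/sh
# Report whether a mount point would be excluded by the ignored-mount-points pattern above.
# is_ignored_mount is a hypothetical helper: is_ignored_mount MOUNT_POINT
IGNORED_MOUNTS='^/(dev|proc|sys|mnt/.+|var/lib/docker/.+)($|/)'

is_ignored_mount() {
    printf '%s\n' "$1" | grep -Eq "$IGNORED_MOUNTS"
}

# Example:
# for m in / /sas /ssd/prometheus /proc /sys/fs/cgroup /mnt/usb; do
#     is_ignored_mount "$m" && echo "$m ignored" || echo "$m kept"
# done
```

In the example loop, `/`, `/sas` and `/ssd/prometheus` are kept, while `/proc`, `/sys/fs/cgroup` and `/mnt/usb` are ignored.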

 

4. Start the node_exporter service
systemctl daemon-reload

systemctl start node_exporter
systemctl status node_exporter
systemctl enable node_exporter
5. Configure Prometheus to scrape the node_exporter metrics

vi /sas/prometheus/prometheus/prometheus.yml

# Add a new scrape job under scrape_configs
scrape_configs:

  ......

  - job_name: 'node_exporter'
    scrape_interval: 5s
    static_configs:
    - targets: ['10.255.200.1:9100','10.255.200.2:9100','10.255.200.3:9100']

# Restart the Prometheus service
systemctl restart prometheus
systemctl status prometheus
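If a node_exporter target shows as DOWN in Prometheus, querying its /metrics endpoint directly helps separate exporter problems from firewall or scrape-config problems. The sketch below reads one well-known metric, `node_load1`; the helper name is hypothetical and `curl` is assumed to be installed.

```shell
#!/bin/sh
# Print the 1-minute load average exposed by a node_exporter instance.
# node_load1 is a hypothetical helper: node_load1 HOST [PORT]
node_load1() {
    curl -s --max-time 5 "http://$1:${2:-9100}/metrics" | awk '$1 == "node_load1" { print $2 }'
}

# Example: query each node from one of the hosts allowed through the firewall.
# for ip in 10.255.200.1 10.255.200.2 10.255.200.3; do
#     echo "$ip $(node_load1 "$ip")"
# done
```

An empty value for a node suggests that the exporter is not running there or that port 9100 is blocked.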
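6. Set up a dashboard in Grafana to display the metrics

Open https://10.255.200.5:3000 in a browser and import a node_exporter dashboard under "Dashboards" -> "Manage". Dashboards can be searched at https://grafana.com/grafana/dashboards ; this setup uses the Chinese-language node_exporter dashboard with ID 8919. There are three ways to import it: download the JSON file from https://grafana.com/grafana/dashboards/8919 and upload it; if the server has Internet access, enter the dashboard ID on the import page and let Grafana load it directly (Load); or paste the contents of the JSON file into the text box on the import page and load it.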

 

 

 
