Grafana High-Availability Cluster Installation and Configuration

November 30, 2020 · linux

Contents

1 Installing Prometheus
2 Installing node_exporter

Official documentation: https://grafana.com/docs/grafana/latest/administration/set-up-for-high-availability/

1. Add the Yum repository for Grafana OSS (the community edition)

vi /etc/yum.repos.d/grafana.repo

[grafana]
name=grafana
baseurl=https://packages.grafana.com/oss/rpm
repo_gpgcheck=1
enabled=1
gpgcheck=1
gpgkey=https://packages.grafana.com/gpg.key
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
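
Since the same repository file is needed on every node, it can also be written non-interactively. The following is a minimal sketch; the function name write_grafana_repo and its optional target-path argument are illustrative assumptions, not part of the original setup.

```shell
#!/usr/bin/env bash
# Write the Grafana OSS repo definition shown above to a file.
# write_grafana_repo is a hypothetical helper; pass another path for a dry run.
write_grafana_repo() {
    local target="${1:-/etc/yum.repos.d/grafana.repo}"
    cat > "$target" <<'EOF'
[grafana]
name=grafana
baseurl=https://packages.grafana.com/oss/rpm
repo_gpgcheck=1
enabled=1
gpgcheck=1
gpgkey=https://packages.grafana.com/gpg.key
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
EOF
}
```

Running write_grafana_repo as root with no argument writes to /etc/yum.repos.d/grafana.repo.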

2. Install Grafana

yum install grafana

3. Start Grafana

systemctl daemon-reload
systemctl start grafana-server
systemctl status grafana-server

# Enable start on boot
systemctl enable grafana-server

4. Create a MySQL database to store the cluster configuration

mysql -uroot -p

# Create the grafana database
create database grafana;

# xxxxxxxx is the database password; grant access from the relevant IP range and hostnames
# (GRANT ... IDENTIFIED BY is MySQL 5.7 / MariaDB syntax; MySQL 8.0 needs a separate CREATE USER first)
grant all on grafana.* to 'grafana'@'10.255.200.%' identified by 'xxxxxxxx';
grant all on grafana.* to 'grafana'@'localhost' identified by 'xxxxxxxx';
grant all on grafana.* to 'grafana'@'devops01' identified by 'xxxxxxxx';
grant all on grafana.* to 'grafana'@'devops02' identified by 'xxxxxxxx';
grant all on grafana.* to 'grafana'@'devops03' identified by 'xxxxxxxx';

# Apply the privilege changes
flush privileges;
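
The grant statements differ only in the host part, so they can be generated from a host list. This is a sketch using the same MySQL 5.7 / MariaDB syntax as above; gen_grafana_grants is a hypothetical helper, and it assumes the password contains no single quotes.

```shell
#!/usr/bin/env bash
# Print one GRANT statement per allowed host, followed by FLUSH PRIVILEGES.
# gen_grafana_grants is a hypothetical helper: first argument is the password,
# the remaining arguments are the hosts allowed to connect.
gen_grafana_grants() {
    local password="$1"
    shift
    local host
    for host in "$@"; do
        printf "grant all on grafana.* to 'grafana'@'%s' identified by '%s';\n" "$host" "$password"
    done
    echo "flush privileges;"
}
```

For example, the output of gen_grafana_grants 'xxxxxxxx' '10.255.200.%' localhost devops01 devops02 devops03 could be piped into mysql -uroot -p.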

5. Configure Grafana to use the MySQL database

vi /etc/grafana/grafana.ini

#################################### Database ####################################
[database]
# You can configure the database connection by specifying type, host, name, user and password
# as separate properties or as on string using the url properties.

# Either "mysql", "postgres" or "sqlite3", it's your choice
type = mysql
host = 10.255.200.1:3306
name = grafana
user = grafana
# If the password contains # or ; you have to wrap it with triple quotes. Ex """#password;"""
password = xxxxxxxx

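6. Change Grafana's port to 3001 so that tengine can listen on the default port 3000, and disable usage reporting, since reporting can fail with errors when the server cannot reach the Internet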

vi /etc/grafana/grafana.ini

#################################### Server ####################################
[server]
# Protocol (http, https, h2, socket)
;protocol = http

# The ip address to bind to, empty will bind to all interfaces
;http_addr =

# The http port  to use
http_port = 3001

#################################### Analytics ####################################
[analytics]
# Server reporting, sends usage counters to stats.grafana.org every 24 hours.
# No ip addresses are being tracked, only simple counters to track
# running instances, dashboard and error counts. It is very helpful to us.
# Change this option to false to disable reporting.
reporting_enabled = false
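
All three nodes need the same settings, so it helps to read the effective values back from grafana.ini instead of eyeballing the file. The helper below is a sketch; ini_get is a hypothetical name, not a Grafana tool, and it only handles simple "key = value" lines.

```shell
#!/usr/bin/env bash
# Print the value of KEY inside [SECTION] of an ini file, skipping
# commented-out lines (those starting with ';' or '#').
# ini_get is a hypothetical helper, not part of Grafana.
ini_get() {
    local file="$1" section="$2" key="$3"
    awk -v section="$section" -v key="$key" '
        /^[ \t]*\[/ {                        # section header such as [server]
            gsub(/^[ \t]*\[|\][ \t]*$/, "")
            current = $0
            next
        }
        /^[ \t]*[;#]/ { next }               # comment or disabled option
        current == section {
            pos = index($0, "=")
            if (pos == 0) next
            k = substr($0, 1, pos - 1)
            v = substr($0, pos + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)
            gsub(/^[ \t]+|[ \t]+$/, "", v)
            if (k == key) { print v; exit }
        }
    ' "$file"
}
```

For example, ini_get /etc/grafana/grafana.ini server http_port should print 3001 after this step, and ini_get /etc/grafana/grafana.ini database type should print mysql.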

7. Restart the Grafana service

systemctl restart grafana-server
systemctl status grafana-server

8. Configure tengine

vi /sas/tengine/conf/conf.d/grafana.conf

upstream grafana {
    server 10.255.200.1:3001;
    server 10.255.200.2:3001;
    server 10.255.200.3:3001;

    session_sticky;
}

server {
    listen 3000 ssl backlog=32768;
    server_name grafana.hbrtv.org;
    ssl_certificate   /sas/tengine/sslkey/devops.crt;
    ssl_certificate_key  /sas/tengine/sslkey/devops.key;
    ssl_session_timeout 5m;
    ssl_ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE:ECDH:AES:HIGH:!NULL:!aNULL:!MD5:!ADH:!RC4;
    ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
    ssl_prefer_server_ciphers on;

    location / {
        proxy_pass http://grafana;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
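
Before putting the proxy in front of the nodes, each backend in the upstream block can be probed directly. The sketch below relies on Grafana's /api/health endpoint; the names BACKENDS, health_url and check_backends are illustrative assumptions.

```shell
#!/usr/bin/env bash
# Backends copied from the upstream block above.
BACKENDS="10.255.200.1:3001 10.255.200.2:3001 10.255.200.3:3001"

# Print the health URL for one backend (Grafana serves /api/health).
health_url() {
    printf 'http://%s/api/health\n' "$1"
}

# Probe every backend; returns non-zero if any of them fails.
# Not called here, because it needs network access to the nodes.
check_backends() {
    local backend failed=0
    for backend in $BACKENDS; do
        if curl -fsS --max-time 5 "$(health_url "$backend")" > /dev/null; then
            echo "OK   $backend"
        else
            echo "FAIL $backend"
            failed=1
        fi
    done
    return "$failed"
}
```

Calling check_backends from one of the cluster nodes prints one line per backend.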

9. Configure the firewall

firewall-cmd --zone=public --add-port=3000/tcp --permanent
# If you make a mistake, remove the port with the following command
# firewall-cmd --zone=public --remove-port=3000/tcp --permanent

# Single quotes around the rule keep its inner double quotes intact
firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="10.255.200.1/30" port protocol="tcp" port="3001" accept'
# If you make a mistake, remove the rule with the following command
# firewall-cmd --permanent --remove-rich-rule='rule family="ipv4" source address="10.255.200.1/30" port protocol="tcp" port="3001" accept'

# Reload the firewall configuration so the changes take effect
firewall-cmd --reload
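
Rich rules carry their own double quotes, which makes shell quoting easy to get wrong. One way to sidestep this is to build the rule string in a small function and pass it as a single argument. This is a sketch; rich_rule is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Build a firewalld rich rule that accepts TCP traffic to PORT from SOURCE.
# rich_rule is a hypothetical helper; it only prints the rule text.
rich_rule() {
    local source="$1" port="$2"
    printf 'rule family="ipv4" source address="%s" port protocol="tcp" port="%s" accept\n' "$source" "$port"
}
```

Usage would look like firewall-cmd --permanent --add-rich-rule="$(rich_rule 10.255.200.1/30 3001)", followed by firewall-cmd --reload and firewall-cmd --list-rich-rules to confirm the result.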

10. Restart the tengine service

systemctl restart tengine
systemctl status tengine

Installing Prometheus

Official reference documentation: https://prometheus.io/docs/prometheus/latest/getting_started/

1. Configure the firewall

firewall-cmd --zone=public --add-port=9090/tcp --permanent

firewall-cmd --reload

2. Download Prometheus

mkdir -p /ssd/prometheus/

mkdir -p /sas/prometheus/
cd /sas/prometheus/

wget https://github.com/prometheus/prometheus/releases/download/v2.22.2/prometheus-2.22.2.linux-amd64.tar.gz

# -f must come last in the option group so that it takes the archive name as its argument
tar -xzvf prometheus-2.22.2.linux-amd64.tar.gz
mv prometheus-2.22.2.linux-amd64 prometheus
cd prometheus
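
It is worth verifying the downloaded archive before running it. The sketch below compares the archive's SHA-256 digest with the entry in a checksum list; it assumes the release publishes a sha256sums.txt file in the usual "digest  filename" format alongside the tarball, and verify_sha256 is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Return 0 if FILE's SHA-256 digest matches its entry in the checksum list SUMS.
# verify_sha256 is a hypothetical helper; SUMS uses "digest  filename" lines.
verify_sha256() {
    local file="$1" sums="$2" expected actual
    expected=$(awk -v name="$(basename "$file")" '$2 == name { print $1 }' "$sums")
    actual=$(sha256sum "$file" | awk '{ print $1 }')
    # Fail when the file is not listed at all, or when the digests differ.
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}
```

Usage, run from /sas/prometheus/ with both files downloaded there: verify_sha256 prometheus-2.22.2.linux-amd64.tar.gz sha256sums.txt && echo verified.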

3. Edit the Prometheus configuration so that it monitors itself

vi /sas/prometheus/prometheus/prometheus.yml

# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'devops-monitor01'

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s

    static_configs:
    - targets: ['10.255.200.1:9090','10.255.200.2:9090','10.255.200.3:9090']
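
A YAML indentation slip in this file will stop Prometheus from starting, so the configuration can be checked first with promtool, which ships in the same tarball as the prometheus binary. The wrapper below is a sketch; check_prom_config is a hypothetical name.

```shell
#!/usr/bin/env bash
# Run "promtool check config" on a configuration file.
# check_prom_config is a hypothetical wrapper; it returns 2 when promtool
# is missing and otherwise passes through promtool's own exit status.
check_prom_config() {
    local promtool="$1" config="$2"
    if [ ! -x "$promtool" ]; then
        echo "promtool not found at $promtool" >&2
        return 2
    fi
    "$promtool" check config "$config"
}
```

With the paths used in this article: check_prom_config /sas/prometheus/prometheus/promtool /sas/prometheus/prometheus/prometheus.yml.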

4. Write the systemd unit file

vi /usr/lib/systemd/system/prometheus.service

[Unit]
Description=Prometheus Services
After=network.target remote-fs.target

[Service]
Type=simple
ExecStart=/sas/prometheus/prometheus/prometheus --config.file=/sas/prometheus/prometheus/prometheus.yml --storage.tsdb.path=/ssd/prometheus/
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target

5. Start the Prometheus service

systemctl daemon-reload

systemctl start prometheus
systemctl status prometheus
systemctl enable prometheus
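
Prometheus may need a few seconds after start before it answers requests, so a scripted install can poll its readiness endpoint (/-/ready) instead of sleeping for a fixed time. The sketch below uses a generic retry loop; retry is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Run a command up to ATTEMPTS times, sleeping DELAY seconds between tries.
# retry is a hypothetical helper; it returns 0 as soon as the command succeeds
# and 1 if every attempt fails.
retry() {
    local attempts="$1" delay="$2"
    shift 2
    local i=1
    while [ "$i" -le "$attempts" ]; do
        "$@" && return 0
        i=$((i + 1))
        # Only sleep if another attempt is still to come.
        [ "$i" -le "$attempts" ] && sleep "$delay"
    done
    return 1
}
```

For example: retry 10 3 curl -fsS http://10.255.200.1:9090/-/ready.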

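6. View the monitoring data

Open http://10.255.200.1:9090/targets in a browser. All three nodes should be listed with state UP.

Open http://10.255.200.1:9090/metrics in a browser to see the corresponding metrics.

7. Add the Prometheus data source in Grafana

Open https://10.255.200.5:3000 in a browser and add a Prometheus data source under "Configuration" -> "Data Sources". Under "Time series databases" select "Prometheus", fill in the name and URL, then test and save it.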
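
As an alternative to the UI steps, which this article uses, Grafana can also load data sources from provisioning files at startup. The sketch below writes such a file; write_prometheus_datasource is a hypothetical helper, and the target path and URL in the usage note are assumptions based on the default RPM layout and the addresses above.

```shell
#!/usr/bin/env bash
# Write a Grafana data source provisioning file for Prometheus.
# write_prometheus_datasource is a hypothetical helper: first argument is the
# file to write, second is the Prometheus URL.
write_prometheus_datasource() {
    local target="$1" url="$2"
    cat > "$target" <<EOF
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: ${url}
    isDefault: true
EOF
}
```

For example: write_prometheus_datasource /etc/grafana/provisioning/datasources/prometheus.yaml http://10.255.200.1:9090, followed by systemctl restart grafana-server.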

Installing node_exporter

1. Configure the firewall

firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="10.255.200.1/30" port protocol="tcp" port="9100" accept'

# Reload the firewall configuration so the changes take effect
firewall-cmd --reload

2. Download node_exporter

Latest releases: https://github.com/prometheus/node_exporter/releases

cd /sas/prometheus/
wget https://github.com/prometheus/node_exporter/releases/download/v1.0.1/node_exporter-1.0.1.linux-amd64.tar.gz

tar -xzvf node_exporter-1.0.1.linux-amd64.tar.gz
mv node_exporter-1.0.1.linux-amd64 node_exporter

3. Write the systemd unit file

vi /usr/lib/systemd/system/node_exporter.service

[Unit]
Description=Prometheus Node Exporter Services
After=network.target remote-fs.target

[Service]
Type=simple
ExecStart=/sas/prometheus/node_exporter/node_exporter
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target

node_exporter can be started with the following flags to exclude the matching mount points and filesystem types:
--collector.filesystem.ignored-mount-points=^/(dev|proc|sys|mnt/.+|var/lib/docker/.+)($|/) --collector.filesystem.ignored-fs-types=^(autofs|binfmt_misc|cgroup|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|mqueue|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|sysfs|tracefs)$
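
To see which mount points the first pattern excludes, it can be tried against sample paths. The sketch below uses grep -E as a stand-in for the Go regular expression engine that node_exporter uses; for this particular pattern the two should agree. IGNORED_MOUNT_POINTS and is_ignored_mount are illustrative names.

```shell
#!/usr/bin/env bash
# Pattern copied from the --collector.filesystem.ignored-mount-points flag above.
IGNORED_MOUNT_POINTS='^/(dev|proc|sys|mnt/.+|var/lib/docker/.+)($|/)'

# Return 0 if the given mount point matches the ignore pattern.
# grep -E (POSIX ERE) stands in for node_exporter's Go regexp here.
is_ignored_mount() {
    printf '%s\n' "$1" | grep -Eq "$IGNORED_MOUNT_POINTS"
}
```

With this pattern /proc, /dev/shm and /mnt/usb are excluded, while /data and /mnt itself are still reported.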

4. Start the node_exporter service

systemctl daemon-reload

systemctl start node_exporter
systemctl status node_exporter
systemctl enable node_exporter

5. Configure Prometheus to scrape node_exporter metrics

vi /sas/prometheus/prometheus/prometheus.yml

# Add a new scrape job under scrape_configs
scrape_configs:

  ......

  - job_name: 'node_exporter'
    scrape_interval: 5s
    static_configs:
    - targets: ['10.255.200.1:9100','10.255.200.2:9100','10.255.200.3:9100']

# Restart the Prometheus service
systemctl restart prometheus
systemctl status prometheus
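
To confirm that a node_exporter instance is actually serving data, a single metric can be pulled out of its /metrics output. The sketch below reads the Prometheus text format from standard input; metric_value is a hypothetical helper and only handles samples without labels.

```shell
#!/usr/bin/env bash
# Read Prometheus text-format metrics on stdin and print the value of the
# first label-free sample called NAME. metric_value is a hypothetical helper.
metric_value() {
    awk -v name="$1" '
        /^#/ { next }                   # skip HELP and TYPE lines
        $1 == name { print $2; exit }   # sample line: "<name> <value>"
    '
}
```

For example: curl -s http://10.255.200.1:9100/metrics | metric_value node_load1.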

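6. Import a dashboard template in Grafana

Open https://10.255.200.5:3000 in a browser and import a node_exporter dashboard template under "Dashboards" -> "Manage". Templates can be searched at https://grafana.com/grafana/dashboards; this setup uses the Chinese-language node_exporter dashboard with ID 8919. There are three ways to import it: download the JSON file from https://grafana.com/grafana/dashboards/8919 and upload it; if the server has Internet access, enter the template ID in the import screen and click "Load" to fetch it directly from the Grafana server; or paste the contents of the JSON file into the text box on the import screen and load it.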
