Exploring Elasticsearch High Availability
Elasticsearch is deployed as a cluster to keep the service highly available.
The host plan is as follows:
| Host IP | Components |
| --- | --- |
| 192.168.111.3 | elasticsearch-node1 |
| 192.168.111.4 | elasticsearch-node2 |
| 192.168.111.5 | elasticsearch-node3 |
| 192.168.111.6 | elasticsearch-node0, kibana, logstash |
| 192.168.111.16 | nginx, filebeat |
Summary:
- node1, node2, and node3 act as data nodes, while node0 is the node in the cluster that Kibana connects to.
- The nginx host ships its logs to Logstash via Filebeat, Logstash forwards them to the Elasticsearch cluster, and Kibana visualizes them.
# 1. Installation and configuration
## 1. Install dependencies
yum -y install lrzsz vim curl wget java
## 2. Configure the yum repository
cat > /etc/yum.repos.d/elk.repo << EOF
[elasticsearch-6.x]
name=Elasticsearch repository for 6.x packages
baseurl=https://artifacts.elastic.co/packages/6.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
EOF
rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
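Before moving on, an optional sanity check that the new repository is visible and the package can be resolved:

```
# Confirm the Elasticsearch 6.x repo is enabled and the package is available
yum repolist enabled | grep -i elasticsearch
yum info elasticsearch | head
```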
## 3. Install
Install Elasticsearch on all four hosts.
yum -y install elasticsearch
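If SSH access from one admin host to all four nodes is available, the install can be scripted in one pass; a minimal sketch (host list taken from the table above, key-based root SSH is an assumption):

```
# Install Elasticsearch on all four nodes over SSH
for host in 192.168.111.3 192.168.111.4 192.168.111.5 192.168.111.6; do
  ssh root@"$host" "yum -y install elasticsearch"
done
```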
Next comes the configuration:
elk-node1
[root@localhost ~]$egrep -v "^#|^$" /etc/elasticsearch/elasticsearch.yml
cluster.name: ebei-elk
node.name: elk-node1
path.data: /logs/elasticsearch6
path.logs: /logs/elasticsearch6/log
network.host: 0.0.0.0
http.port: 9200
discovery.zen.ping.unicast.hosts: ["192.168.111.3:9300","192.168.111.4:9300","192.168.111.5:9300","192.168.111.6:9300"]
discovery.zen.minimum_master_nodes: 2
xpack.security.enabled: false
Notes:
- cluster.name: a custom cluster name.
- node.name: a custom node name; it must differ on every host.
- path.data: the data storage path, pointed at a custom location to leave room for large log volumes.
- path.logs: the path for Elasticsearch's own logs. Make sure the two directories defined above are created, otherwise the service will fail to start:

mkdir -p /logs/elasticsearch6/log
cd /logs
chown -R elasticsearch.elasticsearch elasticsearch6/

- network.host: the address Elasticsearch listens on; "0.0.0.0" allows access from any host.
- http.port: the HTTP listen port; it can stay commented out, since 9200 is the default.
- discovery.zen.ping.unicast.hosts: the list of hosts used for cluster node discovery.
- discovery.zen.minimum_master_nodes: the minimum number of master-eligible nodes that must be visible before a master can be elected. To avoid split-brain, set it to (master-eligible nodes / 2) + 1; with three master-eligible nodes (node0 is made coordinating-only later), that is 2 (a runtime example follows this list).
- xpack.security.enabled: added to control the X-Pack security mechanism that Kibana relies on; it is disabled for now.
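Since discovery.zen.minimum_master_nodes is also a dynamic setting in Elasticsearch 6.x, it can additionally be pinned at runtime through the cluster settings API once the cluster is up; a minimal sketch:

```
# Persist minimum_master_nodes via the cluster settings API (run against any node)
curl -X PUT "http://127.0.0.1:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "discovery.zen.minimum_master_nodes": 2
  }
}'
```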
## 4. Distribute the configuration
Next, distribute this configuration to the other hosts.
[root@localhost ~]$scp /etc/elasticsearch/elasticsearch.yml 192.168.111.4:/etc/elasticsearch/elasticsearch.yml
[root@localhost ~]$scp /etc/elasticsearch/elasticsearch.yml 192.168.111.5:/etc/elasticsearch/elasticsearch.yml
[root@localhost ~]$scp /etc/elasticsearch/elasticsearch.yml 192.168.111.6:/etc/elasticsearch/elasticsearch.yml
Then adjust the configuration on the three receiving hosts; only node.name needs to change (a sed sketch follows).
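One way to do that without opening each file by hand is a quick sed on every receiving host; a sketch for elk-node2 (adjust the name per host):

```
# On 192.168.111.4: set this node's name after the file has been copied over
sed -i 's/^node.name: .*/node.name: elk-node2/' /etc/elasticsearch/elasticsearch.yml
```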
The configurations of the other three nodes are listed below:
elk-node2
[root@localhost ~]$egrep -v "^#|^$" /etc/elasticsearch/elasticsearch.yml
cluster.name: ebei-elk
node.name: elk-node2
path.data: /logs/elasticsearch6
path.logs: /logs/elasticsearch6/log
network.host: 0.0.0.0
http.port: 9200
discovery.zen.ping.unicast.hosts: ["192.168.111.3:9300","192.168.111.4:9300","192.168.111.5:9300","192.168.111.6:9300"]
discovery.zen.minimum_master_nodes: 2
xpack.security.enabled: false
elk-node3
[root@localhost ~]$egrep -v "^#|^$" /etc/elasticsearch/elasticsearch.yml
cluster.name: ebei-elk
node.name: elk-node3
path.data: /logs/elasticsearch6
path.logs: /logs/elasticsearch6/log
network.host: 0.0.0.0
http.port: 9200
discovery.zen.ping.unicast.hosts: ["192.168.111.3:9300","192.168.111.4:9300","192.168.111.5:9300","192.168.111.6:9300"]
discovery.zen.minimum_master_nodes: 2
xpack.security.enabled: false
elk-node0
[root@localhost ~]$egrep -v "^#|^$" /etc/elasticsearch/elasticsearch.yml
cluster.name: ebei-elk
node.name: elk-node0
path.data: /logs/elasticsearch6
path.logs: /logs/elasticsearch6/log
network.host: 0.0.0.0
http.port: 9200
discovery.zen.ping.unicast.hosts: ["192.168.111.3:9300","192.168.111.4:9300","192.168.111.5:9300","192.168.111.6:9300"]
discovery.zen.minimum_master_nodes: 2
xpack.security.enabled: false
node.master: false
node.data: false
node.ingest: false
Note the three extra settings at the end.
I did not plan for this at first, and there was originally no node0 at all. But while configuring Kibana I found that it only accepts a single Elasticsearch connection address, which is what exposed the problem. It turned out that the official documentation offers a solution for exactly this case.
The solution is to add the three settings above, turning this node into a coordinating-only node that acts as the bridge to Kibana without taking part in storing or processing data.
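Once the services are started (next step), it is easy to confirm that elk-node0 really carries no master/data/ingest role; a small check using the _cat API:

```
# node.role shows m(aster-eligible), d(ata), i(ngest); a coordinating-only node shows "-"
curl -s 'http://127.0.0.1:9200/_cat/nodes?v&h=name,node.role,master'
```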
Finally, start the Elasticsearch service on each host.
systemctl daemon-reload
systemctl enable elasticsearch.service
systemctl start elasticsearch.service
systemctl status elasticsearch.service
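If a node refuses to start (for example because the data and log directories above were not created), the systemd journal is the first place to look; the cluster log file name below is an assumption based on cluster.name:

```
# Service log via systemd
journalctl -u elasticsearch.service --no-pager | tail -n 50
# Elasticsearch 6.x normally writes its main log as <cluster.name>.log under path.logs
tail -n 50 /logs/elasticsearch6/log/ebei-elk.log
```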
# 2. Inspecting the cluster
- Check whether Elasticsearch started correctly.
[root@localhost ~]$curl -X GET "127.0.0.1:9200/"
{
"name" : "elk-node1",
"cluster_name" : "ebei-elk",
"cluster_uuid" : "53LLexx8RSW16nE4lsJMQQ",
"version" : {
"number" : "6.5.3",
"build_flavor" : "default",
"build_type" : "rpm",
"build_hash" : "159a78a",
"build_date" : "2018-12-06T20:11:28.826501Z",
"build_snapshot" : false,
"lucene_version" : "7.5.0",
"minimum_wire_compatibility_version" : "5.6.0",
"minimum_index_compatibility_version" : "5.0.0"
},
"tagline" : "You Know, for Search"
}
- Another way to check, via the cluster health API.
[root@localhost ~]$curl -XGET 'http://127.0.0.1:9200/_cluster/health?pretty'
{
"cluster_name" : "ebei-elk",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 4,
"number_of_data_nodes" : 3,
"active_primary_shards" : 11,
"active_shards" : 22,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 100.0
}
- Check the cluster status.
[root@localhost ~]$curl -XGET 'http://127.0.0.1:9200/_cat/nodes'
192.168.111.3 48 39 0 0.22 0.07 0.06 mdi * elk-node1 # the * marks the master
192.168.111.6 41 98 0 0.03 0.06 0.05 - - elk-node0 # note: this node does not take part in data handling
192.168.111.4 44 57 0 0.00 0.01 0.05 mdi - elk-node2
192.168.111.5 27 62 0 0.00 0.01 0.05 mdi - elk-node3
- Another way to check, with column headers.
[root@localhost ~]$curl -XGET 'http://127.0.0.1:9200/_cat/nodes?v'
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
192.168.111.3 34 39 0 0.13 0.10 0.07 mdi * elk-node1
192.168.111.6 20 98 0 0.01 0.05 0.05 - - elk-node0
192.168.111.4 32 57 0 0.00 0.01 0.05 mdi - elk-node2
192.168.111.5 38 62 0 0.00 0.01 0.05 mdi - elk-node3
- View detailed cluster information.
[root@localhost ~]$curl -XGET 'http://127.0.0.1:9200/_cluster/state/nodes?pretty'
{
"cluster_name" : "ebei-elk", #集群名称
"compressed_size_in_bytes" : 14291,
"cluster_uuid" : "53LLexx8RSW16nE4lsJMQQ", #集群id
"nodes" : {
"oCNARqdUT5KXLZTlcS22GA" : { #node的ID值
"name" : "elk-node1", #node名称
"ephemeral_id" : "y-NCFJULTEmWdWjNjCPS2A",
"transport_address" : "192.168.111.3:9300", #集群通讯地址
"attributes" : {
"ml.machine_memory" : "8202727424",
"xpack.installed" : "true",
"ml.max_open_jobs" : "20",
"ml.enabled" : "true"
}
},
"8F_rZuR1TByEb6bXz0EgzA" : {
"name" : "elk-node0",
"ephemeral_id" : "b3CtPKpyRUahT4njpRqjlQ",
"transport_address" : "192.168.111.6:9300",
"attributes" : {
"ml.machine_memory" : "8202039296",
"ml.max_open_jobs" : "20",
"xpack.installed" : "true",
"ml.enabled" : "true"
}
},
"ptEOHzaPTgmlqW3NRhd7SQ" : {
"name" : "elk-node2",
"ephemeral_id" : "YgypZZNcTfWcIYDhOlUAzw",
"transport_address" : "192.168.111.4:9300",
"attributes" : {
"ml.machine_memory" : "8202039296",
"ml.max_open_jobs" : "20",
"xpack.installed" : "true",
"ml.enabled" : "true"
}
},
"PJYNTw5nTeS2kjjft4UcrA" : {
"name" : "elk-node3",
"ephemeral_id" : "C2nnREt0TOGKeDw9OQRJUA",
"transport_address" : "192.168.111.5:9300",
"attributes" : {
"ml.machine_memory" : "8202031104",
"ml.max_open_jobs" : "20",
"xpack.installed" : "true",
"ml.enabled" : "true"
}
}
}
}
- Query the master.
[root@localhost ~]$curl -XGET 'http://127.0.0.1:9200/_cluster/state/master_node?pretty'
{
"cluster_name" : "ebei-elk",
"compressed_size_in_bytes" : 14291,
"cluster_uuid" : "53LLexx8RSW16nE4lsJMQQ",
"master_node" : "oCNARqdUT5KXLZTlcS22GA"
}
[root@localhost ~]$curl -XGET 'http://127.0.0.1:9200/_cat/master?v'
id host ip node
oCNARqdUT5KXLZTlcS22GA 192.168.111.3 192.168.111.3 elk-node1
- Query the cluster health (a blocking variant of this check follows the output below).
[root@localhost ~]$curl -XGET 'http://127.0.0.1:9200/_cat/health?v'
epoch timestamp cluster status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1545386516 10:01:56 ebei-elk green 4 3 22 11 0 0 0 0 - 100.0%
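A blocking variant of the same _cluster/health call is handy in scripts; it waits until the cluster reaches green or the timeout expires:

```
# Returns when the cluster is green, or after 30s with "timed_out": true
curl -XGET 'http://127.0.0.1:9200/_cluster/health?wait_for_status=green&timeout=30s&pretty'
```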
# 3. High-availability verification
The checks above show that the master is on node1. Now stop Elasticsearch on node1 and see whether the master role fails over automatically.
[root@localhost ~]$curl -XGET 'http://127.0.0.1:9200/_cat/nodes?v'
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
192.168.111.3 36 39 0 0.03 0.04 0.05 mdi * elk-node1
192.168.111.6 22 98 0 0.00 0.02 0.05 - - elk-node0
192.168.111.4 38 57 0 0.00 0.01 0.05 mdi - elk-node2
192.168.111.5 44 62 0 0.00 0.01 0.05 mdi - elk-node3
[root@localhost ~]$systemctl stop elasticsearch
Then check from one of the other nodes:
[root@localhost ~]$curl -XGET 'http://127.0.0.1:9200/_cat/nodes?v'
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
192.168.111.4 45 58 0 0.00 0.01 0.05 mdi - elk-node2
192.168.111.5 30 62 0 0.00 0.01 0.05 mdi * elk-node3
192.168.111.6 24 98 0 0.00 0.02 0.05 - - elk-node0
As shown, the master role has moved to node3.
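To finish the test, Elasticsearch can be started again on node1; with zen discovery it rejoins as an ordinary data node and elk-node3 keeps the master role. A sketch, run back on 192.168.111.3:

```
systemctl start elasticsearch.service
# Once it has rejoined, elk-node1 shows up again without the * master marker
curl -XGET 'http://127.0.0.1:9200/_cat/nodes?v'
```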
# 4. Integration configuration
The Filebeat configuration that forwards logs to Logstash stays as it was (a sketch of it is included after the Logstash config below); only the output section of the Logstash configuration needs to change, as follows:
input {
beats {
host => "0.0.0.0"
port => 5044
}
}
filter {
if [fields][logtype] == "nginx-access" {
json {
source => "message"
target => "data"
}
}
if [fields][logtype] == "nginx-error" {
json {
source => "message"
target => "data"
}
}
}
output {
if [fields][logtype] == "nginx-access" {
elasticsearch {
hosts => ["192.168.111.3:9200","192.168.111.4:9200","192.168.111.5:9200"]
index => "logstash-nginx-access-%{+YYYY.MM.dd}"
}
}
if [fields][logtype] == "nginx-error" {
elasticsearch {
hosts => ["192.168.111.3:9200","192.168.111.4:9200","192.168.111.5:9200"]
index => "logstash-nginx-error-%{+YYYY.MM.dd}"
}
}
}
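For reference, a minimal sketch of the Filebeat side on 192.168.111.16 that would produce the [fields][logtype] values matched above; the log paths are assumptions, since the original Filebeat configuration is reused unchanged and not repeated here:

```
filebeat.inputs:
  - type: log
    paths:
      - /var/log/nginx/access.log     # assumed nginx access log path
    fields:
      logtype: nginx-access
  - type: log
    paths:
      - /var/log/nginx/error.log      # assumed nginx error log path
    fields:
      logtype: nginx-error
output.logstash:
  hosts: ["192.168.111.6:5044"]       # the Logstash host from the table above
```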
Now look at the Kibana configuration.
[root@localhost logs]$egrep -v "^$|^#" /etc/kibana/kibana.yml
server.port: 5601
server.host: "0.0.0.0"
elasticsearch.url: "http://127.0.0.1:9200"
kibana.index: ".kibana"
xpack.security.enabled: false
Then open Kibana: the logs stream in continuously, and if one of the Elasticsearch data nodes goes down, log display in Kibana is not affected. However, Kibana is still connected to a single Elasticsearch node, so this setup is not yet truly highly available.
# 5. Full high availability
The better approach is to drop the node0 arrangement, put the three Elasticsearch data nodes behind an nginx reverse proxy, and point Kibana at the nginx address; that removes the single point of failure.
Now add the following to the nginx configuration on 192.168.111.16:
upstream elasticsearch {
zone elasticsearch 64K;
server 192.168.111.3:9200;
server 192.168.111.4:9200;
server 192.168.111.5:9200;
}
server {
listen 9200;
server_name 192.168.111.16;
location / {
proxy_pass http://elasticsearch;
proxy_redirect off;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}
access_log logs/es_access.log;
}
Then test and reload the configuration.
nginx -t
nginx -s reload
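With the proxy in place, the cluster should answer on the nginx address, which is exactly what Kibana will be pointed at next:

```
# The request is load-balanced by nginx across the three data nodes
curl -XGET 'http://192.168.111.16:9200/_cluster/health?pretty'
```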
Next, change the connection address in the Kibana configuration:
[root@localhost logs]$egrep -v "^$|^#" /etc/kibana/kibana.yml
server.port: 5601
server.host: "0.0.0.0"
elasticsearch.url: "http://192.168.111.16:9200"
kibana.index: ".kibana"
xpack.security.enabled: false
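After changing the address, restart Kibana so it picks up the new setting (assuming Kibana was installed from the same repository and runs as a systemd service):

```
systemctl restart kibana.service
```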