Exploring Elasticsearch High Availability
Elasticsearch is deployed as a cluster to keep the service highly available.
The host plan is as follows:
| Host IP | Components |
| --- | --- |
| 192.168.111.3 | elasticsearch-node1 |
| 192.168.111.4 | elasticsearch-node2 |
| 192.168.111.5 | elasticsearch-node3 |
| 192.168.111.6 | elasticsearch-node0, kibana, logstash |
| 192.168.111.16 | nginx, filebeat |
Summary:
- node1, node2, and node3 act as data nodes, while node0 is the node in the cluster that Kibana connects to.
- The nginx host ships its logs to Logstash via Filebeat, Logstash forwards them to the Elasticsearch cluster, and Kibana visualizes them.
# 1. Installation and configuration
## 1. Install dependencies
yum -y install lrzsz vim curl wget java
## 2. Configure the yum repository
cat > /etc/yum.repos.d/elk.repo << EOF
[elasticsearch-6.x]
name=Elasticsearch repository for 6.x packages
baseurl=https://artifacts.elastic.co/packages/6.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
EOF
rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
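Before moving on, an optional sanity check that the new repository is visible and the package can be resolved:

```
# Confirm the Elasticsearch 6.x repo is enabled and the package is available
yum repolist enabled | grep -i elasticsearch
yum info elasticsearch | head
```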
## 3. Install
Install Elasticsearch on all four hosts.
yum -y install elasticsearch
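If SSH access from one admin host to all four nodes is available, the install can be scripted in one pass; a minimal sketch (host list taken from the table above, key-based root SSH is an assumption):

```
# Install Elasticsearch on all four nodes over SSH
for host in 192.168.111.3 192.168.111.4 192.168.111.5 192.168.111.6; do
  ssh root@"$host" "yum -y install elasticsearch"
done
```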
Next comes the configuration:
elk-node1
[root@localhost ~]$egrep -v "^#|^$" /etc/elasticsearch/elasticsearch.yml
cluster.name: ebei-elk
node.name: elk-node1
path.data: /logs/elasticsearch6
path.logs: /logs/elasticsearch6/log
network.host: 0.0.0.0
http.port: 9200
discovery.zen.ping.unicast.hosts: ["192.168.111.3:9300","192.168.111.4:9300","192.168.111.5:9300","192.168.111.6:9300"]
discovery.zen.minimum_master_nodes: 2
xpack.security.enabled: false
Notes:
- cluster.name: a custom cluster name.
- node.name: a custom node name; it must differ on every host.
- path.data: the data storage path, pointed at a custom location to leave room for large log volumes.
- path.logs: the path for Elasticsearch's own logs. Make sure the two directories defined above are created, otherwise the service will fail to start:

mkdir -p /logs/elasticsearch6/log
cd /logs
chown -R elasticsearch.elasticsearch elasticsearch6/

- network.host: the address Elasticsearch listens on; "0.0.0.0" allows access from any host.
- http.port: the HTTP listen port; it can stay commented out, since 9200 is the default.
- discovery.zen.ping.unicast.hosts: the list of hosts used for cluster node discovery.
- discovery.zen.minimum_master_nodes: the minimum number of master-eligible nodes that must be visible before a master can be elected. To avoid split-brain, set it to (master-eligible nodes / 2) + 1; with three master-eligible nodes (node0 is made coordinating-only later), that is 2 (a runtime example follows this list).
- xpack.security.enabled: added to control the X-Pack security mechanism that Kibana relies on; it is disabled for now.
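Since discovery.zen.minimum_master_nodes is also a dynamic setting in Elasticsearch 6.x, it can additionally be pinned at runtime through the cluster settings API once the cluster is up; a minimal sketch:

```
# Persist minimum_master_nodes via the cluster settings API (run against any node)
curl -X PUT "http://127.0.0.1:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "discovery.zen.minimum_master_nodes": 2
  }
}'
```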
## 4. Distribute the configuration
Next, distribute this configuration to the other hosts.
[root@localhost ~]$scp /etc/elasticsearch/elasticsearch.yml 192.168.111.4:/etc/elasticsearch/elasticsearch.yml
[root@localhost ~]$scp /etc/elasticsearch/elasticsearch.yml 192.168.111.5:/etc/elasticsearch/elasticsearch.yml
[root@localhost ~]$scp /etc/elasticsearch/elasticsearch.yml 192.168.111.6:/etc/elasticsearch/elasticsearch.yml
Then adjust the configuration on the three receiving hosts; only node.name needs to change (a sed sketch follows).
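One way to do that without opening each file by hand is a quick sed on every receiving host; a sketch for elk-node2 (adjust the name per host):

```
# On 192.168.111.4: set this node's name after the file has been copied over
sed -i 's/^node.name: .*/node.name: elk-node2/' /etc/elasticsearch/elasticsearch.yml
```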
The configurations of the other three nodes are listed below:
elk-node2
[root@localhost ~]$egrep -v "^#|^$" /etc/elasticsearch/elasticsearch.yml
cluster.name: ebei-elk
node.name: elk-node2
path.data: /logs/elasticsearch6
path.logs: /logs/elasticsearch6/log
network.host: 0.0.0.0
http.port: 9200
discovery.zen.ping.unicast.hosts: ["192.168.111.3:9300","192.168.111.4:9300","192.168.111.5:9300","192.168.111.6:9300"]
discovery.zen.minimum_master_nodes: 2
xpack.security.enabled: false
elk-node3
[root@localhost ~]$egrep -v "^#|^$" /etc/elasticsearch/elasticsearch.yml
cluster.name: ebei-elk
node.name: elk-node3
path.data: /logs/elasticsearch6
path.logs: /logs/elasticsearch6/log
network.host: 0.0.0.0
http.port: 9200
discovery.zen.ping.unicast.hosts: ["192.168.111.3:9300","192.168.111.4:9300","192.168.111.5:9300","192.168.111.6:9300"]
discovery.zen.minimum_master_nodes: 2
xpack.security.enabled: false
elk-node0
[root@localhost ~]$egrep -v "^#|^$" /etc/elasticsearch/elasticsearch.yml
cluster.name: ebei-elk
node.name: elk-node0
path.data: /logs/elasticsearch6
path.logs: /logs/elasticsearch6/log
network.host: 0.0.0.0
http.port: 9200
discovery.zen.ping.unicast.hosts: ["192.168.111.3:9300","192.168.111.4:9300","192.168.111.5:9300","192.168.111.6:9300"]
discovery.zen.minimum_master_nodes: 2
xpack.security.enabled: false
node.master: false
node.data: false
node.ingest: false
Note the three extra settings at the end.
I did not plan for this at first, and there was originally no node0 at all. But while configuring Kibana I found that it only accepts a single Elasticsearch connection address, which is what exposed the problem. It turned out that the official documentation offers a solution for exactly this case.
The solution is to add the three settings above, turning this node into a coordinating-only node that acts as the bridge to Kibana without taking part in storing or processing data.
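Once the services are started (next step), it is easy to confirm that elk-node0 really carries no master/data/ingest role; a small check using the _cat API:

```
# node.role shows m(aster-eligible), d(ata), i(ngest); a coordinating-only node shows "-"
curl -s 'http://127.0.0.1:9200/_cat/nodes?v&h=name,node.role,master'
```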
Finally, start the Elasticsearch service on each host.
systemctl daemon-reload
systemctl enable elasticsearch.service
systemctl start elasticsearch.service
systemctl status elasticsearch.service
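If a node refuses to start (for example because the data and log directories above were not created), the systemd journal is the first place to look; the cluster log file name below is an assumption based on cluster.name:

```
# Service log via systemd
journalctl -u elasticsearch.service --no-pager | tail -n 50
# Elasticsearch 6.x normally writes its main log as <cluster.name>.log under path.logs
tail -n 50 /logs/elasticsearch6/log/ebei-elk.log
```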
# 2. Inspecting the cluster
- Check whether Elasticsearch started correctly.
[root@localhost ~]$curl -X GET "127.0.0.1:9200/"
{
"name" : "elk-node1",
"cluster_name" : "ebei-elk",
"cluster_uuid" : "53LLexx8RSW16nE4lsJMQQ",
"version" : {
"number" : "6.5.3",
"build_flavor" : "default",
"build_type" : "rpm",
"build_hash" : "159a78a",
"build_date" : "2018-12-06T20:11:28.826501Z",
"build_snapshot" : false,
"lucene_version" : "7.5.0",
"minimum_wire_compatibility_version" : "5.6.0",
"minimum_index_compatibility_version" : "5.0.0"
},
"tagline" : "You Know, for Search"
}
- Another way to check, via the cluster health API.
[root@localhost ~]$curl -XGET 'http://127.0.0.1:9200/_cluster/health?pretty'
{
"cluster_name" : "ebei-elk",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 4,
"number_of_data_nodes" : 3,
"active_primary_shards" : 11,
"active_shards" : 22,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 100.0
}
- Check the cluster status.
[root@localhost ~]$curl -XGET 'http://127.0.0.1:9200/_cat/nodes'
192.168.111.3 48 39 0 0.22 0.07 0.06 mdi * elk-node1 # the * marks the master
192.168.111.6 41 98 0 0.03 0.06 0.05 - - elk-node0 # note: this node does not take part in data handling
192.168.111.4 44 57 0 0.00 0.01 0.05 mdi - elk-node2
192.168.111.5 27 62 0 0.00 0.01 0.05 mdi - elk-node3
- Another way to check, with column headers.
[root@localhost ~]$curl -XGET 'http://127.0.0.1:9200/_cat/nodes?v'
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
192.168.111.3 34 39 0 0.13 0.10 0.07 mdi * elk-node1
192.168.111.6 20 98 0 0.01 0.05 0.05 - - elk-node0
192.168.111.4 32 57 0 0.00 0.01 0.05 mdi - elk-node2
192.168.111.5 38 62 0 0.00 0.01 0.05 mdi - elk-node3
- View detailed cluster information.
[root@localhost ~]$curl -XGET 'http://127.0.0.1:9200/_cluster/state/nodes?pretty'
{
"cluster_name" : "ebei-elk", #集群名称
"compressed_size_in_bytes" : 14291,
"cluster_uuid" : "53LLexx8RSW16nE4lsJMQQ", #集群id
"nodes" : {
"oCNARqdUT5KXLZTlcS22GA" : { #node的ID值
"name" : "elk-node1", #node名称
"ephemeral_id" : "y-NCFJULTEmWdWjNjCPS2A",
"transport_address" : "192.168.111.3:9300", #集群通讯地址
"attributes" : {
"ml.machine_memory" : "8202727424",
"xpack.installed" : "true",
"ml.max_open_jobs" : "20",
"ml.enabled" : "true"
}
},
"8F_rZuR1TByEb6bXz0EgzA" : {
"name" : "elk-node0",
"ephemeral_id" : "b3CtPKpyRUahT4njpRqjlQ",
"transport_address" : "192.168.111.6:9300",
"attributes" : {
"ml.machine_memory" : "8202039296",
"ml.max_open_jobs" : "20",
"xpack.installed" : "true",
"ml.enabled" : "true"
}
},
"ptEOHzaPTgmlqW3NRhd7SQ" : {
"name" : "elk-node2",
"ephemeral_id" : "YgypZZNcTfWcIYDhOlUAzw",
"transport_address" : "192.168.111.4:9300",
"attributes" : {
"ml.machine_memory" : "8202039296",
"ml.max_open_jobs" : "20",
"xpack.installed" : "true",
"ml.enabled" : "true"
}
},
"PJYNTw5nTeS2kjjft4UcrA" : {
"name" : "elk-node3",
"ephemeral_id" : "C2nnREt0TOGKeDw9OQRJUA",
"transport_address" : "192.168.111.5:9300",
"attributes" : {
"ml.machine_memory" : "8202031104",
"ml.max_open_jobs" : "20",
"xpack.installed" : "true",
"ml.enabled" : "true"
}
}
}
}
- Query the master.
[root@localhost ~]$curl -XGET 'http://127.0.0.1:9200/_cluster/state/master_node?pretty'
{
"cluster_name" : "ebei-elk",
"compressed_size_in_bytes" : 14291,
"cluster_uuid" : "53LLexx8RSW16nE4lsJMQQ",
"master_node" : "oCNARqdUT5KXLZTlcS22GA"
}
[root@localhost ~]$curl -XGET 'http://127.0.0.1:9200/_cat/master?v'
id host ip node
oCNARqdUT5KXLZTlcS22GA 192.168.111.3 192.168.111.3 elk-node1
- Query the cluster health (a blocking variant of this check follows the output below).
[root@localhost ~]$curl -XGET 'http://127.0.0.1:9200/_cat/health?v'
epoch timestamp cluster status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1545386516 10:01:56 ebei-elk green 4 3 22 11 0 0 0 0 - 100.0%
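A blocking variant of the same _cluster/health call is handy in scripts; it waits until the cluster reaches green or the timeout expires:

```
# Returns when the cluster is green, or after 30s with "timed_out": true
curl -XGET 'http://127.0.0.1:9200/_cluster/health?wait_for_status=green&timeout=30s&pretty'
```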
# 3. High-availability verification
The checks above show that the master is on node1. Now stop Elasticsearch on node1 and see whether the master role fails over automatically.
[root@localhost ~]$curl -XGET 'http://127.0.0.1:9200/_cat/nodes?v'
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
192.168.111.3 36 39 0 0.03 0.04 0.05 mdi * elk-node1
192.168.111.6 22 98 0 0.00 0.02 0.05 - - elk-node0
192.168.111.4 38 57 0 0.00 0.01 0.05 mdi - elk-node2
192.168.111.5 44 62 0 0.00 0.01 0.05 mdi - elk-node3
[root@localhost ~]$systemctl stop elasticsearch
Then check from one of the other nodes:
[root@localhost ~]$curl -XGET 'http://127.0.0.1:9200/_cat/nodes?v'
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
192.168.111.4 45 58 0 0.00 0.01 0.05 mdi - elk-node2
192.168.111.5 30 62 0 0.00 0.01 0.05 mdi * elk-node3
192.168.111.6 24 98 0 0.00 0.02 0.05 - - elk-node0
As shown, the master role has moved to node3.
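To finish the test, Elasticsearch can be started again on node1; with zen discovery it rejoins as an ordinary data node and elk-node3 keeps the master role. A sketch, run back on 192.168.111.3:

```
systemctl start elasticsearch.service
# Once it has rejoined, elk-node1 shows up again without the * master marker
curl -XGET 'http://127.0.0.1:9200/_cat/nodes?v'
```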
# 4. Integration configuration
The Filebeat configuration that forwards logs to Logstash stays as it was (a sketch of it is included after the Logstash config below); only the output section of the Logstash configuration needs to change, as follows:
input {
beats {
host => "0.0.0.0"
port => 5044
}
}
filter {
if [fields][logtype] == "nginx-access" {
json {
source => "message"
target => "data"
}
}
if [fields][logtype] == "nginx-error" {
json {
source => "message"
target => "data"
}
}
}
output {
if [fields][logtype] == "nginx-access" {
elasticsearch {
hosts => ["192.168.111.3:9200","192.168.111.4:9200","192.168.111.5:9200"]
index => "logstash-nginx-access-%{+YYYY.MM.dd}"
}
}
if [fields][logtype] == "nginx-error" {
elasticsearch {
hosts => ["192.168.111.3:9200","192.168.111.4:9200","192.168.111.5:9200"]
index => "logstash-nginx-error-%{+YYYY.MM.dd}"
}
}
}
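For reference, a minimal sketch of the Filebeat side on 192.168.111.16 that would produce the [fields][logtype] values matched above; the log paths are assumptions, since the original Filebeat configuration is reused unchanged and not repeated here:

```
filebeat.inputs:
  - type: log
    paths:
      - /var/log/nginx/access.log     # assumed nginx access log path
    fields:
      logtype: nginx-access
  - type: log
    paths:
      - /var/log/nginx/error.log      # assumed nginx error log path
    fields:
      logtype: nginx-error
output.logstash:
  hosts: ["192.168.111.6:5044"]       # the Logstash host from the table above
```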
Now look at the Kibana configuration.
[root@localhost logs]$egrep -v "^$|^#" /etc/kibana/kibana.yml
server.port: 5601
server.host: "0.0.0.0"
elasticsearch.url: "http://127.0.0.1:9200"
kibana.index: ".kibana"
xpack.security.enabled: false
Then open Kibana: the logs stream in continuously, and if one of the Elasticsearch data nodes goes down, log display in Kibana is not affected. However, Kibana is still connected to a single Elasticsearch node, so this setup is not yet truly highly available.
# 5. Full high availability
The better approach is to drop the node0 arrangement, put the three Elasticsearch data nodes behind an nginx reverse proxy, and point Kibana at the nginx address; that removes the single point of failure.
Now add the following to the nginx configuration on 192.168.111.16:
upstream elasticsearch {
zone elasticsearch 64K;
server 192.168.111.3:9200;
server 192.168.111.4:9200;
server 192.168.111.5:9200;
}
server {
listen 9200;
server_name 192.168.111.16;
location / {
proxy_pass http://elasticsearch;
proxy_redirect off;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}
access_log logs/es_access.log;
}
Then test and reload the configuration.
nginx -t
nginx -s reload
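With the proxy in place, the cluster should answer on the nginx address, which is exactly what Kibana will be pointed at next:

```
# The request is load-balanced by nginx across the three data nodes
curl -XGET 'http://192.168.111.16:9200/_cluster/health?pretty'
```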
Next, change the connection address in the Kibana configuration:
[root@localhost logs]$egrep -v "^$|^#" /etc/kibana/kibana.yml
server.port: 5601
server.host: "0.0.0.0"
elasticsearch.url: "http://192.168.111.16:9200"
kibana.index: ".kibana"
xpack.security.enabled: false
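After changing the address, restart Kibana so it picks up the new setting (assuming Kibana was installed from the same repository and runs as a systemd service):

```
systemctl restart kibana.service
```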