Environment:
host1 (docker1): 172.16.199.17/24
host2 (docker2): 172.16.199.27/24
host3 (docker3): 172.16.199.37/24
Goal:
Set up an etcd cluster to provide the base service for the flannel network and the Kubernetes (k8s) cluster that will be built on top of it.
Configuration steps:
Installing and configuring host 1:
Download etcd (v3.2.1 at the time of writing):
    wget https://github.com/coreos/etcd/releases/download/v3.2.1/etcd-v3.2.1-linux-amd64.tar.gz
After the download finishes, unpack the archive and copy the etcd and etcdctl binaries into a directory that is on PATH, for example /usr/bin.
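The unpack-and-copy step can be scripted roughly as follows. This is only a sketch: the function name `install_etcd_binaries` and its arguments are made up for illustration, and it assumes the tarball unpacks into a single directory named after the archive (the etcd-v3.2.1-linux-amd64 directory seen in the shell prompts).

```shell
# Hypothetical helper: unpack an etcd release tarball and copy the two
# binaries into a directory on PATH.
# Usage: install_etcd_binaries <tarball> <dest-dir>
install_etcd_binaries() {
    tarball=$1
    dest=$2
    workdir=$(mktemp -d) || return 1
    # Unpack into a scratch directory so the current directory stays clean.
    tar -xzf "$tarball" -C "$workdir" || return 1
    # Assumption: the archive contains one top-level directory named after
    # the tarball, e.g. etcd-v3.2.1-linux-amd64/.
    srcdir="$workdir/$(basename "$tarball" .tar.gz)"
    # install(1) copies the files and sets the executable bit.
    install -m 0755 "$srcdir/etcd" "$srcdir/etcdctl" "$dest"/ || return 1
    rm -rf "$workdir"
}

# Example (run as root):
#   install_etcd_binaries etcd-v3.2.1-linux-amd64.tar.gz /usr/bin
```

Using a scratch directory keeps the copy step independent of wherever the tarball was downloaded.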
Change to the systemd unit directory and create an etcd service so that systemd can manage and start etcd from now on:
    [root@docker1 etcd-v3.2.1-linux-amd64]# cd /usr/lib/systemd/system/
Create a file named etcd.service with the following content:
    [Unit]
    Description=Etcd Server
    After=network.target
    After=network-online.target
    Wants=network-online.target

    [Service]
    Type=notify
    WorkingDirectory=/var/lib/etcd/
    EnvironmentFile=-/etc/etcd/etcd.conf
    User=root
    # set GOMAXPROCS to number of processors
    ExecStart=/bin/bash -c "GOMAXPROCS=$(nproc) /usr/bin/etcd --name=\"${ETCD_NAME}\" --data-dir=\"${ETCD_DATA_DIR}\" --listen-client-urls=\"${ETCD_LISTEN_CLIENT_URLS}\" --listen-peer-urls=\"${ETCD_LISTEN_PEER_URLS}\" --advertise-client-urls=\"${ETCD_ADVERTISE_CLIENT_URLS}\" --initial-cluster-token=\"${ETCD_INITIAL_CLUSTER_TOKEN}\" --initial-cluster=\"${ETCD_INITIAL_CLUSTER}\" --initial-cluster-state=\"${ETCD_INITIAL_CLUSTER_STATE}\""
    Restart=on-failure
    LimitNOFILE=65536

    [Install]
    WantedBy=multi-user.target
Create the /var/lib/etcd directory.
Create the /etc/etcd directory and the file /etc/etcd/etcd.conf with the following content:
    ETCD_NAME="etcd01"
    ETCD_DATA_DIR="/var/lib/etcd/etcd01"
    ETCD_LISTEN_PEER_URLS="http://172.16.199.17:2380"
    ETCD_LISTEN_CLIENT_URLS="http://172.16.199.17:2379,http://127.0.0.1:2379"
    ETCD_INITIAL_ADVERTISE_PEER_URLS="http://172.16.199.17:2380"
    ETCD_ADVERTISE_CLIENT_URLS="http://172.16.199.17:2379"
    ETCD_INITIAL_CLUSTER_STATE="new"
    ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster1"
    ETCD_INITIAL_CLUSTER="etcd01=http://172.16.199.17:2380,etcd02=http://172.16.199.27:2380"
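Installing and configuring host 2:
Follow the same procedure as for host 1. In etcd.conf, change ETCD_NAME to etcd02, ETCD_DATA_DIR to /var/lib/etcd/etcd02, and the IP address in the listen and advertise URLs to 172.16.199.27. Keep ETCD_INITIAL_CLUSTER_TOKEN and ETCD_INITIAL_CLUSTER unchanged.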
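Because the per-host files differ only in the member name and IP address, they can be generated from one template. The sketch below assumes a helper named `render_etcd_conf` (hypothetical, not part of etcd) and hard-codes the token and the "new" state used in this walkthrough.

```shell
# Hypothetical helper: print an etcd.conf for one member.
# Usage: render_etcd_conf <name> <ip> <initial-cluster>
render_etcd_conf() {
    name=$1
    ip=$2
    cluster=$3
    # The here-document body must start at column 0 because plain <<EOF
    # does not strip leading whitespace.
    cat <<EOF
ETCD_NAME="$name"
ETCD_DATA_DIR="/var/lib/etcd/$name"
ETCD_LISTEN_PEER_URLS="http://$ip:2380"
ETCD_LISTEN_CLIENT_URLS="http://$ip:2379,http://127.0.0.1:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://$ip:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://$ip:2379"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster1"
ETCD_INITIAL_CLUSTER="$cluster"
EOF
}

# Example for host 2 (run as root):
#   render_etcd_conf etcd02 172.16.199.27 \
#     "etcd01=http://172.16.199.17:2380,etcd02=http://172.16.199.27:2380" \
#     > /etc/etcd/etcd.conf
```

Generating the files this way makes it harder for the token or the member list to drift between hosts.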
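Starting the etcd service:
Run the following on both hosts:
    systemctl daemon-reload
    systemctl start etcd.service
Check that neither /var/log/messages nor the systemctl commands themselves report any errors, then enable the service at boot:
    systemctl enable etcd.service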
Verifying the cluster:
    [root@docker1 system]# etcdctl cluster-health
    member 63e5296d334e3fa is healthy: got healthy result from http://172.16.199.17:2379
    member 9558b2437a9b849a is healthy: got healthy result from http://172.16.199.27:2379
    [root@docker1 system]# etcdctl member list
    63e5296d334e3fa: name=etcd01 peerURLs=http://172.16.199.17:2380 clientURLs=http://172.16.199.17:2379 isLeader=true
    9558b2437a9b849a: name=etcd02 peerURLs=http://172.16.199.27:2380 clientURLs=http://172.16.199.27:2379 isLeader=false
    [root@docker2 etcd-v3.2.1-linux-amd64]# etcdctl cluster-health
    member 63e5296d334e3fa is healthy: got healthy result from http://172.16.199.17:2379
    member 9558b2437a9b849a is healthy: got healthy result from http://172.16.199.27:2379
    cluster is healthy
    [root@docker2 etcd-v3.2.1-linux-amd64]# etcdctl member list
    63e5296d334e3fa: name=etcd01 peerURLs=http://172.16.199.17:2380 clientURLs=http://172.16.199.17:2379 isLeader=true
    9558b2437a9b849a: name=etcd02 peerURLs=http://172.16.199.27:2380 clientURLs=http://172.16.199.27:2379 isLeader=false
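The same check can be run without etcdctl by probing each member's /health endpoint over HTTP, which is the URL etcdctl cluster-health itself requests. A minimal sketch, assuming curl is installed; the function name `check_members` is made up, and the exact JSON body may differ between etcd versions, so the pattern below is deliberately loose.

```shell
# Hypothetical helper: probe the /health endpoint of each client URL.
# Returns non-zero if any member fails to report healthy.
# Usage: check_members <client-url>...
check_members() {
    rc=0
    for url in "$@"; do
        # -s: silent, -f: treat HTTP errors as failures,
        # --max-time: do not hang on a filtered port.
        body=$(curl -sf --max-time 5 "$url/health") || body=""
        # Assumption: a healthy member answers with JSON in which
        # "health" is true (quoted or unquoted depending on the version).
        if printf '%s' "$body" | grep -Eq '"health": *"?true"?'; then
            echo "$url healthy"
        else
            echo "$url unreachable or unhealthy"
            rc=1
        fi
    done
    return $rc
}

# Example:
#   check_members http://172.16.199.17:2379 http://172.16.199.27:2379
```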
Testing etcd:
    [root@docker1 system]# etcdctl mkdir /testdir
    [root@docker1 system]# etcdctl ls
    /testdir
    [root@docker1 system]# etcdctl set /testdir/key1 value1
    value1
    [root@docker1 system]# etcdctl get /testdir/key1
    value1
    [root@docker1 system]# etcdctl -o extended get /testdir/key1
    Key: /testdir/key1
    Created-Index: 7
    Modified-Index: 7
    TTL: 0
    Index: 7

    value1
    [root@docker1 system]# etcdctl ls --recursive
    /testdir
    /testdir/key1
    [root@docker2 member]# etcdctl ls
    /testdir
    [root@docker2 member]# etcdctl ls --recursive
    /testdir
    /testdir/key1
    [root@docker2 member]# etcdctl get /testdir/key1
    value1
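To confirm that a key has reached every member without logging in to each host, the read can be repeated against each client URL in turn. This is a sketch: `check_key` is a hypothetical helper, and it relies on the `--endpoints` and `--no-sync` global options of the v2 etcdctl shipped with this release so that each request goes to the member named on the command line.

```shell
# Hypothetical helper: read one key from each listed member and compare
# the result with the expected value.
# Usage: check_key <key> <expected-value> <client-url>...
check_key() {
    key=$1
    expected=$2
    shift 2
    rc=0
    for url in "$@"; do
        # --no-sync stops etcdctl from refreshing the member list first,
        # so the request goes to the endpoint given by --endpoints.
        actual=$(etcdctl --no-sync --endpoints "$url" get "$key") || actual=""
        if [ "$actual" = "$expected" ]; then
            echo "$url ok"
        else
            echo "$url mismatch"
            rc=1
        fi
    done
    return $rc
}

# Example:
#   check_key /testdir/key1 value1 \
#     http://172.16.199.17:2379 http://172.16.199.27:2379
```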
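Cluster redundancy test:
Stop the etcd service on host1. host2 cannot get itself elected leader, because in a two-member cluster the majority (quorum) is 2 and it can only collect its own vote. An etcd cluster needs at least three members to survive the loss of one: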
    [root@docker2 member]# tail -f /var/log/messages
    Jun 27 21:44:45 docker2 etcd: 9558b2437a9b849a [logterm: 61, index: 17] sent MsgVote request to 63e5296d334e3fa at term 430
    Jun 27 21:44:46 docker2 etcd: 9558b2437a9b849a is starting a new election at term 430
    Jun 27 21:44:46 docker2 etcd: 9558b2437a9b849a became candidate at term 431
    Jun 27 21:44:46 docker2 etcd: 9558b2437a9b849a received MsgVoteResp from 9558b2437a9b849a at term 431
    Jun 27 21:44:46 docker2 etcd: 9558b2437a9b849a [logterm: 61, index: 17] sent MsgVote request to 63e5296d334e3fa at term 431
    Jun 27 21:44:47 docker2 etcd: health check for peer 63e5296d334e3fa could not connect: dial tcp 172.16.199.17:2380: getsockopt: connection refused
    Jun 27 21:44:48 docker2 etcd: 9558b2437a9b849a is starting a new election at term 431
    Jun 27 21:44:48 docker2 etcd: 9558b2437a9b849a became candidate at term 432
    Jun 27 21:44:48 docker2 etcd: 9558b2437a9b849a received MsgVoteResp from 9558b2437a9b849a at term 432
    Jun 27 21:44:48 docker2 etcd: 9558b2437a9b849a [logterm: 61, index: 17] sent MsgVote request to 63e5296d334e3fa at term 432
    Jun 27 21:44:49 docker2 etcd: 9558b2437a9b849a is starting a new election at term 432
    Jun 27 21:44:49 docker2 etcd: 9558b2437a9b849a became candidate at term 433
    Jun 27 21:44:49 docker2 etcd: 9558b2437a9b849a received MsgVoteResp from 9558b2437a9b849a at term 433
    Jun 27 21:44:49 docker2 etcd: 9558b2437a9b849a [logterm: 61, index: 17] sent MsgVote request to 63e5296d334e3fa at term 433
    Jun 27 21:44:51 docker2 etcd: 9558b2437a9b849a is starting a new election at term 433
    Jun 27 21:44:51 docker2 etcd: 9558b2437a9b849a became candidate at term 434
    Jun 27 21:44:51 docker2 etcd: 9558b2437a9b849a received MsgVoteResp from 9558b2437a9b849a at term 434
    Jun 27 21:44:51 docker2 etcd: 9558b2437a9b849a [logterm: 61, index: 17] sent MsgVote request to 63e5296d334e3fa at term 434
    Jun 27 21:44:52 docker2 etcd: health check for peer 63e5296d334e3fa could not connect: dial tcp 172.16.199.17:2380: getsockopt: connection refused
    Jun 27 21:44:52 docker2 etcd: 9558b2437a9b849a is starting a new election at term 434
    Jun 27 21:44:52 docker2 etcd: 9558b2437a9b849a became candidate at term 435
    Jun 27 21:44:52 docker2 etcd: 9558b2437a9b849a received MsgVoteResp from 9558b2437a9b849a at term 435
    Jun 27 21:44:52 docker2 etcd: 9558b2437a9b849a [logterm: 61, index: 17] sent MsgVote request to 63e5296d334e3fa at term 435
    Jun 27 21:44:54 docker2 etcd: 9558b2437a9b849a is starting a new election at term 435
    Jun 27 21:44:54 docker2 etcd: 9558b2437a9b849a became candidate at term 436
    Jun 27 21:44:54 docker2 etcd: 9558b2437a9b849a received MsgVoteResp from 9558b2437a9b849a at term 436
    Jun 27 21:44:54 docker2 etcd: 9558b2437a9b849a [logterm: 61, index: 17] sent MsgVote request to 63e5296d334e3fa at term 436
    Jun 27 21:44:55 docker2 etcd: 9558b2437a9b849a is starting a new election at term 436
    Jun 27 21:44:55 docker2 etcd: 9558b2437a9b849a became candidate at term 437
    Jun 27 21:44:55 docker2 etcd: 9558b2437a9b849a received MsgVoteResp from 9558b2437a9b849a at term 437
    Jun 27 21:44:55 docker2 etcd: 9558b2437a9b849a [logterm: 61, index: 17] sent MsgVote request to 63e5296d334e3fa at term 437
    Jun 27 21:44:56 docker2 etcd: 9558b2437a9b849a is starting a new election at term 437
    Jun 27 21:44:56 docker2 etcd: 9558b2437a9b849a became candidate at term 438
    Jun 27 21:44:56 docker2 etcd: 9558b2437a9b849a received MsgVoteResp from 9558b2437a9b849a at term 438
    Jun 27 21:44:56 docker2 etcd: 9558b2437a9b849a [logterm: 61, index: 17] sent MsgVote request to 63e5296d334e3fa at term 438
    Jun 27 21:44:57 docker2 etcd: health check for peer 63e5296d334e3fa could not connect: dial tcp 172.16.199.17:2380: getsockopt: connection refused
    Jun 27 21:44:57 docker2 etcd: 9558b2437a9b849a is starting a new election at term 438
    Jun 27 21:44:57 docker2 etcd: 9558b2437a9b849a became candidate at term 439
    Jun 27 21:44:57 docker2 etcd: 9558b2437a9b849a received MsgVoteResp from 9558b2437a9b849a at term 439
    Jun 27 21:44:57 docker2 etcd: 9558b2437a9b849a [logterm: 61, index: 17] sent MsgVote request to 63e5296d334e3fa at term 439
    Jun 27 21:44:58 docker2 etcd: 9558b2437a9b849a is starting a new election at term 439
    Jun 27 21:44:58 docker2 etcd: 9558b2437a9b849a became candidate at term 440
    Jun 27 21:44:58 docker2 etcd: 9558b2437a9b849a received MsgVoteResp from 9558b2437a9b849a at term 440
    Jun 27 21:44:58 docker2 etcd: 9558b2437a9b849a [logterm: 61, index: 17] sent MsgVote request to 63e5296d334e3fa at term 440
    Jun 27 21:44:59 docker2 etcd: 9558b2437a9b849a is starting a new election at term 440
    Jun 27 21:44:59 docker2 etcd: 9558b2437a9b849a became candidate at term 441
    Jun 27 21:44:59 docker2 etcd: 9558b2437a9b849a received MsgVoteResp from 9558b2437a9b849a at term 441
    Jun 27 21:44:59 docker2 etcd: 9558b2437a9b849a [logterm: 61, index: 17] sent MsgVote request to 63e5296d334e3fa at term 441
    Jun 27 21:45:01 docker2 etcd: 9558b2437a9b849a is starting a new election at term 441
    Jun 27 21:45:01 docker2 etcd: 9558b2437a9b849a became candidate at term 442
    Jun 27 21:45:01 docker2 etcd: 9558b2437a9b849a received MsgVoteResp from 9558b2437a9b849a at term 442
    Jun 27 21:45:01 docker2 etcd: 9558b2437a9b849a [logterm: 61, index: 17] sent MsgVote request to 63e5296d334e3fa at term 442
    Jun 27 21:45:02 docker2 etcd: health check for peer 63e5296d334e3fa could not connect: dial tcp 172.16.199.17:2380: getsockopt: connection refused
Restore the etcd service on host1:
    Jun 27 21:45:02 docker2 etcd: peer 63e5296d334e3fa became active
    Jun 27 21:45:02 docker2 etcd: established a TCP streaming connection with peer 63e5296d334e3fa (stream MsgApp v2 writer)
    Jun 27 21:45:02 docker2 etcd: established a TCP streaming connection with peer 63e5296d334e3fa (stream Message writer)
    Jun 27 21:45:02 docker2 etcd: established a TCP streaming connection with peer 63e5296d334e3fa (stream MsgApp v2 reader)
    Jun 27 21:45:02 docker2 etcd: established a TCP streaming connection with peer 63e5296d334e3fa (stream Message reader)
    Jun 27 21:45:02 docker2 etcd: 9558b2437a9b849a is starting a new election at term 442
    Jun 27 21:45:02 docker2 etcd: 9558b2437a9b849a became candidate at term 443
    Jun 27 21:45:02 docker2 etcd: 9558b2437a9b849a received MsgVoteResp from 9558b2437a9b849a at term 443
    Jun 27 21:45:02 docker2 etcd: 9558b2437a9b849a [logterm: 61, index: 17] sent MsgVote request to 63e5296d334e3fa at term 443
    Jun 27 21:45:02 docker2 etcd: 9558b2437a9b849a received MsgVoteResp from 63e5296d334e3fa at term 443
    Jun 27 21:45:02 docker2 etcd: 9558b2437a9b849a [quorum:2] has received 2 MsgVoteResp votes and 0 vote rejections
    Jun 27 21:45:02 docker2 etcd: 9558b2437a9b849a became leader at term 443
    Jun 27 21:45:02 docker2 etcd: raft.node: 9558b2437a9b849a elected leader 9558b2437a9b849a at term 443
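Logs like the two above are long and repetitive, so it can help to reduce them to a count of election attempts and the term at which a leader finally appeared. The awk filter below is a rough sketch that matches only the message wording seen in these excerpts ("became candidate at term", "became leader at term"); the function name `summarize_elections` is made up, and other etcd versions may phrase these messages differently.

```shell
# Hypothetical helper: summarise raft election activity in an etcd log.
# Usage: summarize_elections < /var/log/messages
summarize_elections() {
    awk '
        # Each candidacy is one failed or successful election attempt.
        /became candidate at term/ { attempts++ }
        # The term number is the last field of the "became leader" line.
        /became leader at term/    { leader = $NF }
        END {
            printf "election attempts: %d\n", attempts
            if (leader != "") printf "leader elected at term %s\n", leader
            else              print  "no leader elected"
        }
    '
}
```

Fed the first excerpt (host1 stopped) it reports attempts but no leader; fed the second it reports the leader's term.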
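Adding a third node:
On the two existing nodes, edit the cluster member setting in etcd.conf so that it also lists etcd03:
    ETCD_INITIAL_CLUSTER="etcd01=http://172.16.199.17:2380,etcd02=http://172.16.199.27:2380,etcd03=http://172.16.199.37:2380"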
Install the third node with the following configuration file:
    [root@docker3 system]# cat /etc/etcd/etcd.conf
    ETCD_NAME="etcd03"
    ETCD_DATA_DIR="/var/lib/etcd/etcd03"
    ETCD_LISTEN_PEER_URLS="http://172.16.199.37:2380"
    ETCD_LISTEN_CLIENT_URLS="http://172.16.199.37:2379,http://127.0.0.1:2379"
    ETCD_INITIAL_ADVERTISE_PEER_URLS="http://172.16.199.37:2380"
    ETCD_ADVERTISE_CLIENT_URLS="http://172.16.199.37:2379"
    ETCD_INITIAL_CLUSTER_STATE="new"
    ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster1"
    ETCD_INITIAL_CLUSTER="etcd01=http://172.16.199.17:2380,etcd02=http://172.16.199.27:2380,etcd03=http://172.16.199.37:2380"
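Note: because the cluster had already been started and tested with two members, the cluster ID it generated is tied to that two-member configuration. If you only edit the configuration as above and start the third node, it cannot join the cluster and logs the following errors: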
    Jun 27 22:17:06 docker3 etcd: request sent was ignored (cluster ID mismatch: peer[63e5296d334e3fa]=93ce6eea0aad94f1, local=de36ca03b996fb38)
    Jun 27 22:17:06 docker3 etcd: request sent was ignored (cluster ID mismatch: peer[63e5296d334e3fa]=93ce6eea0aad94f1, local=de36ca03b996fb38)
    Jun 27 22:17:06 docker3 etcd: request sent was ignored (cluster ID mismatch: peer[9558b2437a9b849a]=93ce6eea0aad94f1, local=de36ca03b996fb38)
At this point you have to delete the member folder under /var/lib/etcd/**/ on host1 and host2 and then restart all etcd services. The end result is:
    [root@docker1 etcd01]# etcdctl member list
    63e5296d334e3fa: name=etcd01 peerURLs=http://172.16.199.17:2380 clientURLs=http://172.16.199.17:2379 isLeader=true
    58f1fbf6c5601b97: name=etcd03 peerURLs=http://172.16.199.37:2380 clientURLs=http://172.16.199.37:2379 isLeader=false
    9558b2437a9b849a: name=etcd02 peerURLs=http://172.16.199.27:2380 clientURLs=http://172.16.199.27:2379 isLeader=false
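The proper way to add the third node, instead of deleting the member directories, is to register it from an existing member before starting it, and then to start the new node with ETCD_INITIAL_CLUSTER_STATE="existing" rather than "new":
    etcdctl member add new-member-name http://new-member-ip:2380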
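Put together, the runtime-add procedure has three steps. The sketch below only prints them as a dry run and executes nothing; `print_join_plan` is a hypothetical helper, and the configuration path and service name are the ones used earlier in this walkthrough.

```shell
# Hypothetical helper: print the steps for adding one member at runtime.
# Usage: print_join_plan <new-name> <new-ip> <existing-initial-cluster>
print_join_plan() {
    name=$1
    ip=$2
    existing=$3
    echo "# 1. on any healthy existing member:"
    echo "etcdctl member add $name http://$ip:2380"
    echo "# 2. in /etc/etcd/etcd.conf on the new node:"
    # The new node must list every member, itself included.
    echo "ETCD_INITIAL_CLUSTER=\"$existing,$name=http://$ip:2380\""
    # "existing" tells etcd to join a running cluster, not bootstrap one.
    echo "ETCD_INITIAL_CLUSTER_STATE=\"existing\""
    echo "# 3. on the new node:"
    echo "systemctl start etcd.service"
}

# Example:
#   print_join_plan etcd03 172.16.199.37 \
#     "etcd01=http://172.16.199.17:2380,etcd02=http://172.16.199.27:2380"
```

With this approach the existing members keep their data and cluster ID, so no member directory has to be removed.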
Three-node redundancy test:
Stop the etcd service on host1. Checking from another host shows that host2 has become the leader:
    [root@docker3 etcd03]# etcdctl member list
    63e5296d334e3fa: name=etcd01 peerURLs=http://172.16.199.17:2380 clientURLs=http://172.16.199.17:2379 isLeader=false
    58f1fbf6c5601b97: name=etcd03 peerURLs=http://172.16.199.37:2380 clientURLs=http://172.16.199.37:2379 isLeader=false
    9558b2437a9b849a: name=etcd02 peerURLs=http://172.16.199.27:2380 clientURLs=http://172.16.199.27:2379 isLeader=true
    [root@docker3 etcd03]# etcdctl cluster-health
    failed to check the health of member 63e5296d334e3fa on http://172.16.199.17:2379: Get http://172.16.199.17:2379/health: dial tcp 172.16.199.17:2379: getsockopt: connection refused
    member 63e5296d334e3fa is unreachable: [http://172.16.199.17:2379] are all unreachable
    member 58f1fbf6c5601b97 is healthy: got healthy result from http://172.16.199.37:2379
    member 9558b2437a9b849a is healthy: got healthy result from http://172.16.199.27:2379
    cluster is healthy
The key created before etcd was stopped on host1 can still be read:
    [root@docker3 etcd03]# etcdctl get /testdir/key1
    value1
Create a new key (run on a non-leader member):
    [root@docker3 etcd03]# etcdctl set /testdir/key2 value2
    value2
    [root@docker3 etcd03]# etcdctl get /testdir/key2
    value2
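Restart etcd on host1. key2, which was created while host1 was down, can now be read on host1 as well.
Note: every host should either stop the firewalld service or allow the etcd ports (2379 for clients, 2380 for peers). Otherwise cluster-health reports that it cannot obtain the health status of some member, as in: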
    [root@docker1 etcd01]# etcdctl cluster-health
    member 63e5296d334e3fa is healthy: got healthy result from http://172.16.199.17:2379
    failed to check the health of member 58f1fbf6c5601b97 on http://172.16.199.37:2379: Get http://172.16.199.37:2379/health: dial tcp 172.16.199.37:2379: i/o timeout
    member 58f1fbf6c5601b97 is unreachable: [http://172.16.199.37:2379] are all unreachable
    member 9558b2437a9b849a is healthy: got healthy result from http://172.16.199.27:2379
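If firewalld has to stay enabled, opening the two etcd ports is an alternative to switching it off. A sketch, assuming firewalld's default zone is the one in use; `open_etcd_ports` and the `FIREWALL_CMD` override variable are made up for illustration (the override only exists so the function can be dry-run with `echo`).

```shell
# Hypothetical helper: allow etcd client and peer traffic through firewalld.
# Usage (as root): open_etcd_ports
open_etcd_ports() {
    # FIREWALL_CMD is an illustrative override; by default the real
    # firewall-cmd binary is used.
    fw=${FIREWALL_CMD:-firewall-cmd}
    # 2379 carries client traffic, 2380 carries peer traffic.
    for port in 2379 2380; do
        "$fw" --permanent --add-port="$port/tcp" || return 1
    done
    # Load the permanent rules into the running firewall.
    "$fw" --reload
}

# Dry run that only prints the arguments:
#   FIREWALL_CMD=echo open_etcd_ports
```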
Appendix:
Active Peers | Majority | Failure Tolerance |
---|---|---|
1 peer | 1 peer | None |
3 peers | 2 peers | 1 peer |
4 peers | 3 peers | 1 peer |
5 peers | 3 peers | 2 peers |
6 peers | 4 peers | 2 peers |
7 peers | 4 peers | 3 peers |
8 peers | 5 peers | 3 peers |
9 peers | 5 peers | 4 peers |
http://www.infoq.com/cn/articles/etcd-interpretation-application-scenario-implement-principle/
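The table follows from simple integer arithmetic: the majority of n members is floor(n/2) + 1, and the failure tolerance is whatever is left over. The short sketch below recomputes it, including the two-member case tested earlier, which is absent from the table; the function names are illustrative.

```shell
# Majority (quorum) needed in a cluster of $1 members: floor(n/2) + 1.
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# Members that can fail while a majority is still reachable.
failure_tolerance() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 2 3 4 5; do
    echo "n=$n majority=$(quorum "$n") tolerance=$(failure_tolerance "$n")"
done
```

For n=2 this gives a majority of 2 and a tolerance of 0, which is why the two-node cluster could not elect a leader once host1 was stopped. An even member count never tolerates more failures than the odd count just below it.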