Following the previous post on setting up Redis master-slave replication with Docker, this article walks through running Redis Sentinel locally with Docker, covering the directory layout and the docker-compose.yml files involved.
Directory Structure
This article uses the directory layout below: the data directory holds each container's data, the server directory holds the docker-compose.yml plus the configuration files for the master and slave nodes, and the sentinel directory holds the sentinels' configuration files and their own docker-compose.yml.
```
.
├── data
│   ├── redis-master
│   │   └── dump.rdb
│   ├── redis-slave-1
│   │   └── dump.rdb
│   └── redis-slave-2
│       └── dump.rdb
├── sentinel
│   ├── docker-compose.yml
│   └── redis-sentinel.conf
└── server
    ├── docker-compose.yml
    ├── redis-master.conf
    └── redis-slave.conf
```
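As a sketch, the layout can be created with the commands below (the dump.rdb files are generated by Redis itself, so only the directories and empty config files are created here; their contents are filled in throughout this article):

```shell
# Data directories for the master and the two slaves
mkdir -p data/redis-master data/redis-slave-1 data/redis-slave-2

# Sentinel and server directories with empty config files
mkdir -p sentinel server
touch sentinel/docker-compose.yml sentinel/redis-sentinel.conf
touch server/docker-compose.yml server/redis-master.conf server/redis-slave.conf
```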
Configuring Sentinel
Node Topology
The diagrams below use the notation from the official Redis documentation: M denotes a master node, R a replica node, and S a sentinel node.
For the stability of the whole cluster, the master and the replica nodes should not run on the same server, so that the failure of a single VM or physical machine cannot take down the entire cluster.
```
       +----+
       | M1 |
       | S1 |
       +----+
          |
+----+    |    +----+
| R2 |----+----| R3 |
| S2 |         | S3 |
+----+         +----+
```

The quorum is set to 2, meaning a vote from 2 sentinel nodes is enough to elect a new master and complete the switch.
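Note that the quorum only controls when the master is considered objectively down; to actually run the failover, the leading sentinel must still be authorized by a majority of all sentinels, i.e. floor(N/2) + 1. A quick sanity check:

```shell
# With 3 sentinels, 2 votes form a majority
sentinels=3
majority=$(( sentinels / 2 + 1 ))
echo "With $sentinels sentinels, $majority votes authorize a failover."
```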
In this example I use the topology below. Each node runs in its own Docker container, to simulate nodes running on different servers:
```
+----+    +----+    +----+
| M1 |    | R1 |    | R2 |
+----+    +----+    +----+
   |         |         |
   +---------+---------+
   |         |         |
+----+    +----+    +----+
| S1 |    | S2 |    | S3 |
+----+    +----+    +----+
```
Editing the Configuration Files
Edit redis-sentinel-1.conf and change the following settings:
```
bind 127.0.0.1

# The sentinel's port.
# Because the containers below share the host network, sentinels running
# on the same machine must each listen on a distinct port.
port 26379

# Monitoring parameters.
# Format: sentinel monitor <master-name> <ip> <redis-port> <quorum>
# master-name is a name of your choosing for the monitored master.
# ip is the IP address or hostname of the monitored master. Docker containers
# can reach each other by container name, so the master's container name works here.
# redis-port is the port the monitored node listens on.
# quorum is how many sentinels must judge the node as failed before it is
# actually considered failed.
sentinel monitor local-master 127.0.0.1 6379 2

# Password for connecting to the master.
# Format: sentinel auth-pass <master-name> <password>
sentinel auth-pass local-master redis

# How long the master may fail to respond to PING before this sentinel
# subjectively marks it as down. Default: 30 seconds.
# Format: sentinel down-after-milliseconds <master-name> <milliseconds>
sentinel down-after-milliseconds local-master 30000
```
Edit redis-sentinel-2.conf and redis-sentinel-3.conf, changing the listening port to 26380 and 26381 respectively; everything else stays the same.
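Since the three files differ only in the port, the second and third can be derived from the first with sed. A sketch (the heredoc recreates a minimal excerpt of redis-sentinel-1.conf purely for illustration):

```shell
# Minimal excerpt of redis-sentinel-1.conf, recreated here for illustration
cat > redis-sentinel-1.conf <<'EOF'
bind 127.0.0.1
port 26379
sentinel monitor local-master 127.0.0.1 6379 2
EOF

# Derive redis-sentinel-2.conf (port 26380) and redis-sentinel-3.conf (port 26381)
for n in 2 3; do
  port=$(( 26378 + n ))
  sed "s/^port 26379$/port ${port}/" redis-sentinel-1.conf > "redis-sentinel-$n.conf"
done
```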
Configuring and Starting the Containers
Writing the docker-compose.yml
As before, docker-compose is used to manage the containers.
```yaml
---
version: '3'

services:
  redis-sentinel-1:
    image: redis
    container_name: redis-sentinel-1
    restart: always
    network_mode: host
    volumes:
      - ./redis-sentinel-1.conf:/usr/local/etc/redis/redis-sentinel.conf
    environment:
      TZ: "Asia/Shanghai"
    sysctls:
      net.core.somaxconn: '511'
    command: ["redis-sentinel", "/usr/local/etc/redis/redis-sentinel.conf"]

  redis-sentinel-2:
    image: redis
    container_name: redis-sentinel-2
    restart: always
    network_mode: host
    volumes:
      - ./redis-sentinel-2.conf:/usr/local/etc/redis/redis-sentinel.conf
    environment:
      TZ: "Asia/Shanghai"
    sysctls:
      net.core.somaxconn: '511'
    command: ["redis-sentinel", "/usr/local/etc/redis/redis-sentinel.conf"]

  redis-sentinel-3:
    image: redis
    container_name: redis-sentinel-3
    restart: always
    network_mode: host
    volumes:
      - ./redis-sentinel-3.conf:/usr/local/etc/redis/redis-sentinel.conf
    environment:
      TZ: "Asia/Shanghai"
    sysctls:
      net.core.somaxconn: '511'
    command: ["redis-sentinel", "/usr/local/etc/redis/redis-sentinel.conf"]
```
Starting the Containers
Again, start the containers with docker-compose up -d. In the startup logs you can see the sentinels begin monitoring the master node and discover one another.
```
redis-sentinel-2 | 1:X 11 Nov 2019 14:33:06.871 # +monitor master local-master 127.0.0.1 6379 quorum 2
redis-sentinel-2 | 1:X 11 Nov 2019 14:33:08.996 * +sentinel sentinel 3dc4e0bff631b994a492d51e99a7ebc48e35a054 127.0.0.1 26381 @ local-master 127.0.0.1 6379
redis-sentinel-3 | 1:X 11 Nov 2019 14:33:06.990 # +monitor master local-master 127.0.0.1 6379 quorum 2
redis-sentinel-3 | 1:X 11 Nov 2019 14:33:07.001 * +slave slave 127.0.0.1:6381 127.0.0.1 6381 @ local-master 127.0.0.1 6379
redis-sentinel-3 | 1:X 11 Nov 2019 14:33:07.010 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ local-master 127.0.0.1 6379
redis-sentinel-3 | 1:X 11 Nov 2019 14:33:08.876 * +sentinel sentinel 6f646433feb264b582ffa73b5d6bed6626b97966 127.0.0.1 26380 @ local-master 127.0.0.1 6379
redis-sentinel-3 | 1:X 11 Nov 2019 14:33:08.968 * +sentinel sentinel c3b07d8c4ac3686511e436e71043a615e9b1d420 127.0.0.1 26379 @ local-master 127.0.0.1 6379
redis-sentinel-1 | 1:X 11 Nov 2019 14:33:06.948 # +monitor master local-master 127.0.0.1 6379 quorum 2
redis-sentinel-1 | 1:X 11 Nov 2019 14:33:08.997 * +sentinel sentinel 3dc4e0bff631b994a492d51e99a7ebc48e35a054 127.0.0.1 26381 @ local-master 127.0.0.1 6379
```
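Each sentinel announces itself with a 40-character hexadecimal run ID, which is how the +sentinel events above identify their peers. For illustration, the ID can be pulled out of such a log line with grep (the sample line is copied from the log above):

```shell
# A +sentinel discovery event, copied from the startup log
line='redis-sentinel-2 | 1:X 11 Nov 2019 14:33:08.996 * +sentinel sentinel 3dc4e0bff631b994a492d51e99a7ebc48e35a054 127.0.0.1 26381 @ local-master 127.0.0.1 6379'

# A sentinel run ID is a 40-character hex string
id=$(echo "$line" | grep -oE '[0-9a-f]{40}')
echo "$id"   # 3dc4e0bff631b994a492d51e99a7ebc48e35a054
```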
Then connect to a sentinel node with redis-cli; once connected, info sentinel shows the sentinel's state.
```
127.0.0.1:26379> info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=local-master,status=ok,address=127.0.0.1:6379,slaves=2,sentinels=3
```
Here sentinel_masters:1 shows that this sentinel is monitoring one master. The last line shows that master0 is aliased local-master, its status is ok, its address is 127.0.0.1:6379, it has 2 slaves, and 3 sentinels are monitoring it.
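In scripts, the master's current address can be extracted from that master0 line; a small sketch using the sample line from the output above:

```shell
# The master0 line from "info sentinel", copied as sample input
line='master0:name=local-master,status=ok,address=127.0.0.1:6379,slaves=2,sentinels=3'

# Split the comma-separated fields and pick out the address
addr=$(echo "$line" | tr ',' '\n' | sed -n 's/^address=//p')
echo "$addr"   # 127.0.0.1:6379
```

For interactive use, the dedicated command `redis-cli -p 26379 sentinel get-master-addr-by-name local-master` returns the same information without any parsing.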
Testing It Out
Merely having the sentinels running is not enough; we should also verify that they react when a monitored node goes down.
First I stop one of the slaves, redis-server-slave-2. After 30 seconds, all three sentinels subjectively consider redis-server-slave-2 down.
```
redis-sentinel-2 | 1:X 11 Nov 2019 14:37:42.232 # +sdown slave 127.0.0.1:6381 127.0.0.1 6381 @ local-master 127.0.0.1 6379
redis-sentinel-3 | 1:X 11 Nov 2019 14:37:42.290 # +sdown slave 127.0.0.1:6381 127.0.0.1 6381 @ local-master 127.0.0.1 6379
redis-sentinel-1 | 1:X 11 Nov 2019 14:37:42.291 # +sdown slave 127.0.0.1:6381 127.0.0.1 6381 @ local-master 127.0.0.1 6379
```
After restarting redis-server-slave-2, all three sentinels announce that they no longer subjectively consider the node down.
```
redis-sentinel-1 | 1:X 11 Nov 2019 14:40:19.160 * +reboot slave 127.0.0.1:6381 127.0.0.1 6381 @ local-master 127.0.0.1 6379
redis-sentinel-1 | 1:X 11 Nov 2019 14:40:19.243 # -sdown slave 127.0.0.1:6381 127.0.0.1 6381 @ local-master 127.0.0.1 6379
redis-sentinel-2 | 1:X 11 Nov 2019 14:40:19.403 * +reboot slave 127.0.0.1:6381 127.0.0.1 6381 @ local-master 127.0.0.1 6379
redis-sentinel-3 | 1:X 11 Nov 2019 14:40:19.161 * +reboot slave 127.0.0.1:6381 127.0.0.1 6381 @ local-master 127.0.0.1 6379
redis-sentinel-3 | 1:X 11 Nov 2019 14:40:19.242 # -sdown slave 127.0.0.1:6381 127.0.0.1 6381 @ local-master 127.0.0.1 6379
redis-sentinel-2 | 1:X 11 Nov 2019 14:40:19.502 # -sdown slave 127.0.0.1:6381 127.0.0.1 6381 @ local-master 127.0.0.1 6379
```
This time I stop the master node. After 30 seconds, the sentinels print a long burst of log output. Don't worry; let's read through it step by step:
```
redis-sentinel-1 | 1:X 11 Nov 2019 14:44:11.639 # +sdown master local-master 127.0.0.1 6379
redis-sentinel-2 | 1:X 11 Nov 2019 14:44:11.695 # +sdown master local-master 127.0.0.1 6379
redis-sentinel-2 | 1:X 11 Nov 2019 14:44:11.752 # +new-epoch 1
redis-sentinel-2 | 1:X 11 Nov 2019 14:44:11.755 # +vote-for-leader 3dc4e0bff631b994a492d51e99a7ebc48e35a054 1
redis-sentinel-2 | 1:X 11 Nov 2019 14:44:11.758 # +odown master local-master 127.0.0.1 6379 #quorum 3/2
redis-sentinel-2 | 1:X 11 Nov 2019 14:44:11.759 # Next failover delay: I will not start a failover before Mon Nov 11 14:50:11 2019
redis-sentinel-3 | 1:X 11 Nov 2019 14:44:11.661 # +sdown master local-master 127.0.0.1 6379
redis-sentinel-3 | 1:X 11 Nov 2019 14:44:11.746 # +odown master local-master 127.0.0.1 6379 #quorum 2/2
redis-sentinel-3 | 1:X 11 Nov 2019 14:44:11.746 # +new-epoch 1
redis-sentinel-3 | 1:X 11 Nov 2019 14:44:11.747 # +try-failover master local-master 127.0.0.1 6379
redis-sentinel-3 | 1:X 11 Nov 2019 14:44:11.749 # +vote-for-leader 3dc4e0bff631b994a492d51e99a7ebc48e35a054 1
redis-sentinel-3 | 1:X 11 Nov 2019 14:44:11.755 # c3b07d8c4ac3686511e436e71043a615e9b1d420 voted for 3dc4e0bff631b994a492d51e99a7ebc48e35a054 1
redis-sentinel-3 | 1:X 11 Nov 2019 14:44:11.756 # 6f646433feb264b582ffa73b5d6bed6626b97966 voted for 3dc4e0bff631b994a492d51e99a7ebc48e35a054 1
redis-sentinel-1 | 1:X 11 Nov 2019 14:44:11.753 # +new-epoch 1
redis-sentinel-1 | 1:X 11 Nov 2019 14:44:11.754 # +vote-for-leader 3dc4e0bff631b994a492d51e99a7ebc48e35a054 1
redis-sentinel-3 | 1:X 11 Nov 2019 14:44:11.826 # +elected-leader master local-master 127.0.0.1 6379
redis-sentinel-3 | 1:X 11 Nov 2019 14:44:11.832 # +failover-state-select-slave master local-master 127.0.0.1 6379
redis-sentinel-3 | 1:X 11 Nov 2019 14:44:11.894 # +selected-slave slave 127.0.0.1:6380 127.0.0.1 6380 @ local-master 127.0.0.1 6379
redis-sentinel-3 | 1:X 11 Nov 2019 14:44:11.895 * +failover-state-send-slaveof-noone slave 127.0.0.1:6380 127.0.0.1 6380 @ local-master 127.0.0.1 6379
redis-sentinel-3 | 1:X 11 Nov 2019 14:44:11.971 * +failover-state-wait-promotion slave 127.0.0.1:6380 127.0.0.1 6380 @ local-master 127.0.0.1 6379
redis-sentinel-1 | 1:X 11 Nov 2019 14:44:12.436 # +config-update-from sentinel 3dc4e0bff631b994a492d51e99a7ebc48e35a054 127.0.0.1 26381 @ local-master 127.0.0.1 6379
redis-sentinel-1 | 1:X 11 Nov 2019 14:44:12.436 # +switch-master local-master 127.0.0.1 6379 127.0.0.1 6380
redis-sentinel-1 | 1:X 11 Nov 2019 14:44:12.437 * +slave slave 127.0.0.1:6381 127.0.0.1 6381 @ local-master 127.0.0.1 6380
redis-sentinel-1 | 1:X 11 Nov 2019 14:44:12.439 * +slave slave 127.0.0.1:6379 127.0.0.1 6379 @ local-master 127.0.0.1 6380
redis-sentinel-2 | 1:X 11 Nov 2019 14:44:12.434 # +config-update-from sentinel 3dc4e0bff631b994a492d51e99a7ebc48e35a054 127.0.0.1 26381 @ local-master 127.0.0.1 6379
redis-sentinel-2 | 1:X 11 Nov 2019 14:44:12.435 # +switch-master local-master 127.0.0.1 6379 127.0.0.1 6380
redis-sentinel-2 | 1:X 11 Nov 2019 14:44:12.435 * +slave slave 127.0.0.1:6381 127.0.0.1 6381 @ local-master 127.0.0.1 6380
redis-sentinel-2 | 1:X 11 Nov 2019 14:44:12.437 * +slave slave 127.0.0.1:6379 127.0.0.1 6379 @ local-master 127.0.0.1 6380
redis-sentinel-3 | 1:X 11 Nov 2019 14:44:12.372 # +promoted-slave slave 127.0.0.1:6380 127.0.0.1 6380 @ local-master 127.0.0.1 6379
redis-sentinel-3 | 1:X 11 Nov 2019 14:44:12.373 # +failover-state-reconf-slaves master local-master 127.0.0.1 6379
redis-sentinel-3 | 1:X 11 Nov 2019 14:44:12.433 * +slave-reconf-sent slave 127.0.0.1:6381 127.0.0.1 6381 @ local-master 127.0.0.1 6379
redis-sentinel-3 | 1:X 11 Nov 2019 14:44:12.753 * +slave-reconf-inprog slave 127.0.0.1:6381 127.0.0.1 6381 @ local-master 127.0.0.1 6379
redis-sentinel-3 | 1:X 11 Nov 2019 14:44:12.920 # -odown master local-master 127.0.0.1 6379
redis-sentinel-3 | 1:X 11 Nov 2019 14:44:13.825 * +slave-reconf-done slave 127.0.0.1:6381 127.0.0.1 6381 @ local-master 127.0.0.1 6379
redis-sentinel-3 | 1:X 11 Nov 2019 14:44:13.883 # +failover-end master local-master 127.0.0.1 6379
redis-sentinel-3 | 1:X 11 Nov 2019 14:44:13.883 # +switch-master local-master 127.0.0.1 6379 127.0.0.1 6380
redis-sentinel-3 | 1:X 11 Nov 2019 14:44:13.884 * +slave slave 127.0.0.1:6381 127.0.0.1 6381 @ local-master 127.0.0.1 6380
redis-sentinel-3 | 1:X 11 Nov 2019 14:44:13.885 * +slave slave 127.0.0.1:6379 127.0.0.1 6379 @ local-master 127.0.0.1 6380
redis-sentinel-2 | 1:X 11 Nov 2019 14:44:42.446 # +sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ local-master 127.0.0.1 6380
redis-sentinel-1 | 1:X 11 Nov 2019 14:44:42.465 # +sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ local-master 127.0.0.1 6380
redis-sentinel-3 | 1:X 11 Nov 2019 14:44:43.887 # +sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ local-master 127.0.0.1 6380
```
First, all three sentinels declare the master subjectively down (+sdown).
Because the configuration file specifies that the election starts once at least 2 sentinels consider the master failed (that is the quorum), sentinel 3 proposes electing a new master (+try-failover).
Next, the sentinels vote to elect a new master from among the slave nodes. Once agreement is reached, the elected slave becomes the new master, and its configuration file is rewritten so the change survives restarts.
The sentinels then tell the other nodes in the cluster to follow the new master, including the old master that went down.
That completes one failover.
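The +switch-master event carries the old and new addresses (<master-name> <old-ip> <old-port> <new-ip> <new-port>), so the new master can be read straight from the log; a sketch using one of the lines above:

```shell
# A +switch-master event, copied from the failover log
line='redis-sentinel-1 | 1:X 11 Nov 2019 14:44:12.436 # +switch-master local-master 127.0.0.1 6379 127.0.0.1 6380'

# The last two whitespace-separated fields are the new master's IP and port
new_master=$(echo "$line" | awk '{print $(NF-1) ":" $NF}')
echo "$new_master"   # 127.0.0.1:6380
```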
At this point, if the old master node is restarted, the sentinels detect that it is back online and stop considering it subjectively down. However, the node is now a slave.
```
redis-sentinel-1 | 1:X 11 Nov 2019 14:56:32.936 # -sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ local-master 127.0.0.1 6380
redis-sentinel-2 | 1:X 11 Nov 2019 14:56:33.202 # -sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ local-master 127.0.0.1 6380
redis-sentinel-3 | 1:X 11 Nov 2019 14:56:33.707 # -sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ local-master 127.0.0.1 6380
```
References
- Sentinel, Docker, NAT, and possible issues - Redis Sentinel Documentation