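1. Redis Sentinel Deployment
Redis Sentinel provides monitoring, notification, and automatic failover for Redis instances, and serves as the management tool for a Redis master/slave deployment. It plays the role that the central node plays in a typical distributed database with a coordinator: it monitors the other nodes and performs failure recovery, which improves the availability of the deployment.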
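1.1. Deployment Environment
This article deploys the Sentinel environment on a single machine using different ports. There are three Redis servers (one master and two slaves) and three Sentinel processes, as listed in the table below:
IP | Port | Role |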
---|---|---|
127.0.0.1 | 6379 | Redis master |
127.0.0.1 | 6380 | Redis slave |
127.0.0.1 | 6381 | Redis slave |
127.0.0.1 | 26379 | Redis sentinel |
127.0.0.1 | 26380 | Redis sentinel |
127.0.0.1 | 26381 | Redis sentinel |
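Automatic failover is only carried out once a majority of the Sentinels agree, so it is recommended to run at least three Sentinels and to use an odd number of them.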
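The recommendation follows from simple majority arithmetic. The sketch below is illustrative only (the function names `majority` and `tolerable_failures` are made up here, not part of Redis); it shows that a fourth Sentinel does not let the deployment survive any more Sentinel failures than three do, while a fifth does.

```python
def majority(sentinels: int) -> int:
    """Smallest number of Sentinels that forms a strict majority."""
    return sentinels // 2 + 1


def tolerable_failures(sentinels: int) -> int:
    """How many Sentinels can be down while a majority is still reachable."""
    return sentinels - majority(sentinels)


if __name__ == "__main__":
    # Print: total Sentinels, majority size, failures that can be tolerated.
    for n in range(1, 6):
        print(n, majority(n), tolerable_failures(n))
```

With three Sentinels the majority is two, which matches the quorum value of 2 used in the sentinel.conf of this article.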
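1.2. Environment Preparation
Before deploying the Redis Sentinel environment, make sure Redis is installed. This article uses redis-3.2.9.
For how to deploy one master with multiple slaves, see "Redis master-slave configuration".
After the master/slave configuration files are in place, the directory layout looks like this: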
$ cd /usr/local
$ ls -al redis*
lrwxr-xr-x 1 root wheel 29 5 19 13:19 redis -> /usr/local/source/redis-3.2.9
lrwxr-xr-x 1 root wheel 23 10 11 11:33 redis-6380 -> source/redis-3.2.9-6380
lrwxr-xr-x 1 root wheel 23 10 11 11:33 redis-6381 -> source/redis-3.2.9-6381
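1.3. Configuring sentinel.conf
1.3.1. Copy the redis-sentinel executable to the bin directory
Enter the redis, redis-6380, and redis-6381 directories in turn, and copy the Sentinel executable redis-sentinel from the src directory into the bin directory: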
$ sudo cp src/redis-sentinel bin/
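1.3.2. Copy the sentinel.conf configuration file to the etc directory
Enter the redis, redis-6380, and redis-6381 directories in turn, and copy the sentinel.conf configuration file from the root of each directory into its etc directory: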
$ sudo cp sentinel.conf etc/
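1.3.3. Edit the sentinel.conf file
Enter the redis, redis-6380, and redis-6381 directories in turn and edit sentinel.conf. Taking the redis directory as the example, the changes are as follows (the commented-out port and logfile lines are the values to use in the other two directories):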
# Disable protected mode so that several Sentinels can run on different localhost ports
protected-mode no
# Sentinel port
port 26379
# port 26380
# port 26381
# Run as a daemon
daemonize yes
# Log file path
logfile "/usr/local/redis/sentinel.log"
# logfile "/usr/local/redis-6380/sentinel.log"
# logfile "/usr/local/redis-6381/sentinel.log"
# Monitor the master named mymaster (the trailing 2 is the quorum)
sentinel monitor mymaster 127.0.0.1 6379 2
For details on the configuration parameters, refer to the Redis Sentinel documentation.
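Since the three files differ only in the port and the log path, they can also be generated rather than edited by hand. The following is a minimal sketch, not part of the original procedure: the helper name `render_sentinel_conf` and the `INSTANCES` mapping are assumptions for illustration, and only the directives shown in the configuration above are emitted.

```python
def render_sentinel_conf(port: int, install_dir: str,
                         master_host: str = "127.0.0.1",
                         master_port: int = 6379,
                         quorum: int = 2) -> str:
    """Return the sentinel.conf settings used in this article for one Sentinel."""
    lines = [
        "protected-mode no",
        f"port {port}",
        "daemonize yes",
        f'logfile "{install_dir}/sentinel.log"',
        f"sentinel monitor mymaster {master_host} {master_port} {quorum}",
    ]
    return "\n".join(lines) + "\n"


# Sentinel port -> installation directory, taken from the layout in this article.
INSTANCES = {
    26379: "/usr/local/redis",
    26380: "/usr/local/redis-6380",
    26381: "/usr/local/redis-6381",
}

if __name__ == "__main__":
    for port, install_dir in INSTANCES.items():
        print(f"# {install_dir}/etc/sentinel.conf")
        print(render_sentinel_conf(port, install_dir))
```

The script only prints the text; writing it into each etc directory is left to the reader, because those directories are owned by root in this setup.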
1.4. Starting Redis and Sentinel
Once Redis and Sentinel are configured, start the roles in order: Master -> Slave -> Sentinel
➜ local sudo redis/bin/redis-server redis/etc/redis.conf
➜ local sudo redis-6380/bin/redis-server redis-6380/etc/redis.conf
➜ local sudo redis-6381/bin/redis-server redis-6381/etc/redis.conf
➜ local sudo redis/bin/redis-sentinel redis/etc/sentinel.conf
➜ local sudo redis-6380/bin/redis-sentinel redis-6380/etc/sentinel.conf
➜ local sudo redis-6381/bin/redis-sentinel redis-6381/etc/sentinel.conf
➜ local ps -ef | grep redis
0 14392 1 0 5:46下午 ?? 0:00.10 redis/bin/redis-server 127.0.0.1:6379
0 14399 1 0 5:47下午 ?? 0:00.06 redis-6380/bin/redis-server 127.0.0.1:6380
0 14407 1 0 5:47下午 ?? 0:00.05 redis-6381/bin/redis-server 127.0.0.1:6381
0 14416 1 0 5:47下午 ?? 0:00.07 redis/bin/redis-sentinel *:26379 [sentinel]
0 14423 1 0 5:48下午 ?? 0:00.04 redis-6380/bin/redis-sentinel *:26380 [sentinel]
0 14430 1 0 5:48下午 ?? 0:00.01 redis-6381/bin/redis-sentinel *:26381 [sentinel]
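The start order above can be captured in a small helper so that it is not typed by hand each time. This is only a sketch under the directory layout of this article; `REDIS_DIRS` and `startup_commands` are hypothetical names, and the script prints the commands instead of running them.

```python
# Installation directories relative to /usr/local; the first entry is the master.
REDIS_DIRS = ["redis", "redis-6380", "redis-6381"]


def startup_commands(dirs):
    """Return the start commands in Master -> Slave -> Sentinel order."""
    servers = [[f"{d}/bin/redis-server", f"{d}/etc/redis.conf"] for d in dirs]
    sentinels = [[f"{d}/bin/redis-sentinel", f"{d}/etc/sentinel.conf"] for d in dirs]
    # All Redis servers come first (master, then slaves), then every Sentinel.
    return servers + sentinels


if __name__ == "__main__":
    for cmd in startup_commands(REDIS_DIRS):
        print("sudo " + " ".join(cmd))
```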
1.5. Inspecting Sentinel with redis-cli
You can connect to a Sentinel with redis-cli and use the INFO command to view its state:
$ bin/redis-cli -p 26379
127.0.0.1:26379> INFO sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=127.0.0.1:6379,slaves=2,sentinels=3
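For scripted health checks, the `master0:` line is the interesting part of this output. The sketch below (the function name `parse_master_line` is an assumption for illustration) splits that line into its fields so that the master address and the slave and Sentinel counts can be compared against the expected values.

```python
def parse_master_line(line: str) -> dict:
    """Parse a 'masterN:name=...,status=...,address=...' line from INFO sentinel."""
    key, _, value = line.strip().partition(":")
    fields = dict(item.split("=", 1) for item in value.split(","))
    host, _, port = fields["address"].rpartition(":")
    return {
        "index": key,
        "name": fields["name"],
        "status": fields["status"],
        "host": host,
        "port": int(port),
        "slaves": int(fields["slaves"]),
        "sentinels": int(fields["sentinels"]),
    }


if __name__ == "__main__":
    sample = ("master0:name=mymaster,status=ok,address=127.0.0.1:6379,"
              "slaves=2,sentinels=3")
    print(parse_master_line(sample))
```

In this deployment a healthy result has `status` equal to `ok`, two slaves, and three Sentinels.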
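1.6. Viewing the Sentinel Log
The log path is configured as the sentinel.log file in each Redis root directory. Run tail -f sentinel.log to follow changes to the file as they happen.
To view the Sentinel log: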
$ tail -n 1000 sentinel.log
15425:X 12 Oct 20:02:38.907 # Sentinel ID is ee7f7c4459580d25386f5f7ec4bb5353bb6ff12c
15425:X 12 Oct 20:02:38.907 # +monitor master mymaster 127.0.0.1 6379 quorum 2
15425:X 12 Oct 20:02:38.908 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
15425:X 12 Oct 20:02:38.908 * +slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379
15425:X 12 Oct 20:02:41.960 * +sentinel sentinel a43dc8f2aa2c616b2078c35df9e4de7188ea1341 127.0.0.1 26380 @ mymaster 127.0.0.1 6379
15425:X 12 Oct 20:02:42.998 * +sentinel sentinel 7e5d7cf0de4e6aa1e9ecd54f264183b5c21e0085 127.0.0.1 26381 @ mymaster 127.0.0.1 6379
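Each log line has the same layout: process ID, a role character (`X` marks a Sentinel), a timestamp, a level character (`*` for notice, `#` for warning), and then the event with its details. A minimal parsing sketch follows; the names `LOG_LINE`, `LogEntry`, and `parse_log_line` are made up for this example. Note that the first word after the level is treated as the event, which fits the `+slave` and `+sentinel` style entries but not free-text lines such as "Sentinel ID is ...".

```python
import re
from typing import NamedTuple, Optional

# pid:role  day month time  level  event  details
LOG_LINE = re.compile(
    r"^(?P<pid>\d+):(?P<role>\w) "
    r"(?P<timestamp>\d+ \w+ \d+:\d+:\d+\.\d+) "
    r"(?P<level>\S) (?P<event>\S+)\s*(?P<details>.*)$"
)


class LogEntry(NamedTuple):
    pid: int
    role: str
    timestamp: str
    level: str
    event: str
    details: str


def parse_log_line(line: str) -> Optional[LogEntry]:
    """Split one Sentinel log line into fields, or return None if it does not match."""
    m = LOG_LINE.match(line.strip())
    if m is None:
        return None
    return LogEntry(int(m["pid"]), m["role"], m["timestamp"],
                    m["level"], m["event"], m["details"])


if __name__ == "__main__":
    print(parse_log_line(
        "15425:X 12 Oct 20:02:38.907 # +monitor master mymaster 127.0.0.1 6379 quorum 2"
    ))
```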
1.7. Viewing sentinel.conf
Looking at the sentinel.conf configuration file again shows that Sentinel has rewritten it and added some entries:
sentinel myid ee7f7c4459580d25386f5f7ec4bb5353bb6ff12c
# Generated by CONFIG REWRITE
sentinel known-slave mymaster 127.0.0.1 6381
sentinel known-slave mymaster 127.0.0.1 6380
sentinel known-sentinel mymaster 127.0.0.1 26380 a43dc8f2aa2c616b2078c35df9e4de7188ea1341
sentinel known-sentinel mymaster 127.0.0.1 26381 7e5d7cf0de4e6aa1e9ecd54f264183b5c21e0085
sentinel current-epoch 0
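In the log entries shown in section 1.6:
- +slave: a new slave has been detected by Sentinel and attached.
- +sentinel: a new Sentinel monitoring the given master has been detected and added.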
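The `known-slave` and `known-sentinel` lines that Sentinel wrote into sentinel.conf can be read back to confirm that discovery worked. The following is a rough sketch based only on the Redis 3.2.9 file format shown above; `parse_known_peers` is a hypothetical helper name.

```python
def parse_known_peers(conf_text: str) -> dict:
    """Collect the slaves and Sentinels recorded in a rewritten sentinel.conf."""
    peers = {"slaves": [], "sentinels": []}
    for raw in conf_text.splitlines():
        parts = raw.split()
        # sentinel known-slave <master-name> <ip> <port>
        if len(parts) >= 5 and parts[:2] == ["sentinel", "known-slave"]:
            peers["slaves"].append((parts[3], int(parts[4])))
        # sentinel known-sentinel <master-name> <ip> <port> <sentinel-id>
        elif len(parts) >= 6 and parts[:2] == ["sentinel", "known-sentinel"]:
            peers["sentinels"].append((parts[3], int(parts[4]), parts[5]))
    return peers


if __name__ == "__main__":
    sample = """\
sentinel known-slave mymaster 127.0.0.1 6381
sentinel known-slave mymaster 127.0.0.1 6380
sentinel known-sentinel mymaster 127.0.0.1 26380 a43dc8f2aa2c616b2078c35df9e4de7188ea1341
"""
    print(parse_known_peers(sample))
```

For the Sentinel on port 26379 in this article, the expected result is two slaves (6380 and 6381) and two other Sentinels (26380 and 26381).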
1.8. Simulating a Master Failure
Shut down the master directly from the terminal:
$ bin/redis-cli shutdown
After a short while, check how the sentinel.log files have changed:
redis-6380 Sentinel log:
15429:X 12 Oct 20:12:22.579 # +sdown master mymaster 127.0.0.1 6379
15429:X 12 Oct 20:12:22.664 # +odown master mymaster 127.0.0.1 6379 #quorum 2/2
15429:X 12 Oct 20:12:22.664 # +new-epoch 1
15429:X 12 Oct 20:12:22.665 # +try-failover master mymaster 127.0.0.1 6379
15429:X 12 Oct 20:12:22.666 # +vote-for-leader a43dc8f2aa2c616b2078c35df9e4de7188ea1341 1
15429:X 12 Oct 20:12:22.668 # 7e5d7cf0de4e6aa1e9ecd54f264183b5c21e0085 voted for a43dc8f2aa2c616b2078c35df9e4de7188ea1341 1
15429:X 12 Oct 20:12:22.668 # ee7f7c4459580d25386f5f7ec4bb5353bb6ff12c voted for a43dc8f2aa2c616b2078c35df9e4de7188ea1341 1
15429:X 12 Oct 20:12:22.725 # +elected-leader master mymaster 127.0.0.1 6379
15429:X 12 Oct 20:12:22.725 # +failover-state-select-slave master mymaster 127.0.0.1 6379
15429:X 12 Oct 20:12:22.784 # +selected-slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
15429:X 12 Oct 20:12:22.784 * +failover-state-send-slaveof-noone slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
15429:X 12 Oct 20:12:22.862 * +failover-state-wait-promotion slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
15429:X 12 Oct 20:12:23.041 # +promoted-slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
15429:X 12 Oct 20:12:23.041 # +failover-state-reconf-slaves master mymaster 127.0.0.1 6379
15429:X 12 Oct 20:12:23.139 * +slave-reconf-sent slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379
15429:X 12 Oct 20:12:23.804 # -odown master mymaster 127.0.0.1 6379
15429:X 12 Oct 20:12:24.079 * +slave-reconf-inprog slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379
15429:X 12 Oct 20:12:24.079 * +slave-reconf-done slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379
15429:X 12 Oct 20:12:24.130 # +failover-end master mymaster 127.0.0.1 6379
15429:X 12 Oct 20:12:24.130 # +switch-master mymaster 127.0.0.1 6379 127.0.0.1 6380
15429:X 12 Oct 20:12:24.131 * +slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6380
15429:X 12 Oct 20:12:24.132 * +slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6380
15429:X 12 Oct 20:12:54.193 # +sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6380
redis-6381 Sentinel log:
15433:X 12 Oct 20:12:22.552 # +sdown master mymaster 127.0.0.1 6379
15433:X 12 Oct 20:12:22.667 # +new-epoch 1
15433:X 12 Oct 20:12:22.668 # +vote-for-leader a43dc8f2aa2c616b2078c35df9e4de7188ea1341 1
15433:X 12 Oct 20:12:23.139 # +config-update-from sentinel a43dc8f2aa2c616b2078c35df9e4de7188ea1341 127.0.0.1 26380 @ mymaster 127.0.0.1 6379
15433:X 12 Oct 20:12:23.140 # +switch-master mymaster 127.0.0.1 6379 127.0.0.1 6380
15433:X 12 Oct 20:12:23.140 * +slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6380
15433:X 12 Oct 20:12:23.140 * +slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6380
15433:X 12 Oct 20:12:53.165 # +sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6380
redis (6379) Sentinel log:
15425:X 12 Oct 20:12:22.667 # +new-epoch 1
15425:X 12 Oct 20:12:22.668 # +vote-for-leader a43dc8f2aa2c616b2078c35df9e4de7188ea1341 1
15425:X 12 Oct 20:12:23.139 # +config-update-from sentinel a43dc8f2aa2c616b2078c35df9e4de7188ea1341 127.0.0.1 26380 @ mymaster 127.0.0.1 6379
15425:X 12 Oct 20:12:23.140 # +switch-master mymaster 127.0.0.1 6379 127.0.0.1 6380
15425:X 12 Oct 20:12:23.140 * +slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6380
15425:X 12 Oct 20:12:23.140 * +slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6380
15425:X 12 Oct 20:12:53.153 # +sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6380
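The events in these logs mean the following:
- +sdown: the Sentinel subjectively considers the instance down.
- +odown: the Sentinel objectively considers the instance down, meaning enough Sentinels to reach the quorum agree.
- +try-failover: the Sentinel starts the failover.
- +failover-end: the Sentinel has completed the failover, which includes fairly involved steps such as electing the leader Sentinel and selecting the slave to promote.
- +switch-master: the master has moved from 127.0.0.1:6379 to 127.0.0.1:6380.
- +slave: lists the two slaves of the new master. The Sentinel does not discard the instance information for 127.0.0.1:6379, because a stopped instance may come back later, and the Sentinel will then let it rejoin.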
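To see how long a failover took and where the master ended up, the relevant events can be pulled out of a Sentinel log. The sketch below is an illustration, not a general log analyzer: `summarize_failover` is a hypothetical helper, it assumes the field layout of the log lines above, and because the log carries no year it uses 2000 as a placeholder.

```python
from datetime import datetime

MONTHS = {name: i for i, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}


def _timestamp(parts):
    """Build a datetime from the day, month and time fields of a log line."""
    clock = datetime.strptime(parts[3], "%H:%M:%S.%f")
    # The log carries no year, so 2000 is used purely as a placeholder.
    return clock.replace(year=2000, month=MONTHS[parts[2]], day=int(parts[1]))


def summarize_failover(lines):
    """Return old/new master and seconds from +sdown master to +failover-end."""
    start = end = old = new = None
    for line in lines:
        parts = line.split()
        if len(parts) < 6:
            continue
        event = parts[5]
        if event == "+sdown" and parts[6:7] == ["master"] and start is None:
            start = _timestamp(parts)
        elif event == "+failover-end":
            end = _timestamp(parts)
        elif event == "+switch-master" and len(parts) >= 11:
            # +switch-master <name> <old-ip> <old-port> <new-ip> <new-port>
            old = (parts[7], int(parts[8]))
            new = (parts[9], int(parts[10]))
    seconds = (end - start).total_seconds() if start and end else None
    return {"old_master": old, "new_master": new, "seconds": seconds}
```

Only the log of the Sentinel that led the failover (redis-6380 here) contains `+failover-end`; for the other two logs `seconds` stays `None`, while the old and new master are still reported.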
Check the Sentinel information at this point:
bin/redis-cli -p 26379 info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=127.0.0.1:6380,slaves=2,sentinels=3
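A client that must follow the master after a failover does not need to parse INFO: Sentinel answers the command `SENTINEL get-master-addr-by-name <master-name>` with the current address. The sketch below only shows what goes over the wire in the Redis protocol (RESP) for that command and how a two-element reply can be decoded; `encode_command` and `decode_addr_reply` are hypothetical helper names, the socket handling is omitted, and the decoder handles just this one reply shape.

```python
def encode_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    out = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = arg.encode()
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)


def decode_addr_reply(reply: bytes):
    """Decode a reply of the form *2 / <ip> / <port> into (host, port)."""
    parts = reply.split(b"\r\n")
    if parts[0] != b"*2":
        # For example a null reply when the master name is unknown.
        raise ValueError(f"unexpected reply: {reply!r}")
    return parts[2].decode(), int(parts[4])


if __name__ == "__main__":
    print(encode_command("SENTINEL", "get-master-addr-by-name", "mymaster"))
    print(decode_addr_reply(b"*2\r\n$9\r\n127.0.0.1\r\n$4\r\n6380\r\n"))
```

From the shell, the same query is `bin/redis-cli -p 26379 sentinel get-master-addr-by-name mymaster`.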
1.9. Simulating Recovery
Restart the old master (port 6379) from the redis directory:
$ sudo bin/redis-server etc/redis.conf
Check the Sentinel logs:
# redis-6381 Sentinel log
15433:X 12 Oct 20:15:59.978 # -sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6380
# redis-6380 Sentinel log
15429:X 12 Oct 20:15:59.287 # -sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6380
15429:X 12 Oct 20:16:09.251 * +convert-to-slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6380
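In these entries `-sdown` clears the subjectively-down flag that `+sdown` set earlier, and `+convert-to-slave` shows the restarted instance being attached to the new master as a slave. A small sketch of replaying a log to find which instances are still flagged follows; `subjectively_down` is a hypothetical helper and relies on the field positions seen in the log lines of this article.

```python
def subjectively_down(lines):
    """Replay +sdown / -sdown events and return the addresses still marked down."""
    down = set()
    for line in lines:
        parts = line.split()
        # Fields: pid:role, day, month, time, level, event, type, name, ip, port, ...
        if len(parts) < 10:
            continue
        event = parts[5]
        address = (parts[8], int(parts[9]))
        if event == "+sdown":
            down.add(address)
        elif event == "-sdown":
            down.discard(address)
    return down


if __name__ == "__main__":
    log = [
        "15429:X 12 Oct 20:12:54.193 # +sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6380",
        "15429:X 12 Oct 20:15:59.287 # -sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6380",
    ]
    print(subjectively_down(log[:1]))
    print(subjectively_down(log))
```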
Check the Sentinel status information: