Environment
Master: 192.168.1.218 redis, keepalived
Slave: 192.168.1.219 redis, keepalived
Virtual IP Address (VIP): 192.168.1.220
Below, "Master" refers to the host 192.168.1.218 and "Slave" to the host 192.168.1.219; lowercase master/slave refers to the role of keepalived or redis.
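Design:
A keepalived custom script monitors the redis service on the local host. When the script detects that redis has failed, it stops the local keepalived, which changes the keepalived master/backup roles. keepalived runs notify scripts whenever its role changes, and those scripts give us the chance to switch the redis master/slave roles accordingly. The goal is to keep the data on the redis master and slave consistent.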
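With keepalived + redis there are four possible situations:
- keepalived is down and redis is down too. The VIP simply moves away and no redis data sync is needed, because a failed redis cannot be synced from anyway. Data that was written to the master but not yet replicated to the slave is lost.
- keepalived is down but redis is still running. After the VIP moves, the redis master/slave relationship is still the old one. If nothing changes, writes land on the redis slave and are never replicated to the master, so the notify scripts have to reverse the redis master/slave relationship. Allow some time for data sync first, then reverse the roles.
- keepalived is running but redis is down. The monitoring script detects that redis is down and stops the local keepalived, so the VIP moves to the other server, which takes over the redis workload.
- keepalived and redis are both running. Nothing needs to be done.
The setup in this article covers all four situations. In the first one no data sync is needed; the script still attempts a sync by default, but it will not succeed. The scripts mainly exist to handle the second and third situations.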
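The four situations above can be summarized as a small lookup. This is only an illustrative sketch: the function name `failover_action` and the `up`/`down` arguments are made up for this example and are not part of the scripts that follow.

```shell
#!/bin/bash
# Hypothetical helper: map the state of keepalived and redis on the current
# VIP holder to the reaction described in the four situations.
# Arguments: $1 = keepalived state, $2 = redis state, each "up" or "down".
failover_action() {
    local keepalived_state=$1 redis_state=$2
    case "$keepalived_state/$redis_state" in
        down/down) echo "VIP moves; no sync possible, unsynced writes are lost" ;;
        down/up)   echo "VIP moves; wait for sync, then swap redis master/slave" ;;
        up/down)   echo "check script stops keepalived; VIP moves to the peer" ;;
        up/up)     echo "nothing to do" ;;
        *)         echo "unknown state" >&2; return 1 ;;
    esac
}

# Situation 3: keepalived is running, redis has died
failover_action up down
```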
Installed components
keepalived
redis
Redis configuration
# Create the redis home directory (bin is needed as the copy target below)
mkdir -p /usr/local/redis/{bin,conf,logs}
# Copy the executables into place; do not forget the trailing \;
find src/ \( -perm -0001 \) -type f -exec cp -a -R -p {} /usr/local/redis/bin \;
# Create the redis start script
vi /usr/local/redis/redis-start.sh
#### The following is the configuration on the master; on the slave only the IP addresses need to change.
#!/bin/bash
RPATH=/usr/local/redis
KPATH=/usr/local/keepalived
REDISCLI=$RPATH/bin/redis-cli
LOGFILE=$KPATH/logs/redis-state.log
LOCALIP=192.168.1.218 # IP of this host (the master here); on the slave use the slave's IP
REMOTEIP=192.168.1.219 # IP of the peer (the slave here); on the slave use the master's IP
$RPATH/bin/redis-server $RPATH/conf/redis.conf
if [ "$?" -eq 0 ]; then
    echo "[INFO]`date +%F/%H:%M:%S` :$LOCALIP redis start successful." >> $LOGFILE
else
    echo "[ERROR]`date +%F/%H:%M:%S` :$LOCALIP redis start error." >> $LOGFILE
fi
# Create the redis stop script
vi /usr/local/redis/redis-stop.sh
#### The following is the configuration on the master; on the slave only the IP addresses need to change.
#!/bin/bash
RPATH=/usr/local/redis
KPATH=/usr/local/keepalived
LOGFILE=$KPATH/logs/redis-state.log
LOCALIP=192.168.1.218 # IP of this host; adjust to the actual setup
REMOTEIP=192.168.1.219 # IP of the peer: the slave's IP on the master, the master's IP on the slave
# Kill every redis-server process found in the process list
kill -9 `ps -ef|grep '/bin/redis-server'|grep -v grep|awk '{print $2}'`
if [ "$?" -eq 0 ]; then
    echo "[INFO]`date +%F/%H:%M:%S` :$LOCALIP redis shutdown completed!" >> $LOGFILE
else
    echo "[INFO]`date +%F/%H:%M:%S` :$LOCALIP redis is not started." >> $LOGFILE
fi
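The stop script above uses kill -9, which gives redis no chance to finish pending writes. As an alternative, a gentler stop could first ask redis to shut down with the redis-cli SHUTDOWN command and only fall back to kill -9 if the process is still there. The following is a sketch of that idea, not the script used in this article; `stop_redis`, `redis_pids`, and the grace period are names and values assumed for illustration.

```shell
#!/bin/bash
# Hypothetical alternative to redis-stop.sh: try a clean shutdown first.
REDISCLI=${REDISCLI:-/usr/local/redis/bin/redis-cli}

# List the PIDs of running redis-server processes (pgrep -f matches the
# full command line); returns non-zero when none is running.
redis_pids() {
    pgrep -f '/bin/redis-server'
}

# $1 = seconds to wait for a clean exit (default 5).
# Returns 0 after a clean stop, 1 if a forced kill was needed.
stop_redis() {
    local grace=${1:-5} i
    "$REDISCLI" shutdown >/dev/null 2>&1
    for ((i = 0; i < grace; i++)); do
        redis_pids >/dev/null || return 0   # no process left: clean stop
        sleep 1
    done
    kill -9 $(redis_pids) 2>/dev/null
    return 1
}
```

Calling `stop_redis 5` would wait up to five seconds before forcing the kill; the return value can drive the same [INFO] log lines as in the script above.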
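# Create the redis configuration file
cp -a -R -p redis.conf /usr/local/redis/conf/redis.conf
# Edit the relevant settings in redis.conf:
vi /usr/local/redis/conf/redis.conf
# Only the changed settings are listed; adjust the rest for your production environment.
# Note that redis.conf does not accept a comment on the same line as a directive.
daemonize yes
pidfile /usr/local/redis/redis.pid
# bind is commented out for now to make testing easier
#bind 192.168.1.218
timeout 30
# verbose shows every detail of the output; notice is enough in production
loglevel verbose
logfile "/usr/local/redis/logs/redis.log"
dir /usr/local/redis/
appendonly yes
# Adjust ownership and permissions of the redis directory
chmod -R 750 /usr/local/redis/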
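After editing, it can help to confirm that the directives really are active and not commented out. The helper below is a small sketch for that; the name `conf_value` is an assumption made for this example.

```shell
#!/bin/bash
# Hypothetical helper: print the value of an active (non-commented)
# directive in a redis.conf-style file. The last occurrence wins.
# $1 = config file, $2 = directive name.
conf_value() {
    local file=$1 key=$2
    awk -v k="$key" '
        $1 == k { $1 = ""; sub(/^ +/, ""); v = $0 }
        END     { if (v != "") print v }
    ' "$file"
}

# Example (assumes the file from this article exists):
# conf_value /usr/local/redis/conf/redis.conf appendonly
```

Once redis is running, `redis-cli config get appendonly` reports the value the server actually loaded, which is the more authoritative check.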
Keepalived configuration
cd /usr/local/keepalived
# Back up keepalived.conf by moving it aside:
mv /usr/local/keepalived/etc/keepalived/keepalived.conf /usr/local/keepalived/etc/keepalived/keepalived.conf-bak
# On the Master (192.168.1.218) create the following configuration file (adjust as needed):
vi /usr/local/keepalived/etc/keepalived/keepalived.conf
! Configuration File for keepalived
vrrp_script chk_redis {
    script "/usr/local/keepalived/etc/keepalived/scripts/redis_check.sh"
    # If the script exits non-zero and weight is negative, the priority is reduced by that amount; if the script exits 0 and weight is positive, the priority is increased by that amount; in all other cases the configured priority is kept.
    #weight -20
    interval 10 # how often the script runs: every 10 seconds
}
vrrp_instance VI_1 {
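    state BACKUP # nopreempt only takes effect when the initial state is BACKUP, so both nodes start as BACKUP and the priority decides which one becomes master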
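    interface eth0 # must match the server's actual network interface, otherwise the configuration fails
    virtual_router_id 51
    priority 100 # must be higher on the master than on the slave
    # Do not preempt. Setting this on the server with the higher priority is enough. When the lower-priority server starts and sees that the higher-priority server is master, it does not preempt anyway.
    nopreempt
    # advert_int is the VRRP advertisement interval in seconds. A backup declares the master down after roughly 3 x advert_int, so a value of 2 means about 2*3=6 seconds before it takes over.
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass redis # VRRP authentication password; must be the same on both nodes
    }
    virtual_ipaddress {
        192.168.1.220 # VIP
    }
    track_script {
        chk_redis
    }
    notify_master /usr/local/keepalived/etc/keepalived/scripts/master.sh
    notify_backup /usr/local/keepalived/etc/keepalived/scripts/backup.sh
    notify_fault /usr/local/keepalived/etc/keepalived/scripts/fault.sh
    notify_stop /usr/local/keepalived/etc/keepalived/scripts/stop.sh
}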
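The weight rule described in the vrrp_script comment is easy to work through with numbers. The sketch below only models that rule in shell arithmetic; `effective_priority` is a name invented for this example, and since weight is commented out in this article's configuration, the numbers are purely illustrative.

```shell
#!/bin/bash
# Hypothetical model of the weight rule:
# $1 = configured priority, $2 = weight, $3 = exit status of the check script.
effective_priority() {
    local base=$1 weight=$2 status=$3
    if [ "$status" -ne 0 ] && [ "$weight" -lt 0 ]; then
        echo $((base + weight))      # failing check, negative weight: lowered
    elif [ "$status" -eq 0 ] && [ "$weight" -gt 0 ]; then
        echo $((base + weight))      # passing check, positive weight: raised
    else
        echo "$base"                 # every other case: unchanged
    fi
}

effective_priority 100 -20 1   # check fails on the master: prints 80
effective_priority 100 -20 0   # check passes: prints 100
```

With weight -20 a failing check would drop the master from 100 to 80, below the slave's 90. This article takes a different route and has the check script stop keepalived outright.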
# On the Slave (192.168.1.219) create the following configuration file (adjust as needed):
vi /usr/local/keepalived/etc/keepalived/keepalived.conf
! Configuration File for keepalived
vrrp_script chk_redis {
    script "/usr/local/keepalived/etc/keepalived/scripts/redis_check.sh"
    # If the script exits non-zero and weight is negative, the priority is reduced by that amount; if the script exits 0 and weight is positive, the priority is increased by that amount; in all other cases the configured priority is kept.
    #weight -20
    interval 10 # how often the script runs: every 10 seconds
}
vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    garp_master_delay 10
    virtual_router_id 51
    priority 90
    #nopreempt
    # advert_int is the VRRP advertisement interval in seconds. A backup declares the master down after roughly 3 x advert_int, so a value of 2 means about 2*3=6 seconds before it takes over.
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass redis
    }
    virtual_ipaddress {
        192.168.1.220
    }
    track_script {
        chk_redis
    }
    # Runs master.sh when keepalived switches to master
    notify_master /usr/local/keepalived/etc/keepalived/scripts/master.sh
    # Runs backup.sh when keepalived switches to backup
    notify_backup /usr/local/keepalived/etc/keepalived/scripts/backup.sh
    # Runs fault.sh when keepalived enters the fault state
    notify_fault /usr/local/keepalived/etc/keepalived/scripts/fault.sh
    # Runs stop.sh when keepalived stops
    notify_stop /usr/local/keepalived/etc/keepalived/scripts/stop.sh
}
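To get a feel for how long the backup waits before taking over, the VRRPv2 timers can be computed by hand. The sketch below assumes keepalived is speaking VRRP version 2 and uses the formula from RFC 3768 (master down interval = 3 x advert_int + skew, with skew = (256 - priority) / 256 seconds); `master_down_interval` is a name chosen for this example.

```shell
#!/bin/bash
# Hypothetical calculator for the VRRPv2 master-down interval (RFC 3768).
# $1 = advert_int in seconds, $2 = priority of the backup node.
master_down_interval() {
    local advert_int=$1 priority=$2
    awk -v a="$advert_int" -v p="$priority" \
        'BEGIN { printf "%.3f\n", 3 * a + (256 - p) / 256 }'
}

master_down_interval 1 90   # values from the slave config: prints 3.648
master_down_interval 2 90   # prints 6.648
```

So with the slave configuration above, the backup should notice a dead master after a little under four seconds, before the notify scripts and their own sleep start running.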
# Send keepalived logs to a dedicated file (I did not get this working)
vi /usr/local/keepalived/etc/sysconfig/keepalived
#KEEPALIVED_OPTIONS="-D"
KEEPALIVED_OPTIONS="-D -d -S 0"
# On Red Hat releases before 6.0 edit /etc/syslog.conf; on 6.0 and later edit /etc/rsyslog.conf. Add the following:
#Save keepalived message to keepalived.log
local0.* /usr/local/keepalived/logs/keepalived.log
# Restart the logging service
service rsyslog restart
# Copy the files into place. Run this from inside the keepalived directory; it copies the configuration files and installs the service start script.
cp -r * /
# On both Master and Slave, create the scripts that monitor redis. The scripts below are the master's versions; on the slave only the IP addresses need to change.
# Create the directories
mkdir /usr/local/keepalived/etc/keepalived/scripts
mkdir /usr/local/keepalived/logs
# Create redis_check.sh ========================================================
vi /usr/local/keepalived/etc/keepalived/scripts/redis_check.sh
Contents:
#!/bin/bash
KPATH=/usr/local/keepalived
RPATH=/usr/local/redis
REDISCLI=$RPATH/bin/redis-cli
LOGFILE=$KPATH/logs/redis-state.log
LOCALIP="192.168.1.218"
REMOTEIP="192.168.1.219"
PORT="6379"
PID=$$
ALIVE=`$REDISCLI PING`
if [ "$ALIVE" == "PONG" ]; then
    echo "[INFO]`date +'%Y-%m-%d:%H:%M:%S'` :$LOCALIP local redis is healthy." >> $LOGFILE
    exit 0
else
    echo "[ERROR]`date +'%Y-%m-%d:%H:%M:%S'` :$LOCALIP local redis is not healthy." >> $LOGFILE
    # When the local redis cannot be reached, wait one second and check again. If it has recovered, log a notice; if it is still unreachable, stop the local keepalived so that the VIP moves to the other server.
    sleep 1
    ALIVE1=`$REDISCLI PING`
    if [ "$ALIVE1" == "PONG" ]; then
        echo "[NOTICE]`date +'%Y-%m-%d:%H:%M:%S'` :$LOCALIP local redis became healthy." >> $LOGFILE
        exit 0
    else
        echo "[ERROR]`date +'%Y-%m-%d:%H:%M:%S'` :$LOCALIP local redis is in error." >> $LOGFILE
        echo "[ERROR]`date +'%Y-%m-%d:%H:%M:%S'` :$LOCALIP shutdown local keepalived." >> $LOGFILE
        /etc/init.d/keepalived stop
        if [ "$?" != "0" ]; then
            echo "[ERROR]`date +'%Y-%m-%d:%H:%M:%S'` :$LOCALIP keepalived shutdown error." >> $LOGFILE
        else
            echo "[INFO]`date +'%Y-%m-%d:%H:%M:%S'` :$LOCALIP keepalived shutdown completed." >> $LOGFILE
        fi
        exit 1
    fi
fi
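The check-then-recheck logic in redis_check.sh can also be written as a small retry loop, which makes the number of attempts easy to tune. This is a sketch under assumptions: `redis_alive` and its parameters are hypothetical and not part of the original script.

```shell
#!/bin/bash
# Hypothetical retry helper for the health check.
REDISCLI=${REDISCLI:-/usr/local/redis/bin/redis-cli}

# Return 0 if redis answers PONG within the given number of attempts.
# $1 = attempts (default 2), $2 = seconds between attempts (default 1).
redis_alive() {
    local attempts=${1:-2} delay=${2:-1} i
    for ((i = 1; i <= attempts; i++)); do
        [ "$("$REDISCLI" PING 2>/dev/null)" = "PONG" ] && return 0
        # Sleep only between attempts, not after the last one
        [ "$i" -lt "$attempts" ] && sleep "$delay"
    done
    return 1
}
```

In the check script this would read `redis_alive 2 1 || /etc/init.d/keepalived stop`, keeping the same two-attempt behaviour as above.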
# Create master.sh ======================================================
vi /usr/local/keepalived/etc/keepalived/scripts/master.sh
Contents:
#!/bin/bash
KPATH=/usr/local/keepalived
RPATH=/usr/local/redis
REDISCLI=$RPATH/bin/redis-cli
LOGFILE=$KPATH/logs/redis-state.log
LOCALIP="192.168.1.218"
REMOTEIP="192.168.1.219"
PORT="6379"
PID=$$
# When keepalived on this server becomes master, i.e. the VIP moves to this host, switch the local redis to role:master
echo "[WARN]-----keepalived change to master,change local redis to master-----" >> $LOGFILE
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$PID|state:[slave]" >> $LOGFILE
# First switch to role:slave
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$PID|state:[slave] Run 'SLAVEOF $REMOTEIP $PORT'" >> $LOGFILE
$REDISCLI SLAVEOF $REMOTEIP $PORT >> $LOGFILE 2>&1
# Sync data
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$PID|state:[slave] wait 10 sec for data sync from old master" >> $LOGFILE
sleep 10
# Wait 10 seconds (tune this to the actual workload) for the sync to finish, then switch to role:master
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$PID|state:[slave] data sync from old master ok..." >> $LOGFILE
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$PID|state:[master] Run slaveof no one,close master/slave" >> $LOGFILE
$REDISCLI SLAVEOF NO ONE >> $LOGFILE 2>&1
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$PID|state:[master] wait other slave connect..." >> $LOGFILE
echo "-----complete!-----" >> $LOGFILE
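The fixed `sleep 10` in master.sh is a guess at how long the sync takes. One possible refinement is to poll `INFO replication` until the replica reports `master_link_status:up` and only then promote it. The function below sketches that idea; `wait_for_sync` and its timeout are assumptions for illustration, and a link that is up only indicates that the initial sync finished and the connection is established.

```shell
#!/bin/bash
# Hypothetical replacement for the fixed sleep in master.sh.
REDISCLI=${REDISCLI:-/usr/local/redis/bin/redis-cli}

# Poll the local redis until it reports master_link_status:up.
# $1 = timeout in seconds (default 10). Returns 0 on success, 1 on timeout.
wait_for_sync() {
    local timeout=${1:-10} waited=0
    while [ "$waited" -lt "$timeout" ]; do
        # INFO output uses CRLF line endings, so strip the carriage returns
        if "$REDISCLI" INFO replication 2>/dev/null | tr -d '\r' \
                | grep -q '^master_link_status:up$'; then
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 1
}
```

In master.sh, `wait_for_sync 10` could stand in for `sleep 10`; when the old master is really dead the call simply times out and the script promotes the local redis as before.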
# Create backup.sh =====================================================
vi /usr/local/keepalived/etc/keepalived/scripts/backup.sh
Contents:
#!/bin/bash
KPATH=/usr/local/keepalived
RPATH=/usr/local/redis
REDISCLI=$RPATH/bin/redis-cli
LOGFILE=$KPATH/logs/redis-state.log
LOCALIP="192.168.1.218"
REMOTEIP="192.168.1.219"
PORT="6379"
PID=$$
# When keepalived on this server becomes backup, i.e. the VIP moves to the other server, switch the local redis to role:slave
echo "[WARN]-----keepalived change to slave,change local redis to slave-----" >> $LOGFILE
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$PID|state:[master]" >> $LOGFILE
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$PID|state:[master] Being slave state..." >> $LOGFILE 2>&1
# Wait 10 seconds during the switch so that the peer can sync data (tune this to the actual workload)
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$PID|state:[master] wait 10 sec for data sync from old master" >> $LOGFILE
sleep 10
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$PID|state:[master] data sync from old master ok..." >> $LOGFILE
# Once the sync has finished, switch to role:slave
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$PID|state:[slave] Run 'SLAVEOF $REMOTEIP $PORT'" >> $LOGFILE
$REDISCLI SLAVEOF $REMOTEIP $PORT >> $LOGFILE 2>&1
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$PID|state:[slave] slave connect to $REMOTEIP $PORT ok..." >> $LOGFILE
echo "-----complete!-----" >> $LOGFILE
# Create stop.sh ==================================================
vi /usr/local/keepalived/etc/keepalived/scripts/stop.sh
Contents:
#!/bin/sh
KPATH=/usr/local/keepalived
RPATH=/usr/local/redis
REDISCLI=$RPATH/bin/redis-cli
LOGFILE=$KPATH/logs/redis-state.log
LOCALIP="192.168.1.218"
REMOTEIP="192.168.1.219"
PORT="6379"
PID=$$
# When keepalived on the master server stops, switch the local redis to role:slave
echo "[ERROR]-----keepalived stop,change local redis to slave-----" >> $LOGFILE
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$PID|state:[master]" >> $LOGFILE
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$PID|state:[master] Being slave state..." >> $LOGFILE 2>&1
# Wait 10 seconds during the switch so that the peer can sync data (tune this to the actual workload)
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$PID|state:[master] wait 10 sec for data sync from old master" >> $LOGFILE
sleep 10
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$PID|state:[master] data sync from old master ok..." >> $LOGFILE
# Once the sync has finished, switch to role:slave
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$PID|state:[slave] Run 'SLAVEOF $REMOTEIP $PORT'" >> $LOGFILE
$REDISCLI SLAVEOF $REMOTEIP $PORT >> $LOGFILE 2>&1
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$PID|state:[slave] slave connect to $REMOTEIP $PORT ok..." >> $LOGFILE
echo "-----complete!-----" >> $LOGFILE
# Create fault.sh ===================================================
vi /usr/local/keepalived/etc/keepalived/scripts/fault.sh
Contents:
#!/bin/bash
KPATH=/usr/local/keepalived
RPATH=/usr/local/redis
REDISCLI=$RPATH/bin/redis-cli
LOGFILE=$KPATH/logs/redis-state.log
LOCALIP="192.168.1.218"
REMOTEIP="192.168.1.219"
PORT="6379"
PID=$$
# When keepalived on this server enters the fault state, switch the local redis to role:slave
echo "[ERROR]-----keepalived is fault,change local redis to slave-----" >> $LOGFILE
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$PID|state:[master]" >> $LOGFILE
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$PID|state:[master] Being slave state..." >> $LOGFILE 2>&1
# Wait 10 seconds during the switch so that the peer can sync data (tune this to the actual workload)
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$PID|state:[master] wait 10 sec for data sync from old master" >> $LOGFILE
sleep 10
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$PID|state:[master] data sync from old master ok..." >> $LOGFILE
# Once the sync has finished, switch to role:slave
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$PID|state:[slave] Run 'SLAVEOF $REMOTEIP $PORT'" >> $LOGFILE
$REDISCLI SLAVEOF $REMOTEIP $PORT >> $LOGFILE 2>&1
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$PID|state:[slave] slave connect to $REMOTEIP $PORT ok..." >> $LOGFILE
echo "-----complete!-----" >> $LOGFILE
Set the permissions on the monitoring scripts:
chmod -R 750 /usr/local/keepalived/etc/keepalived/scripts/
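The four notify scripts repeat the same timestamped echo line many times. If that repetition becomes a maintenance burden, the logging could be pulled into one small function that each script defines or sources. The sketch below shows one way to do it; `log_state` is a hypothetical name, and the output format simply mirrors the lines used in the scripts above.

```shell
#!/bin/bash
# Hypothetical shared logging helper for the notify scripts.
LOGFILE=${LOGFILE:-/usr/local/keepalived/logs/redis-state.log}

# Append one "timestamp|pid|state:[role] message" line to $LOGFILE.
# $1 = role shown in the state field, remaining arguments = message.
log_state() {
    local role=$1
    shift
    echo "$(date +'%Y-%m-%d:%H:%M:%S')|$$|state:[$role] $*" >> "$LOGFILE"
}

# Example: log_state slave "Run 'SLAVEOF 192.168.1.219 6379'"
```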
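System test
Notes:
(1) In keepalived.conf both keepalived nodes are set to state BACKUP, and nopreempt is set on 218 so that it does not take the VIP back after it recovers. Because 218 is planned as the master, start things in this order: start keepalived on 218 first, wait for the data sync to finish, then start keepalived on 219.
(2) Besides the keepalived check script redis_check.sh, there are notify scripts that react to state changes. master.sh is written so that when keepalived switches to master, redis first becomes a slave to sync data and is then promoted back to master. For that reason, make sure the redis data on Master and Slave is identical before starting keepalived. When keepalived is then started first on the host running the redis master, the redis master will briefly connect to the redis slave to sync, but since both sides hold the same data at that point, this causes no problem.
(3) In production, adjust the firewall policy and open the required ports. For this test the firewall is simply turned off: service iptables stop.
The test scenarios and their output follow: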
----- Initial environment -----
Set up the initial environment:
---- Start redis on 218 and 219: /usr/local/redis/redis-start.sh

---- Start keepalived on 218: service keepalived start. Do not start keepalived on 219 yet.
On 218 run tail -f /usr/local/keepalived/logs/keepalived.log. You can see keepalived switch to the master state (the configuration file sets state BACKUP) and bind the VIP.

Check the redis log on the 218 Master; it shows the redis switch-over as follows:

---- Start keepalived on the 219 Slave and check the redis log; redis is now in the slave role:

----- Initial environment -----
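Besides reading the logs, the state reached in the initial environment can be checked directly: which host holds the VIP, and which role each redis reports. The two helpers below are a sketch for that; `redis_role`, `holds_vip`, and the `IPCMD` variable are names assumed for this example.

```shell
#!/bin/bash
# Hypothetical verification helpers for the test scenarios.
REDISCLI=${REDISCLI:-/usr/local/redis/bin/redis-cli}
IPCMD=${IPCMD:-ip}

# Print the replication role ("master" or "slave") of a redis instance.
# $1 = host (default 127.0.0.1), $2 = port (default 6379).
redis_role() {
    "$REDISCLI" -h "${1:-127.0.0.1}" -p "${2:-6379}" INFO replication 2>/dev/null \
        | tr -d '\r' | awk -F: '$1 == "role" { print $2 }'
}

# Return 0 if the given address is bound on the given interface.
# $1 = VIP, $2 = interface (default eth0).
holds_vip() {
    local vip=$1 dev=${2:-eth0}
    "$IPCMD" -o -4 addr show dev "$dev" 2>/dev/null | grep -qF " $vip/"
}

# Example, run on 218 after the initial setup:
# holds_vip 192.168.1.220 && redis_role 192.168.1.218
```

After the initial setup one would expect 218 to hold the VIP and report the master role, and 219 to report the slave role.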
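----- Scenario 3 -----
---- Simulate scenario 3 by killing the redis process on the 218 Master:
keepalived on 218 is then stopped, as shown in the figure below:

keepalived on 219 correctly switches to state MASTER and the VIP moves over, as shown in the figure below:

The redis monitoring log on 218:

The redis monitoring log on 219 shows that 219 has switched to master, so service continues (data that 218 held in memory and had not yet written to disk is lost, of course):

---- Simulate 218 recovering from the failure:
Because keepalived on 218 was stopped when the failure was detected, recovery means starting redis on 218 first and then starting keepalived on 218. The keepalived log on 218 shows that it goes straight into state BACKUP, so the service does not flip back and forth:

The redis log on 218 shows that, once started, redis on 218 becomes a slave of the redis server that is already running.

In summary, the scenario 3 test succeeded.
----- Scenario 3 -----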
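----- Scenario 2 -----
---- Restore the initial environment first, then simulate scenario 2 by stopping keepalived on 218 (service keepalived stop):
Check the redis monitoring log on 218:

Check the keepalived log on 219, which shows that keepalived switched over normally:

Check the redis monitoring log on 219, which shows that redis completed the master/slave switch:

---- Simulate 218 recovering from the keepalived failure (kill any remaining keepalived processes, then start normally) by running service keepalived start:
Check the keepalived log on 219; the keepalived state is backup, so the VIP does not move:

Check the redis monitoring log on 218:

Check the redis runtime log on 218; redis comes back in the slave role, so there is no service switch-over:

In summary, the scenario 2 test succeeded.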
Useful commands
Start redis: /usr/local/redis/redis-start.sh
Stop redis: /usr/local/redis/redis-stop.sh
Start keepalived: service keepalived start
Stop keepalived: service keepalived stop
List processes: ps -ef|grep xxxxx
Reference for this article: http://luyx30.blog.51cto.com/1029851/1350832