
Achieving high-availability with Keepalived

Failover/floating/virtual IP

A virtual IP is an IP address that is not bound to a single physical interface. Failover IP and floating IP are terms used by some VPS providers for a virtual IP that provides high availability: the virtual IP points to a master server and switches to a backup server if the master fails.

VRRP and ARP

Virtual Router Redundancy Protocol (VRRP) lets two or more machines share a virtual IP so that a service stays available when one of them fails. One machine is the master and the other is the backup. The backup listens for VRRP advertisements, which the master sends as multicast packets by default. If the backup receives no advertisement for some time, it promotes itself to master and takes over the virtual IP using ARP.
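
How long "some time" is follows from the VRRP timers. As a rough sketch, assuming VRRP version 2 as specified in RFC 3768, the backup waits for the Master_Down_Interval: three advertisement intervals plus a skew derived from its own priority. The function name and the example values (a 2-second interval and a backup priority of 90) are illustrative assumptions, not taken from a real deployment.

```bash
#!/bin/bash
# master_down_interval ADVERT_INT PRIORITY
# Prints how many seconds a backup waits without advertisements before
# it takes over, following the VRRPv2 formulas from RFC 3768:
#   skew_time            = (256 - priority) / 256
#   master_down_interval = 3 * advert_int + skew_time
master_down_interval() {
    local advert_int=$1 priority=$2
    awk -v a="$advert_int" -v p="$priority" \
        'BEGIN { printf "%.3f\n", 3 * a + (256 - p) / 256 }'
}

# Example: 2-second advertisements, backup priority 90
master_down_interval 2 90
```

A lower priority gives a slightly longer wait, so when several backups exist the one with the highest priority should time out first.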

Address Resolution Protocol (ARP) maps an IP address to a physical machine address (MAC address). To take over the virtual IP, the new master answers ARP requests for that IP (sent, for example, by a gateway) and also announces the change with gratuitous ARP messages. As a result, the virtual IP now maps to the new master's MAC address.

Keepalived

Keepalived is a tool that uses VRRP to provide high availability on Linux systems. One server is designated as master and the other as backup in a configuration file. Keepalived runs a script that you provide at a specified interval; the script typically monitors some services and performs any necessary health checks. If the script exits with a value other than 0, the master server switches from MASTER to FAULT state and stops sending VRRP advertisements, and the backup server switches from BACKUP to MASTER state.
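
A single failed run does not have to trigger a failover: the fall and rise options of vrrp_script, used in the configuration further down, set how many consecutive failures or successes are needed. The sketch below only simulates that counting so the effect of the thresholds can be seen. It is my reading of the semantics, not Keepalived's actual code, and the function name and the "up"/"down" labels are made up for the illustration.

```bash
#!/bin/bash
# simulate_track_script FALL RISE EXIT_CODE...
# Prints the script status ("up" or "down") after each check result.
# Assumption: the status starts as "up", goes "down" after FALL
# consecutive non-zero exits and returns to "up" after RISE
# consecutive zero exits.
simulate_track_script() {
    local fall=$1 rise=$2 state=up failures=0 successes=0 history="" code
    shift 2
    for code in "$@"; do
        if [ "$code" -eq 0 ]; then
            successes=$((successes + 1))
            failures=0
            if [ "$state" = down ] && [ "$successes" -ge "$rise" ]; then
                state=up
            fi
        else
            failures=$((failures + 1))
            successes=0
            if [ "$state" = up ] && [ "$failures" -ge "$fall" ]; then
                state=down
            fi
        fi
        history="${history:+$history }$state"
    done
    echo "$history"
}

# fall=2, rise=2: one failure is tolerated, two in a row are not
simulate_track_script 2 2 0 1 1 0 0
```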

Install Keepalived

Install dependencies:

# yum groupinstall "Development Tools"
# yum install make curl gcc openssl-devel libnl3-devel net-snmp-devel ipset-devel iptables-devel

Install Keepalived:

# wget http://www.keepalived.org/software/keepalived-2.0.15.tar.gz
# tar -xf keepalived*
# cd keepalived*
# ./configure
# make
# make install

Write the check script

Create the keepalived_script user as Keepalived's docs suggest:

By default the scripts will be executed by user keepalived_script if
that user exists, or if not by root, but for each script the user/group
under which it is to be executed can be specified.

There are significant security implications if scripts are executed
with root privileges, especially if the scripts themselves are modifiable
or replaceable by a non root user. Consequently, security checks
are made at startup to ensure that if a script is executed by root,
then it cannot be modified or replaced by a non root user.

# adduser keepalived_script
# vim /usr/local/sbin/keepalived_check_script.sh
#!/bin/bash

# pidof exits with a non-zero status when no matching process is found
if ! pidof httpd >/dev/null
then
    echo 'Apache is down!' | logger -p alert -t keepalived_check_script.sh
    exit 1
fi

if ! pidof mysqld >/dev/null
then
    echo 'MariaDB is down!' | logger -p alert -t keepalived_check_script.sh
    exit 1
fi

if ! pidof php-fpm >/dev/null
then
    echo 'PHP FPM is down!' | logger -p alert -t keepalived_check_script.sh
    exit 1
fi

# Usage of /dev/vda1 in percent, digits only
DISK_USAGE=$(df /dev/vda1 | tail -1 | awk '{print $5}' | sed 's/[^0-9]*//g')

if (( DISK_USAGE > 95 ))
then
    echo 'Disk usage is greater than 95%!' | logger -p alert -t keepalived_check_script.sh
    exit 1
fi

FILE=/home/keepalived_script/set_fault

if [[ -e $FILE ]]
then
    echo "File ${FILE} exists!" | logger -p alert -t keepalived_check_script.sh
    exit 1
fi

exit 0

I check if the services needed to run the web applications are running and monitor available disk space. I also check if a file /home/keepalived_script/set_fault exists; creating that file allows me to lock a server in the fault state until I fix the issue.
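
The disk check is the part of the script most likely to break on a different machine, because it depends on the layout of the df output. One way to make it easier to try out is to read the df output from standard input, as in the sketch below. The helper names usage_percent and disk_ok are my own, and df -P (the POSIX output format, one line per filesystem) is a substitution for the plain df call used in the script.

```bash
#!/bin/bash
# usage_percent: reads `df -P` output on stdin and prints the Capacity
# column of the first filesystem line as a bare number.
usage_percent() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# disk_ok LIMIT: exit status 0 while the usage read from stdin is at
# most LIMIT percent, non-zero otherwise (or if nothing could be parsed).
disk_ok() {
    local limit=$1 used
    used=$(usage_percent)
    [ -n "$used" ] && [ "$used" -le "$limit" ]
}

# Possible use inside the check script:
#   df -P /dev/vda1 | disk_ok 95 || exit 1
```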

# chown keepalived_script:keepalived_script /usr/local/sbin/keepalived_check_script.sh
# chmod ug+x /usr/local/sbin/keepalived_check_script.sh

Write the notification script

Notification scripts run whenever a server changes state. Mine sends me an email with msmtp on every state change.

# vim /usr/local/sbin/keepalived_notify_script.sh
#!/bin/bash

TYPE=$1
NAME=$2
STATE=$3

TO="sysadmin@example.com"
FROM="keepalived-servername@example.com"
SUBJECT="Keepalived instance transitioned into $STATE ($TYPE $NAME)"

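# The blank line after the headers marks the end of the header section.
# msmtp's stderr goes to logger, its stdout is discarded.
printf 'To: %s\nFrom: %s\nSubject: %s\n\n' "$TO" "$FROM" "$SUBJECT" | /usr/local/bin/msmtp "$TO" 2>&1 >/dev/null | logger -t keepalived_notify -p alert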

exit 0

# chown keepalived_script:keepalived_script /usr/local/sbin/keepalived_notify_script.sh
# chmod ug+x /usr/local/sbin/keepalived_notify_script.sh
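
The notify script relies on the three positional arguments it receives (type, name and new state). To check the message format without sending any mail, the message can be built in a separate function, as in this sketch. build_mail is a hypothetical helper of mine, and repeating the subject as a one-line body is my own choice.

```bash
#!/bin/bash
# build_mail TO FROM TYPE NAME STATE
# Prints a minimal mail message: three headers, a blank line that ends
# the header section, and a one-line body repeating the subject.
build_mail() {
    local to=$1 from=$2 type=$3 name=$4 state=$5
    local subject="Keepalived instance transitioned into ${state} (${type} ${name})"
    printf 'To: %s\nFrom: %s\nSubject: %s\n\n%s\n' \
        "$to" "$from" "$subject" "$subject"
}

# Possible use inside the notify script:
#   build_mail "$TO" "$FROM" "$1" "$2" "$3" | /usr/local/bin/msmtp "$TO"
```

Running build_mail by hand with made-up arguments, for example INSTANCE VI_1 FAULT, shows exactly what would be handed to msmtp.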

Write the Keepalived configuration files

Master server

# mkdir /etc/keepalived
# vim /etc/keepalived/keepalived.conf
global_defs {
        script_user keepalived_script keepalived_script
        enable_script_security
}

vrrp_script chk_myscript {
        script "/usr/local/sbin/keepalived_check_script.sh"
        # execute every 2 seconds
        interval 2
        # it has to fail 2 times before switching state to fault
        fall 2
        # it has to succeed 2 times before leaving the fault state
        rise 2
}
vrrp_instance VI_1 {
        # The interface keepalived will manage
        interface eth1
        state MASTER
        # How often to send out VRRP advertisements
        advert_int 2
        # The virtual router id number to assign the routers to
        virtual_router_id 42
        # The priority to assign to this device.
        # This controls who will become the MASTER and BACKUP for a given
        # VRRP instance (a lower number gets less priority)
        priority 100
        unicast_src_ip 192.168.0.1
        unicast_peer {
                192.168.0.2
        }
        track_script {
                chk_myscript
        }
        notify /usr/local/sbin/keepalived_notify_script.sh
        # The vrrp instance will track both eth0 and eth1
        # (by default the vrrp instance will track its own interface).
        # If any of the tracked interfaces goes down the vrrp instance will transition to FAULT state.
        track_interface {
                eth0
        }
        # The virtual IP addresses to float between nodes.
        virtual_ipaddress {
                203.0.113.0/24 dev eth0
        }
}

unicast_src_ip should contain this server's IP while unicast_peer should contain the other server's IP. You may also just send the VRRP packets through eth0.

Note that I let Keepalived send the VRRP advertisements through the private network eth1, while the virtual IP is bound to eth0 which is public.

Backup server

# mkdir /etc/keepalived
# vim /etc/keepalived/keepalived.conf

Paste the same content, but set state to BACKUP, swap the IPs in unicast_src_ip and unicast_peer, and set a lower value for priority.

The virtual_router_id must have the same value in the master and backup configuration files.
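
Editing the copy by hand is easy to get wrong, so the backup file can also be derived from the master file. The sketch below is one possible way to do that with awk. It assumes the layout of the master configuration shown above (one setting per line, a single peer), and make_backup_conf and the backup priority of 90 are illustrative choices of mine.

```bash
#!/bin/bash
# make_backup_conf MASTER_IP BACKUP_IP BACKUP_PRIORITY < master.conf
# Prints the master configuration with state, priority and the two
# unicast addresses adjusted for the backup server.
make_backup_conf() {
    local master_ip=$1 backup_ip=$2 backup_priority=$3
    awk -v m="$master_ip" -v b="$backup_ip" -v p="$backup_priority" '
        $1 == "state" && $2 == "MASTER"   { sub(/MASTER/, "BACKUP") }
        $1 == "priority"                  { sub(/[0-9]+/, p) }
        # unicast_src_ip becomes the backup server address
        $1 == "unicast_src_ip" && $2 == m { sub(/[0-9.]+$/, b) }
        # the lone peer address becomes the master server address
        NF == 1 && $1 == b                { sub(/[0-9.]+$/, m) }
        { print }
    '
}

# Possible use:
#   make_backup_conf 192.168.0.1 192.168.0.2 90 < keepalived.conf > keepalived.backup.conf
```

It is still worth comparing the two files with diff before copying the result to the backup server.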

Create a service with systemd

# vim /etc/systemd/system/keepalived.service
[Unit]
Description=LVS and VRRP High Availability Monitor
After=network.target network-online.target syslog.target
Wants=network-online.target
ConditionFileNotEmpty=/etc/keepalived/keepalived.conf

[Service]
Type=forking
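# The paths below assume Keepalived was built with the default ./configure
# prefix (/usr/local); adjust them if your build differs.
PIDFile=/var/run/keepalived.pid
KillMode=process
EnvironmentFile=-/usr/local/etc/sysconfig/keepalived
ExecStart=/usr/local/sbin/keepalived $KEEPALIVED_OPTIONS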
ExecReload=/bin/kill -HUP $MAINPID

[Install]
WantedBy=multi-user.target

# systemctl preset keepalived.service
# systemctl enable keepalived.service
# systemctl start keepalived.service

Useful commands for debugging

The command below lets you see the VRRP advertisements that are transmitted from the master to the backup server. If nothing prints, then something is blocking VRRP packets.

# tcpdump -n -v -i eth1 vrrp

Replace eth1 above if you specified another interface for Keepalived to manage.
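
Another quick check is whether a node currently holds the virtual IP. The sketch below parses the one-line output of ip -o -4 addr show for that purpose. holds_vip is a hypothetical helper, and the address in the usage line is the placeholder from the configuration above.

```bash
#!/bin/bash
# holds_vip VIP
# Reads `ip -o -4 addr show` output on stdin; exit status 0 if one of
# the listed addresses equals VIP, non-zero otherwise.
holds_vip() {
    local vip=$1
    awk -v vip="$vip" '
        # field 4 looks like ADDRESS/PREFIXLEN
        { split($4, addr, "/"); if (addr[1] == vip) found = 1 }
        END { exit !found }
    '
}

# Possible use on either server:
#   ip -o -4 addr show dev eth0 | holds_vip 203.0.113.0 && echo "this node holds the virtual IP"
```

After a failover, the check should succeed on the new master and fail on the old one.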

Support IPv6

TODO
