This is one of the cases where Nagios passive checks come in handy, because we need to be informed immediately when Keepalived changes state. You will therefore need NSCA set up alongside your Nagios configuration.
Studying the Keepalived configuration, we found that the authors have implemented an option to run a script whenever the state of the VRRP instance changes. The script I’m referring to is set with the “notify” option.
Here is what I’m talking about (from the Keepalived configuration):
# for ANY state transition.
# "notify" script is called AFTER the
# notify_* script(s) and is executed
# with 3 arguments provided by keepalived
# (ie don't include parameters in the notify line).
# arguments
# $1 = "GROUP"|"INSTANCE"
# $2 = name of group or instance
# $3 = target state of transition
# ("MASTER"|"BACKUP"|"FAULT")
notify /path/notify.sh
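To make the three arguments concrete, they can be checked in a small function before anything is sent to the monitoring server. This is only a minimal sketch: describe_transition is a hypothetical helper name and VI_1 is an assumed instance name, neither comes from Keepalived itself.

```sh
#!/bin/sh
# Hypothetical helper, not part of Keepalived: validates the target state
# and prints a one-line summary of the transition.
describe_transition() {
    kind="$1"    # "GROUP" or "INSTANCE"
    name="$2"    # name of the group or instance
    state="$3"   # "MASTER", "BACKUP" or "FAULT"
    case "$state" in
        MASTER|BACKUP|FAULT)
            echo "$kind $name -> $state"
            ;;
        *)
            echo "unexpected state: $state" >&2
            return 1
            ;;
    esac
}

# Simulate the call Keepalived would make; VI_1 is an assumed instance name.
describe_transition INSTANCE VI_1 MASTER
```

In a real notify script the function would be called as describe_transition "$1" "$2" "$3", since Keepalived supplies the arguments itself.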
My idea is to create a small script that is added to the configuration of the VRRP instance and hands the state information to the NSCA client. The client then passes it to the Nagios installation via the NSCA daemon running on the Nagios server.
The notify.sh script should look like this:
#!/bin/sh
HOST="localhost"
KEEPHOST="127.0.0.1"
SERVICE="Keepalived service status"
# read the last state (empty if the file does not exist yet)
LASTCHECKF="/tmp/keep_nsca_hist.tmp"
LASTCHECK=$(cat "$LASTCHECKF" 2>/dev/null)
# send_nsca expects: host<TAB>service<TAB>return code<TAB>plugin output
printf '%s\t%s\t%s\t%s\n' "$HOST" "$SERVICE" 1 "Keepalived $1 - $2 is transitioning from the $LASTCHECK to the $3 state" | /opt/nagios/bin/send_nsca -H "$KEEPHOST" -c /etc/nagios/send_nsca.cfg
# store the new state to the temp file
echo "$3" > "$LASTCHECKF"
exit 0
I have not had the time to test the script or to implement further checks, so this article is for information only. Happy testing!
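One way to test without a Nagios server at hand is a dry run that prints the passive check line instead of sending it. The sketch below assumes a hypothetical SEND_CMD variable (not something send_nsca or Keepalived knows about) that replaces the real send_nsca call when it is set; the paths are the ones used in the script above.

```sh
#!/bin/sh
# SEND_CMD is a hypothetical override. When it is unset, the send_nsca call
# from the article is used; for a dry run it is set to "cat".
submit_result() {
    # arguments: host, service description, return code, plugin output
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4" |
        ${SEND_CMD:-/opt/nagios/bin/send_nsca -H 127.0.0.1 -c /etc/nagios/send_nsca.cfg}
}

# Dry run: print the tab-separated line instead of sending it.
SEND_CMD=cat
submit_result localhost "Keepalived service status" 1 "Keepalived INSTANCE - VI_1 is transitioning to the MASTER state"
```

Once the printed line looks right, leaving SEND_CMD unset sends the same line through send_nsca.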
If you found this info useful, please leave a comment 🙂 .
Mihai out!
Copyright (c) 2011 Mihai Radoveanu. All Rights Reserved. Note: Copying this article to your website is strictly NOT allowed.
Hi
I used this notify script and modified it to be a little more flexible. I check if keepalived is moving from the expected master (set warning) and when it moves back I set OK. This way I don’t need to reset the passive check when all is OK again.
Tested on Debian Squeeze with Nagios3.
#!/bin/sh
HOST=`hostname`
KEEPHOST=NAGIOS_SERVER_IP
SERVICE="LVS_Passive"
#read the last state (empty if the file does not exist yet)
LASTCHECKF="/tmp/keep_nsca_hist.tmp"
LASTCHECK=`cat "$LASTCHECKF" 2>/dev/null`
#this node is the expected master if its configuration mentions MASTER
grep -q MASTER /etc/keepalived/keepalived.conf
ISMASTER=$?
if [ "$ISMASTER" = "0" ]; then
    MASTERHOST=$HOST
fi
#OK (0) only when the expected master becomes MASTER, otherwise warning (1)
if [ "${HOST}" = "${MASTERHOST}" ] && [ "$3" = "MASTER" ]; then
    STATE=0
else
    STATE=1
fi
printf '%s\t%s\t%s\t%s\n' "$HOST" "$SERVICE" "$STATE" "Keepalived $1 - $2 is transitioning from the $LASTCHECK to the $3 state" | /usr/sbin/send_nsca -H $KEEPHOST -c /etc/send_nsca.cfg
#store the new state to the temp file
echo "$3" > "$LASTCHECKF"
exit 0
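Note that grep -q MASTER matches the word anywhere in the file, including comments. A slightly stricter variant could look only for an uncommented "state MASTER" line. This is a sketch under that assumption; is_configured_master is a hypothetical helper name, and the demo runs against a throwaway file rather than the real configuration.

```sh
#!/bin/sh
# Hypothetical helper: succeeds only if the given file contains an
# uncommented "state MASTER" line.
is_configured_master() {
    grep -Eq '^[[:space:]]*state[[:space:]]+MASTER([[:space:]]|$)' "$1"
}

# Demo against a temporary file instead of /etc/keepalived/keepalived.conf.
conf=$(mktemp)
printf '# state MASTER on the other node\nvrrp_instance VI_1 {\n    state BACKUP\n}\n' > "$conf"
if is_configured_master "$conf"; then
    echo "expected master"
else
    echo "expected backup"
fi
rm -f "$conf"
```

With the sample file the check reports "expected backup", because the only mention of MASTER is inside a comment.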
Thanks for sharing.
When I have some time, I will give it a try and post here the result!
Mihai
Thanks for the explanation. It helped me to build my own script. I’m using Zabbix for monitoring, not Nagios, but the 3 arguments Keepalived sends to the script were exactly what I needed to know to create my own script. Thanks for displaying them here. I couldn’t find them in my Keepalived documentation or config file.
So, with the following script you can do the same as above, but then with Zabbix:
#!/bin/bash
##########
# CONFIG #
##########
HOSTNAME="$(hostname -f)"
ZABBIX_SERVER="zabbixserver.example.tld"
ZABBIX_ITEM="keepalived.status"
#################
# ACTUAL SCRIPT #
#################
# $1 = "GROUP"|"INSTANCE"
# $2 = name of group or instance
# $3 = target state of transition
status="$3"
/usr/bin/zabbix_sender -z "${ZABBIX_SERVER}" -s "${HOSTNAME}" -k "${ZABBIX_ITEM}" -o "${status}" > /dev/null
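The item in Zabbix needs to be of type “Zabbix trapper”. Note that the script sends the state name as text (MASTER, BACKUP or FAULT), so the item’s type of information has to be a text type such as “Character”; a “Numeric (float)” item only works if the script converts the state to a number first.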
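For a numeric item, the state names have to be turned into numbers before they are handed to zabbix_sender. A minimal sketch of such a conversion follows; the encoding (MASTER=2, BACKUP=1, FAULT=0, -1 for anything else) and the function name state_to_number are assumptions chosen only for illustration.

```sh
#!/bin/sh
# Assumed encoding, not from the comment above:
# MASTER=2, BACKUP=1, FAULT=0, and -1 for anything unexpected.
state_to_number() {
    case "$1" in
        MASTER) echo 2 ;;
        BACKUP) echo 1 ;;
        FAULT)  echo 0 ;;
        *)      echo "-1" ;;
    esac
}

# Print the value that would be passed to zabbix_sender with -o.
state_to_number BACKUP
```

In the script above this would replace status=$3 with status=$(state_to_number "$3"), and triggers in Zabbix could then compare against the chosen numbers.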