Monitoring Keepalived with Nagios

This is one of those cases where Nagios passive checks come in handy, because we need to be informed immediately when Keepalived changes state. You will therefore need NSCA set up as part of your Nagios configuration.

Studying the Keepalived configuration, we found that the authors have implemented an option to run a script whenever the state of a VRRP instance changes. The script I’m referring to is set with the “notify” option.

Here is what I’m talking about (from the Keepalived configuration):

           # for ANY state transition.
           # "notify" script is called AFTER the
           # notify_* script(s) and is executed
           # with 3 arguments provided by keepalived
           # (ie don't include parameters in the notify line).
           # arguments
           # $1 = "GROUP"|"INSTANCE"
           # $2 = name of group or instance
           # $3 = target state of transition
           #     ("MASTER"|"BACKUP"|"FAULT")
           notify /path/notify.sh
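
To see what Keepalived hands to the script before wiring it to Nagios, a minimal sketch like the one below could be used first. The function name describe_transition is a hypothetical helper chosen for illustration, not something Keepalived provides; the three arguments are the ones described in the configuration comments above.

```shell
#!/bin/sh
# Hypothetical helper: turn the three notify arguments into one readable line.
describe_transition() {
    type="$1"    # "GROUP" or "INSTANCE"
    name="$2"    # name of the group or instance
    state="$3"   # target state: "MASTER", "BACKUP" or "FAULT"
    printf '%s %s entered state %s\n' "$type" "$name" "$state"
}

# When Keepalived runs the script, the arguments arrive as $1, $2 and $3.
# Only act when all three are present.
if [ "$#" -ge 3 ]; then
    describe_transition "$1" "$2" "$3"
fi
```

Running it by hand with, for example, INSTANCE VI_1 MASTER as arguments (VI_1 being a made-up instance name) prints a single line describing the transition.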

My idea is to create a small script that is added to the configuration of the VRRP instance and provides all the info to the NSCA client (send_nsca). The client then passes the status information to the Nagios installation via the NSCA daemon running on the Nagios server.
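
For a passive service check, send_nsca reads one line per result on standard input, with the fields separated by tabs: host name, service description, return code, and plugin output. A small sketch of building such a line follows; nsca_line is a hypothetical helper name, and the host and service values in the commented usage are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: build one passive service check result in the
# tab-separated form that send_nsca reads on stdin:
#   host<TAB>service<TAB>return_code<TAB>plugin_output
nsca_line() {
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4"
}

# Possible usage (not executed here, paths depend on your installation):
#   nsca_line "myhost" "Keepalived service status" 1 "state changed" \
#       | /opt/nagios/bin/send_nsca -H 127.0.0.1 -c /etc/nagios/send_nsca.cfg
```

Using printf rather than echo -e keeps the tab handling the same across different /bin/sh implementations.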

The notify.sh script should look like this:

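#!/bin/sh
# Host name as defined in Nagios for this node
HOST="localhost"
# Address of the Nagios server running the NSCA daemon
KEEPHOST="127.0.0.1"
SERVICE="Keepalived service status"
# Read the last state (UNKNOWN on the first run, when the file does not exist yet)
LASTCHECKF="/tmp/keep_nsca_hist.tmp"
LASTCHECK=$(cat "$LASTCHECKF" 2>/dev/null || echo "UNKNOWN")
# send_nsca expects: host<TAB>service<TAB>return code<TAB>output
printf '%s\t%s\t1\t%s\n' "$HOST" "$SERVICE" "Keepalived $1 - $2 is transitioning from the $LASTCHECK to the $3 state" | /opt/nagios/bin/send_nsca -H "$KEEPHOST" -c /etc/nagios/send_nsca.cfg
# Store the new state in the temp file
echo "$3" > "$LASTCHECKF"
exit 0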
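
The script above always reports return code 1 (WARNING). One possible refinement, sketched here under assumptions, is to derive the return code from the target state. Nagios uses 0 for OK, 1 for WARNING, 2 for CRITICAL and 3 for UNKNOWN; the mapping of Keepalived states to those codes below is only an assumption (it treats MASTER as the healthy state for this node), and state_to_code is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: map a Keepalived target state to a Nagios return code.
# The mapping is an assumption and should be adapted to the role of the node.
state_to_code() {
    case "$1" in
        MASTER) echo 0 ;;   # OK: node holds the virtual IP
        BACKUP) echo 1 ;;   # WARNING: node gave up or does not hold the VIP
        FAULT)  echo 2 ;;   # CRITICAL: instance is in a fault state
        *)      echo 3 ;;   # UNKNOWN: unexpected or missing state
    esac
}
```

In notify.sh, the result of state_to_code "$3" could then take the place of the hard-coded 1 in the line sent to send_nsca.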

I did not have time to test the script or to implement further checks. This article is for informational purposes only. So happy testing!

If you found this info useful, please leave a comment 🙂 .

Mihai out!

Copyright (c) 2011 Mihai Radoveanu. All Rights Reserved. Note: Copying this article to your website is strictly NOT allowed.

3 thoughts on “Monitoring Keepalived with Nagios”

  1. Olof Mattsson

    Hi

    I used this notify script and modified it to be a little more flexible. I check if keepalived is moving from the expected master (set warning) and when it moves back I set OK. This way I don’t need to reset the passive check when all is OK again.
    Tested on Debian Squeeze with Nagios3.

    #!/bin/sh
    HOST=`hostname`
    KEEPHOST=NAGIOS_SERVER_IP
    SERVICE="LVS_Passive"
    #read the last state
    LASTCHECKF="/tmp/keep_nsca_hist.tmp"
    LASTCHECK=`cat $LASTCHECKF`
    #this host is the expected master if its config mentions MASTER
    grep -q MASTER /etc/keepalived/keepalived.conf
    ISMASTER=$?
    if [ "$ISMASTER" = "0" ]; then
        MASTERHOST=$HOST
    fi
    #OK (0) only when the expected master becomes MASTER, otherwise WARNING (1)
    if [ "${HOST}" = "${MASTERHOST}" ] && [ "$3" = "MASTER" ]; then
        STATE=0
    else
        STATE=1
    fi
    printf '%s\t%s\t%s\t%s\n' "$HOST" "$SERVICE" "$STATE" "Keepalived $1 - $2 is transitioning from the $LASTCHECK to the $3 state" | /usr/sbin/send_nsca -H $KEEPHOST -c /etc/send_nsca.cfg
    #store the new state to the temp file
    echo $3 > $LASTCHECKF
    exit 0

  2. Cybertinus

    Thanks for the explanation. It helped me to build my own script. I’m using Zabbix for monitoring, not Nagios, but the 3 arguments Keepalived sends to the script were exactly what I needed to know to create my own script. Thanks for displaying them here. I couldn’t find them in my Keepalived documentation or config file.
    So, with the following script you can do the same as above, but with Zabbix:
    #!/bin/bash

    ##########
    # CONFIG #
    ##########

    HOSTNAME="$(hostname -f)"
    ZABBIX_SERVER="zabbixserver.example.tld"
    ZABBIX_ITEM="keepalived.status"

    #################
    # ACTUAL SCRIPT #
    #################

    # $1 = "GROUP"|"INSTANCE"
    # $2 = name of group or instance
    # $3 = target state of transition
    status=$3

    /usr/bin/zabbix_sender -z "${ZABBIX_SERVER}" -s "${HOSTNAME}" -k "${ZABBIX_ITEM}" -o "${status}" > /dev/null

    The corresponding item in Zabbix needs to be of type “Zabbix trapper”, with “numeric (float)” as the data type.
