ZhoubaWiki:IcingaEventHandlerInstallation
Introduction to Setting up event handlers triggering on remote servers
For illustration, the installation of the eh_restart_service event handler is described.
Monitoring server part
With check_remote set up this way:
define command {
command_name check_remote
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -p $_HOSTPORT$ -u -c $ARG1$ -t 120
}
a single new row (event_handler) is to be added to the service definition, like this:
# generic response code template
define service{
name response-code-template
use generic-service-charted
check_interval 0.25
retry_interval 0.25
max_check_attempts 3
flap_detection_enabled 0
register 0
}
# trackers
define service{
use response-code-template
hostgroup_name trackers
service_description Check HTTPS success response code
check_command check_https_success_response!S!/
event_handler check_remote!restart_service_tracker
}
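To see what Icinga ends up running when the handler fires, it can help to expand the macros by hand. The sketch below does that in shell. The address 192.0.2.10, the port 5666 and the plugin path used for $USER1$ are example values assumed here, not taken from this setup, and expand_check_remote is a hypothetical helper. $_HOSTPORT$ is a custom variable macro, so it presumably comes from a _PORT entry in the host definition.

```shell
#!/bin/sh
# Sketch: build the command line that check_remote expands to.
# All values passed in below are illustrative assumptions.
expand_check_remote() {
    user1="$1"        # $USER1$: plugin directory from resource.cfg
    hostaddress="$2"  # $HOSTADDRESS$: address of the monitored host
    hostport="$3"     # $_HOSTPORT$: custom variable _PORT of the host
    arg1="$4"         # $ARG1$: the text after "!" in event_handler
    echo "$user1/check_nrpe -H $hostaddress -p $hostport -u -c $arg1 -t 120"
}

expand_check_remote /usr/lib/nagios/plugins 192.0.2.10 5666 restart_service_tracker
```

The value of $ARG1$ has to match the command name registered in nrpe.cfg on the remote server.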
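Note that the max_check_attempts directive has to be set to a value greater than 1, ideally at least 3. This is because the event handler is meant to act while the service is still in a soft state (1/3, 2/3), before it reaches a hard state and notifications are sent. With a value of 1 there are no soft states at all, so unnecessary notifications could not be avoided.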
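The relation between check attempts and soft/hard states can be sketched in shell. Icinga offers the standard macros $SERVICESTATE$ and $SERVICESTATETYPE$, which an event handler command may receive as arguments. The setup described here does not pass them, so should_restart below is a hypothetical helper that only illustrates the decision, not something eh_restart_service is known to do.

```shell
#!/bin/sh
# Sketch: decide whether a handler should act, given the values of
# $SERVICESTATE$ and $SERVICESTATETYPE$ (hypothetical helper).
should_restart() {
    state="$1"   # OK, WARNING, UNKNOWN or CRITICAL
    kind="$2"    # SOFT or HARD

    # Only a failing service needs a restart; recoveries are ignored.
    [ "$state" = "CRITICAL" ] || return 1

    case "$kind" in
        SOFT) return 0 ;;  # still retrying (1/3, 2/3): try a quiet repair
        *)    return 1 ;;  # retries exhausted: leave it to notifications
    esac
}

# With max_check_attempts 3, a failing service goes through these attempts:
for attempt in 1 2 3; do
    if [ "$attempt" -lt 3 ]; then kind_now=SOFT; else kind_now=HARD; fi
    if should_restart CRITICAL "$kind_now"; then
        echo "attempt $attempt/3 ($kind_now): restart"
    else
        echo "attempt $attempt/3 ($kind_now): no action"
    fi
done
```

With max_check_attempts set to 1, the loop would have a single HARD attempt and the handler would never get a chance to repair the service quietly.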
Remote server part
1. Copy the event handler script (it can be in any executable form) to the remote server's plugin directory (typically /usr/lib/nagios/plugins).
2. Register it as an NRPE command in nrpe.cfg under the same name as the parameter passed to check_nrpe (in this case restart_service_tracker), like this:
command[restart_service_tracker]=sudo /usr/lib/nagios/plugins/eh_restart_service -s tracker
- Note the sudo prefix: it is needed because restarting services requires broader privileges.
3. Don't forget to restart the NRPE daemon afterwards:
/etc/init.d/nagios-nrpe-server restart
4. Grant the nagios user the privilege to execute Nagios plugins using sudo. An example /etc/sudoers snippet follows:
# Allow members of group sudo to execute any command
%sudo ALL=(ALL:ALL) NOPASSWD: ALL
nagios ALL=(ALL) NOPASSWD: /usr/lib/nagios/plugins/
5. Now you can issue a test command to verify that everything on the remote server is set up correctly:
/usr/lib/nagios/plugins/check_nrpe -H localhost -c restart_service_tracker
- This particular event handler outputs 'OK', exits with a code of 0 and writes a log entry into /icinga/eh_restart_service.log. Consult the source code of your own event handler script to judge whether it works OK ;-)
6. If you are still unsure whether it works, manually stop the service and issue the command from step 5 again. Then check the service status and calm down.
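For orientation, here is a minimal sketch of what a handler with the behaviour described in step 5 might look like. It is not the real eh_restart_service script. The LOG_FILE default, the RESTART_CMD variable, the use of "service <name> restart" and the non-OK messages and exit codes are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of an eh_restart_service-like handler.
# LOG_FILE and RESTART_CMD are assumed names; adjust them to your system.
LOG_FILE="${LOG_FILE:-/tmp/eh_restart_service.log}"
RESTART_CMD="${RESTART_CMD:-service}"

eh_restart_service() {
    service_name=""
    OPTIND=1
    # Parse "-s <service>", as used in the nrpe.cfg command line.
    while getopts "s:" opt; do
        case "$opt" in
            s) service_name="$OPTARG" ;;
            *) echo "UNKNOWN - usage: eh_restart_service -s <service>"; return 3 ;;
        esac
    done
    if [ -z "$service_name" ]; then
        echo "UNKNOWN - usage: eh_restart_service -s <service>"
        return 3
    fi

    # Restart the service and record the outcome in the log file.
    if $RESTART_CMD "$service_name" restart >/dev/null 2>&1; then
        echo "$(date '+%Y-%m-%d %H:%M:%S') restarted $service_name" >> "$LOG_FILE"
        echo "OK"
        return 0
    fi
    echo "$(date '+%Y-%m-%d %H:%M:%S') failed to restart $service_name" >> "$LOG_FILE"
    echo "CRITICAL - restart of $service_name failed"
    return 2
}

# Act only when arguments are given, e.g. "eh_restart_service -s tracker".
if [ "$#" -gt 0 ]; then
    eh_restart_service "$@"
    exit $?
fi
```

Run through NRPE as in step 5, a script like this would print OK and exit with 0 on success, which check_nrpe reports back unchanged.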