13.9. Monitoring SSHD

You use the SSH daemon on all of your servers for secure remote administration, so you want to set up Nagios to monitor SSH and alert you if it becomes unavailable. You also want to be able to add new servers for monitoring easily.

Start by setting it up for one server. You'll create a command definition, a host definition, and a service definition by editing commands.cfg, hosts.cfg, and services.cfg. After that, adding a server takes only a new host definition and one more server name in the service definition.

The default commands.cfg does not contain a command definition for SSH, so add this to commands.cfg:

	# 'check_ssh' command definition
	define command{
	        command_name    check_ssh
	        command_line    $USER1$/check_ssh -H $HOSTADDRESS$
	        }

Next, add a host definition to hosts.cfg, using your own hostname and IP address:

	# SSH servers
	define host{
	        use                     generic-host
	        host_name               server1
	        alias                   backup server
	        address                 192.168.1.25
	        check_command           check-host-alive
	        max_check_attempts      10
	        check_period            24x7
	        notification_interval   120
	        notification_period     24x7
	        notification_options    d,r
	        contact_groups          admins
	        }

Add your new server to an existing host group, or create a new host group for it, as this example shows:

	define hostgroup{
	        hostgroup_name   misc_servers
	        alias            Servers
	        members          server1
	        }

Now, define the SSH service in services.cfg:

	# Define a service to monitor SSH
	define service{
	        use                       generic-service
	        host_name                 server1
	        service_description       SSH
	        is_volatile               0
	        check_period              24x7
	        max_check_attempts        4
	        normal_check_interval     5
	        retry_check_interval      1
	        contact_groups            admins
	        notification_options      w,u,c,r
	        notification_interval     960
	        notification_period       24x7
	        check_command             check_ssh
	        }

Run the syntax checker, then restart Nagios:

	# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
	# /etc/init.d/nagios restart

Refresh the Nagios web interface, and you'll see the new entry's status listed as PENDING. Within a few minutes, Nagios runs the first service check and replaces PENDING with the actual status information. If you don't want to wait, go to Service Detail → SSH → Reschedule Next Service Check to run the check immediately.
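Once the first server reports correctly, adding a second one might look like the following sketch. The host name server2, its alias, and the address 192.168.1.26 are hypothetical placeholders; everything else repeats the definitions above.

```
# hosts.cfg: hypothetical second SSH server
define host{
        use                     generic-host
        host_name               server2
        alias                   file server
        address                 192.168.1.26
        check_command           check-host-alive
        max_check_attempts      10
        check_period            24x7
        notification_interval   120
        notification_period     24x7
        notification_options    d,r
        contact_groups          admins
        }

# Host group: add the new name to the members list
define hostgroup{
        hostgroup_name   misc_servers
        alias            Servers
        members          server1,server2
        }

# services.cfg: the existing SSH service, now covering both hosts
define service{
        use                       generic-service
        host_name                 server1,server2
        service_description       SSH
        is_volatile               0
        check_period              24x7
        max_check_attempts        4
        normal_check_interval     5
        retry_check_interval      1
        contact_groups            admins
        notification_options      w,u,c,r
        notification_interval     960
        notification_period       24x7
        check_command             check_ssh
        }
```

If every member of the host group runs SSH, it may be simpler to replace the host_name line in the service definition with hostgroup_name misc_servers, so that new group members are picked up without editing the service. Run the syntax checker and restart Nagios after either change.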

If your SSH daemon listens on a port other than 22, add the -p option to the check_ssh command line to specify the correct port.
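One way to handle a nonstandard port is a second command definition that takes the port as an argument. This is only a sketch: the command name check_ssh_port and port 2222 are made up for illustration, and $ARG1$ is filled in from the value after the ! in check_command.

```
# commands.cfg: 'check_ssh_port' command definition (hypothetical name)
define command{
        command_name    check_ssh_port
        command_line    $USER1$/check_ssh -H $HOSTADDRESS$ -p $ARG1$
        }

# services.cfg: SSH service for a host whose daemon listens on port 2222 (example port)
define service{
        use                       generic-service
        host_name                 server1
        service_description       SSH
        is_volatile               0
        check_period              24x7
        max_check_attempts        4
        normal_check_interval     5
        retry_check_interval      1
        contact_groups            admins
        notification_options      w,u,c,r
        notification_interval     960
        notification_period       24x7
        check_command             check_ssh_port!2222
        }
```

Keeping the port as an argument lets hosts on different ports share one command definition; only the number after the ! changes.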

You can use this recipe as a copy-and-paste template for most services.
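As an illustration of reusing the template, the sketch below adapts it to SMTP. It assumes the check_smtp plug-in is installed in your plug-in directory. Your commands.cfg may already define a check_smtp command, so check first and skip the command definition if it does, because duplicate definitions fail the syntax check.

```
# commands.cfg: add only if no check_smtp command is defined yet
define command{
        command_name    check_smtp
        command_line    $USER1$/check_smtp -H $HOSTADDRESS$
        }

# services.cfg: same fields as the SSH service, with a new description and command
define service{
        use                       generic-service
        host_name                 server1
        service_description       SMTP
        is_volatile               0
        check_period              24x7
        max_check_attempts        4
        normal_check_interval     5
        retry_check_interval      1
        contact_groups            admins
        notification_options      w,u,c,r
        notification_interval     960
        notification_period       24x7
        check_command             check_smtp
        }
```

Only service_description and check_command differ from the SSH service; the host definition is reused unchanged.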

Look in /usr/lib/nagios/libexec (or whichever directory $USER1$ points to on your system) to view your available plug-ins. Run [plugin-name] --help to see the available options.

Host and service definitions have several required fields; see "Template-Based Object Configuration" (http://localhost/nagios/docs/xodtemplate.html) in your local Nagios documentation for details.
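The use lines in the definitions above pull in settings from templates, and you can define your own template to hold the fields shared by all of your SSH servers. The following is a minimal sketch, not a definitive layout: the template name ssh-server is hypothetical, and it assumes generic-host exists as in the examples above.

```
# hosts.cfg: hypothetical template for the settings every SSH server shares
define host{
        name                    ssh-server      ; template name (hypothetical)
        use                     generic-host    ; inherit from the stock template
        check_command           check-host-alive
        max_check_attempts      10
        check_period            24x7
        notification_interval   120
        notification_period     24x7
        notification_options    d,r
        contact_groups          admins
        register                0               ; a template, not a real host
        }

# A host built on the template needs only its identifying fields
define host{
        use                     ssh-server
        host_name               server1
        alias                   backup server
        address                 192.168.1.25
        }
```

With a template like this, each additional server shrinks to the four lines that actually differ from host to host.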