On a customer’s network we noticed that the Internet-facing router was rebooting because of a software error. We stumbled on this issue by chance, only because one of the reboots happened during a videoconference. Nobody had noticed the problem before, and they really don’t know whether it has been there since the router was installed a couple of years ago. They have Nagios running to monitor the network, so I’ve configured it to monitor the router uptime too.

It’s a really easy task: just monitor the correct OID.

define service {
	service_description cisco uptime
	use generic-service
	host_name router_saiv
	check_command check_cisco_uptime!.!public!1801:!950:
}

define command{
	command_name check_cisco_uptime
	command_line $USER1$/check_snmp -H $HOSTADDRESS$ -o $ARG1$ -C $ARG2$ -w $ARG3$ -c $ARG4$
}

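Then set a proper threshold range in Nagios, as shown HERE. With the values above, 1801: raises a warning and 950: raises a critical alert when the returned uptime value drops below that number, which is exactly what happens right after a reboot.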
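To make the range syntax a bit more concrete, here is a minimal Python sketch of how a Nagios-style threshold range such as 1801: is evaluated. It is not the plugin's actual code. The function names (parse_range, alerts, uptime_state) are made up for illustration, and the unit of the value depends on the OID you query.

```python
import math


def parse_range(spec):
    """Split a Nagios-style range string into (start, end, inside).

    Hypothetical helper, written from the plugin guidelines' range format:
    "10" -> 0..10, "10:" -> 10..infinity, "~:10" -> -infinity..10,
    and a leading "@" inverts the check (alert when the value is inside).
    """
    inside = spec.startswith("@")
    if inside:
        spec = spec[1:]
    if ":" in spec:
        start_s, end_s = spec.split(":", 1)
    else:
        start_s, end_s = "0", spec
    if start_s == "~":
        start = -math.inf
    elif start_s == "":
        start = 0.0
    else:
        start = float(start_s)
    end = float(end_s) if end_s else math.inf
    return start, end, inside


def alerts(spec, value):
    """Return True when the value should raise an alert for this range."""
    start, end, inside = parse_range(spec)
    in_range = start <= value <= end
    return in_range if inside else not in_range


def uptime_state(value, warn="1801:", crit="950:"):
    """Map an uptime reading to a state, checking critical first."""
    if alerts(crit, value):
        return "CRITICAL"
    if alerts(warn, value):
        return "WARNING"
    return "OK"


if __name__ == "__main__":
    for reading in (500, 1000, 5000):
        print(reading, uptime_state(reading))
```

With the thresholds from the service definition, a low reading (the router has just rebooted) is critical, a slightly higher one is a warning, and anything from 1801 upwards is OK.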

This is the final result:

A cool Cacti graph can be useful too; the procedure is well described HERE.

Monitoring is a big topic, and an important part of it is deciding what to monitor. Device uptime should always be included.