Network documentation and monitoring are topics I never lose interest in. Over the years I have worked with many products: NeDi, Observium, LibreNMS, NetBox, Icinga, NetShot and Smokeping, to name a few. Each product has its strengths and weaknesses, which in some cases simply reflect the areas where its developers decided to concentrate their effort. NetShot has compliance tests that are easy to write and verify; Smokeping is easy to set up and focuses on monitoring network and service latency.
My home network is connected to the Internet via a radio bridge. The reason is that ADSL was not available a couple of years ago. It is now, but since the radio bridge is cheap and works (most of the time) I don’t really care to change. Some time ago I started to notice downtime on the Internet connection. I sent a few emails to the WISP, but the answer was always the same: “we don’t see a problem right now”.
On a customer’s network we noticed that the Internet-facing router was rebooting because of a software error. We stumbled on this issue by chance, only because one of the reboots happened during a videoconference. Nobody had noticed the problem before, and they really don’t know whether it has been there since the router was installed a couple of years ago. They have Nagios running to monitor the network, so I’ve configured it to monitor the router uptime too.
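To make the idea of uptime monitoring concrete, here is a minimal Python sketch of the underlying check. It is not the Nagios configuration I used: it only assumes you already have a series of uptime readings in SNMP TimeTicks (hundredths of a second, as reported by sysUpTime) collected by some poller. The function name `find_reboots`, the sample format and the one-day `WRAP_MARGIN` are my own assumptions for illustration.

```python
from typing import Iterable, List, Tuple

TICKS_PER_SECOND = 100   # SNMP TimeTicks are hundredths of a second
COUNTER_MAX = 2 ** 32    # sysUpTime is a 32-bit value and wraps after roughly 497 days
# Hypothetical tolerance: a drop that starts within one day of the maximum
# is treated as a counter wrap rather than a reboot.
WRAP_MARGIN = 24 * 3600 * TICKS_PER_SECOND


def find_reboots(samples: Iterable[Tuple[int, int]]) -> List[int]:
    """Return the poll timestamps at which a reboot was first observed.

    `samples` holds (poll_timestamp_seconds, uptime_ticks) pairs in
    chronological order. A reboot is assumed whenever the uptime goes
    backwards and the previous reading was not close to the 32-bit limit.
    """
    reboots: List[int] = []
    previous = None
    for timestamp, uptime in samples:
        if previous is not None and uptime < previous:
            if previous < COUNTER_MAX - WRAP_MARGIN:
                reboots.append(timestamp)
        previous = uptime
    return reboots


if __name__ == "__main__":
    # Four polls, five minutes apart; the router restarted before the third one.
    readings = [(0, 1000), (300, 31000), (600, 500), (900, 30500)]
    print(find_reboots(readings))
```

The point of the sketch is that a reboot leaves a trace even if nobody is watching at that moment: the next poll reports an uptime smaller than the previous one. The check for the 32-bit wrap is there because a router that has been up for more than about 497 days would otherwise look like it had just rebooted.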