Monitoring MVP

Asset management, documentation and monitoring are important parts of any IT Ops team's work, but at the same time they are boring, repetitive and error prone.

Last summer I spent some time investigating whether I could integrate open source projects I like into a Minimum Viable Product for asset management and monitoring, and learn something in the process.

This is far from complete or production ready, but I think it is worth sharing before it gets lost in some abandonware repository on my laptop. It is more an idea than a product.

The pieces

When I need to deploy something on a low budget or need some customization, I rely on LibreNMS or NeDi for network discovery and Icinga2 for monitoring.

In a previous post I discussed how to integrate NeDi and LibreNMS to import hosts from the former into the latter.

Recently I started to experiment with NetBox and I liked the idea of having a Single Source of Truth (SoT) for the whole network.

So I put them all together to see how they could work as an integrated discovery, asset management, SoT and monitoring solution.

The MVP

After some tinkering with the APIs I started with a simple workflow idea:

  1. discover the network with LibreNMS
  2. export device information from LibreNMS to NetBox
  3. complete device information in NetBox (manually with the GUI or scripted via API from other data sources)
  4. use NetBox to create hosts to be monitored in Icinga2

Integrating LibreNMS and NetBox

SoT and dynamic discovery are two concepts that don’t play well together. The idea of SoT is to match reality with intent. Having a discovery mechanism that changes the SoT without human intervention breaks the concept of truth itself.

For this reason there should be an additional phase between discovery and SoT (I like to call it Lies) to verify whether reality matches the truth. This would let an operator decide whether to fix the truth or the reality, depending on what is actually intended.

Example: the SoT includes the serial number of each device. If a device is replaced with another one of the same model but with a different serial number, LibreNMS will notice and add it to Lies. The SoT will be updated only if the operator acknowledges the change.

In the other direction, a change in the SoT can be a Lie that is automatically cleared when LibreNMS discovery reads the same information. Example: a new IP address is defined for a server in the SoT. This could be integrated with any automation tool (Ansible, NAPALM) to translate the SoT update into actions and use the discovery mechanism to validate the change.

I haven’t implemented any of this, I’m just wondering if it makes sense.
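
To make the Lies idea a bit more concrete, here is a minimal, hypothetical sketch in Python of the comparison step. Nothing like this exists in the project: the `Lie` class, the `find_lies` function, the compared fields and the sample data are all assumptions made for illustration. Both sides are reduced to plain dictionaries so that no API call is involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lie:
    """One disagreement between the SoT and what discovery reports."""
    device: str
    field: str
    truth: object      # value recorded in the SoT (None if absent)
    reality: object    # value seen by discovery (None if absent)


def find_lies(truth, reality, fields=("serial", "primary_ip")):
    """Compare two {device_name: {field: value}} mappings and list mismatches."""
    lies = []
    for name in sorted(set(truth) | set(reality)):
        t = truth.get(name, {})
        r = reality.get(name, {})
        for field in fields:
            if t.get(field) != r.get(field):
                lies.append(Lie(name, field, t.get(field), r.get(field)))
    return lies


# Hypothetical sample data: the switch was swapped for an identical model.
sot = {"sw-core-01": {"serial": "ABC123", "primary_ip": "10.0.0.1"}}
discovered = {"sw-core-01": {"serial": "XYZ789", "primary_ip": "10.0.0.1"}}

for lie in find_lies(sot, discovered):
    print(f"{lie.device}: {lie.field} is {lie.reality!r}, SoT says {lie.truth!r}")
```

An operator (or an automation tool) would then go through the list and decide, entry by entry, whether to update the SoT or to fix the device. A Lie created by a SoT change disappears by itself once discovery reports the new value.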

In my implementation I used the APIs to pull objects from LibreNMS and create devices in NetBox. Where possible I mapped information between the two, so devices already have name, serial and main IP address filled in.

The code for this part is a mix of curl and Python that I won’t share ;-) but you get the idea. Contact me if you want to see how messy it is.
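
As an illustration only (this is not the original script), the mapping step could be sketched as a small Python function. The LibreNMS field names (`hostname`, `sysName`, `serial`) and the NetBox payload keys (`name`, `site`, `device_type`, `device_role`, `serial`) are assumptions about the two APIs and may differ between versions. The numeric IDs are placeholders.

```python
def librenms_to_netbox(device, site_id, device_type_id, device_role_id):
    """Translate one LibreNMS device record into a NetBox device payload.

    Field names on both sides are assumptions; check them against the
    API versions you actually run.
    """
    # Prefer the SNMP system name, fall back to the address LibreNMS polls.
    name = device.get("sysName") or device.get("hostname")
    if not name:
        raise ValueError("device has neither sysName nor hostname")

    payload = {
        "name": name,
        "site": site_id,
        "device_type": device_type_id,
        "device_role": device_role_id,
    }

    # Only send a serial when discovery actually found one.
    serial = (device.get("serial") or "").strip()
    if serial:
        payload["serial"] = serial
    return payload


# Hypothetical record, shaped like one entry of a LibreNMS device listing.
example = {"hostname": "10.0.0.1", "sysName": "sw-core-01", "serial": "ABC123"}
print(librenms_to_netbox(example, site_id=1, device_type_id=2, device_role_id=3))
```

The main IP address is left out on purpose: as far as I know NetBox wants the address created as its own object and attached to an interface before it can be set as primary, so it would need a separate step.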

Integrating NetBox and Icinga2

NetBox uses the concept of Platforms to categorize devices.

I took advantage of this information to map HostGroups in the Icinga2 configuration.

This is how it works: in NetBox each device is assigned to a platform, and the export script maps the platform to the Icinga2 host variable host.vars.role.

The role is then matched in groups.conf on the Icinga2 server using the Group Assign feature:

object HostGroup "ACCESS SWITCH" {   
  display_name = "ACCESS SWITCH"   
  assign where "access-switch" in host.vars.role
}  

object HostGroup "CORE SWITCH" {   
  display_name = "CORE SWITCH"   
  assign where "core-switch" in host.vars.role
} 

object HostGroup "DISTRIBUTION SWITCH" {   
  display_name = "DISTRIBUTION SWITCH"   
  assign where "distribution-switch" in host.vars.role
} 

The final result is that hosts in Icinga2 are assigned to different groups based on their NetBox platforms.

I did a similar configuration for services. If a device is tagged with m_ssh in NetBox, the ssh service is applied to it:

apply Service "ssh" {
  import "generic-service"
  check_command = "ssh"
  assign where "ssh" in host.vars.services
}
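
For illustration, the export side of this mapping could look like the following Python sketch. It is not the script from the repository: the function name, the `m_` prefix handling and the `generic-host` template are assumptions, and the input is simplified to an address, a platform slug and a list of tag names. The result is shaped like the JSON body the Icinga2 API expects when creating a host object, with `vars.role` and `vars.services` as arrays so that the `in` checks in the rules above can match.

```python
import json

TAG_PREFIX = "m_"  # assumed convention: m_ssh, m_http, ... select services


def icinga_host_payload(address, platform_slug, tags):
    """Build a JSON body for creating a host through the Icinga2 API."""
    # Strip the prefix from monitoring tags: "m_ssh" becomes "ssh".
    services = sorted(
        tag[len(TAG_PREFIX):] for tag in tags if tag.startswith(TAG_PREFIX)
    )
    return {
        "templates": ["generic-host"],  # assumed template name
        "attrs": {
            "address": address,
            "vars.role": [platform_slug],
            "vars.services": services,
        },
    }


body = icinga_host_payload("10.0.0.1", "core-switch", ["monitor", "m_ssh", "m_https"])
print(json.dumps(body, indent=2))
```

With this body the host would land in the "CORE SWITCH" group and get the ssh service, while the plain monitor tag is ignored because it does not carry the prefix.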

Remember, the idea is that only service and group assignments are edited in the Icinga2 configuration (easy to manage in git). Icinga2 hosts are created only by the export script. New hosts are added only in NetBox, with the desired platform and tags.

A similar mechanism is used to export only the hosts tagged with monitor from NetBox to Icinga2. A cleanup script checks for hosts that are monitored by Icinga2 but are missing the monitor tag, and removes them from monitoring.

From here on it’s just a matter of mapping tags or other host values in NetBox to Icinga2 service assignments. I mapped m_http to http, m_https to https and so on, you get the idea.
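
The export and cleanup logic boils down to a set difference. The sketch below is a hypothetical Python version of that decision, not the actual script: `plan_sync` and the sample data are made up, and the inputs are plain Python structures instead of API responses.

```python
def plan_sync(netbox_devices, icinga_hosts, tag="monitor"):
    """Return (to_add, to_remove) as sorted lists of host names.

    netbox_devices: {device_name: [tag, ...]} as read from NetBox
    icinga_hosts:   iterable of host names currently known to Icinga2
    """
    wanted = {name for name, tags in netbox_devices.items() if tag in tags}
    current = set(icinga_hosts)
    return sorted(wanted - current), sorted(current - wanted)


netbox = {
    "sw-core-01": ["monitor", "m_ssh"],
    "sw-acc-07": ["m_ssh"],          # monitor tag was removed
    "sw-dist-02": ["monitor"],       # new device, not yet in Icinga2
}
icinga = ["sw-core-01", "sw-acc-07"]

to_add, to_remove = plan_sync(netbox, icinga)
print("add:", to_add)
print("remove:", to_remove)
```

Computing the plan first and applying it afterwards also makes a dry run easy, which is handy before letting a cron job delete hosts from monitoring.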

NOTE

Icinga2 does not automatically evaluate hosts when changes are applied via API. See issue 6576 for details.

Improvements and notes

As I said, this is just a Minimum Viable Product. The code is available on GitHub if anyone is curious to see the details. Please keep in mind that I dedicated just a few hours of spare time to the project, so the code is not clean or fully tested. Use it at your own risk.

The script that exports from NetBox to Icinga2 runs from cron. NetBox supports webhooks, which would be a nice improvement.

Wrap up

The design, creation and testing of the scripts gave me a chance to use and learn Python, Linux and APIs. I don’t have plans to develop the project further, but things may change if opportunities arise.

I hope you enjoyed reading this post and found something useful. Feedback is welcome through the usual channels.

 