This page is under regular updates. Please check back later for more content.

Nagios - Continuous Monitoring Tool

Phases of Continuous Monitoring

  1. Define - Develop a monitoring strategy
  2. Establish - How frequently you're going to monitor
  3. Implement - Install the tools
  4. Analyze data and report findings - Gather the information and visualise it in every useful way to find issues and monitor the resources/infrastructure
  5. Respond - Take actions based on the gathered information
  6. Renew and Update - Repeat the process

Overview of Nagios

  • Nagios is open source software for continuous monitoring of systems, networks and infrastructure. It runs plugins stored on a server that is connected to a host or another server on your network or the internet.
  • When a failure occurs, such as a CI/CD pipeline failure, an application failure or an infrastructure failure, Nagios sends an alert so that the technical team can start the recovery process immediately.
  • It uses ports 5666 - 5668 to monitor its clients (NRPE listens on 5666 by default); the ports can be changed.
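
Before pointing Nagios at a client, it can help to confirm that the monitoring port is reachable at all. The following is a minimal sketch using only the Python standard library; the function name is_port_open and the example host are assumptions for illustration, and 5666 is used because it is the default NRPE port.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Example usage (hypothetical client address, default NRPE port):
#   is_port_open("192.0.2.10", 5666)
```

A successful connection only shows that something is listening on the port; it does not prove that the listener is an NRPE agent or that it accepts requests from your server.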

Other monitoring tools include -

  • Splunk
  • CloudWatch
  • ELK
  • Sensu etc.

History

  • 1999: Ethan Galstad developed it under the name NetSaint.
  • 2002: Ethan renamed the project to "Nagios" because of trademark issues with the name "NetSaint".
  • 2009: Nagios released its first commercial version, Nagios XI.
  • 2012: Nagios was renamed again, to Nagios Core.

Why Nagios?

  • Detects all types of network and server issues
  • Helps you find the root cause of a problem, which lets you apply a permanent solution
  • Reduces downtime
  • Active monitoring of the entire infrastructure
  • Allows you to monitor and troubleshoot server performance issues
  • Automatically fixes problems (if configured)
  • Active monitoring means the monitoring server keeps checking the client infrastructure/resources automatically, without human intervention.
  • Passive monitoring means the client reports its own status to the server. If the client crashes and stops reporting, the Nagios server assumes there is a problem.
  • Nagios is capable of both active and passive monitoring.
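
The difference between active and passive monitoring can be sketched in a few lines of Python. This is a toy model and not Nagios code: the Monitor class, its method names and the 60-second freshness window are all assumptions made for illustration.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

OK, CRITICAL = "OK", "CRITICAL"


@dataclass
class Monitor:
    """Toy monitoring server (hypothetical, for illustration only)."""
    freshness_seconds: float = 60.0
    status: Dict[str, str] = field(default_factory=dict)
    last_report: Dict[str, float] = field(default_factory=dict)

    def active_check(self, host: str, probe: Callable[[], bool]) -> str:
        """Active: the server itself runs the probe against the client."""
        self.status[host] = OK if probe() else CRITICAL
        return self.status[host]

    def passive_report(self, host: str, state: str,
                       now: Optional[float] = None) -> None:
        """Passive: the client pushes its own state to the server."""
        self.status[host] = state
        self.last_report[host] = time.time() if now is None else now

    def passive_status(self, host: str, now: Optional[float] = None) -> str:
        """A client that never reported, or went silent, is treated as failed."""
        now = time.time() if now is None else now
        last = self.last_report.get(host)
        if last is None or now - last > self.freshness_seconds:
            return CRITICAL
        return self.status[host]
```

In the active case the server learns about a failure as soon as its own probe fails. In the passive case it can only infer a failure from the absence of recent reports, which is why the sketch tracks the time of the last report.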

Features of Nagios

  • Long-established and still actively maintained
  • Has its own database and a secured dashboard
  • Good logging and database system
  • Informative and attractive user interface
  • Automatically sends alerts when conditions change
  • Helps you detect network errors and server crashes
  • Lets you monitor the entire business process and IT infrastructure in a single pass
  • Monitor network services like HTTP, SMTP, FTP, LDAP, IPMI, DNS etc.

Nagios Architecture

  • It uses a client-server architecture.
  • Usually, on a network, a Nagios server runs on one host and plugins run on all the remote hosts that you want to monitor.
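
Nagios plugins report their result through the process exit code (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN) plus a line of status text. The sketch below shows what a small disk-usage plugin could look like in Python. The function names and the 80%/90% thresholds are assumptions for illustration, not part of any shipped plugin.

```python
import shutil
from typing import Tuple

# Exit codes defined by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(used_percent: float, warn: float = 80.0, crit: float = 90.0) -> int:
    """Map a usage percentage to a plugin exit code (thresholds are assumed)."""
    if used_percent >= crit:
        return CRITICAL
    if used_percent >= warn:
        return WARNING
    return OK


def check_disk(path: str = "/", warn: float = 80.0,
               crit: float = 90.0) -> Tuple[int, str]:
    """Return (exit_code, status_line) for the filesystem containing path."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        # The check itself could not run, so the state is UNKNOWN.
        return UNKNOWN, f"DISK UNKNOWN - {exc}"
    used_percent = 100.0 * usage.used / usage.total
    code = evaluate(used_percent, warn, crit)
    return code, f"DISK {LABELS[code]} - {used_percent:.1f}% used on {path}"


def main() -> int:
    code, message = check_disk("/")
    print(message)
    return code

# A real plugin script would finish with: sys.exit(main())
```

Keeping the threshold logic in evaluate separate from the measurement makes the plugin easy to test without touching a real disk.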

How does it work?

  • You specify all the details in the configuration files.
  • The daemon reads those details to learn what data to collect.
  • The daemon uses NRPE (Nagios Remote Plugin Executor) to collect data from the nodes, where NRPE agents run for several different purposes, and stores it in the database.
  • Finally, everything is shown in the dashboard.

Prerequisites & Installation

The following must be installed before running the Nagios server -

  • httpd for the web interface
  • php for the dashboard
  • gcc & gd - gcc is the compiler that builds Nagios from source, gd is a graphics library
  • make to build
  • perl for scripting

To install and start the Nagios server, click here.

Main configuration file

  • It is located at /usr/local/nagios/etc/nagios.cfg
  • Everything that is monitored is called a service. For example, if you monitor 2 services per node on 5 nodes, you have 10 services in total.
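
The service arithmetic above can be made concrete with a short Python sketch that generates one Nagios service definition per (node, check) pair. The host names node1 to node5, the two checks, the generic-service template and the check_nrpe!... command strings are assumptions for illustration; your own object configuration may use different names.

```python
from itertools import product

# Hypothetical inventory: 5 nodes and 2 services per node.
hosts = [f"node{i}" for i in range(1, 6)]
checks = {
    "CPU Load": "check_nrpe!check_load",     # assumed command name
    "Disk Usage": "check_nrpe!check_disk",   # assumed command name
}


def service_definition(host: str, description: str, command: str) -> str:
    """Render one service object in Nagios object-definition syntax."""
    return (
        "define service {\n"
        "    use                  generic-service\n"
        f"    host_name            {host}\n"
        f"    service_description  {description}\n"
        f"    check_command        {command}\n"
        "}\n"
    )


definitions = [
    service_definition(host, description, command)
    for host, (description, command) in product(hosts, checks.items())
]

print(len(definitions))  # 10
print(definitions[0])
```

Generating the definitions from an inventory like this keeps the count obvious: the number of services is the number of nodes multiplied by the number of checks per node.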