
A guide to setting up the mon daemon on Linux

So you have set up some services, and now you want to make sure the host answers pings and the web server is responding. But hang on a minute: you only want to check the web server while the router is pinging. And perhaps you don't care whether the router itself is down. And who lets you know when things come back up? You can see how everything slowly starts to become complicated.

This is where our mon friend comes in. So why use mon?

  • Easy to install
  • Ready-to-go, off-the-shelf monitors (ping, httpd, ftp)
  • Easy to add custom monitors (check if a file exists on a server, etc.)
  • Easy to set up alerts (email, file, etc.)

Let's get our hands dirty and install the mon daemon. For this tutorial I will be using Fedora.

yum install mon

The above will install the binaries and configuration files as required. Let's start by looking at the configuration file, /etc/mon/mon.cf. This is the main file that defines what our monitors will be. There are three main parts to the mon.cf file.

  1. Global configuration
  2. Group Definition
  3. Watch Definition

Global Configuration

cfbasedir   = /etc/mon
pidfile     = /var/run/mon.pid
statedir    = /var/lib/mon/state.d
logdir      = /var/lib/mon/log.d
dtlogfile   = /var/lib/mon/log.d/downtime.log
alertdir    = /usr/lib64/mon/alert.d
mondir      = /usr/lib64/mon/mon.d
maxprocs    = 20
histlength  = 100 
randstart   = 60s 
authtype    = pam 
userfile    = /etc/mon/userfile

The above shows where all the files that the daemon uses are located. You can explore these directories and dig a little deeper to discover more, or use man mon. The directories I recommend you investigate are mondir and alertdir.

The mondir stores the monitoring scripts. It comes pre-loaded with scripts for common monitors like ping and httpd checks, but you can add your own, in any language. The most interesting part of the monitoring scripts is that the daemon only cares about the return code of the script. So if you create a custom monitor script, remember that exit 0 is success and any non-zero exit code is a failure.
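To make the exit-code rule concrete, here is a minimal sketch of a custom monitor in Python. The file name file_exists.monitor is made up, and I am assuming that mon hands the items to check to the script as command-line arguments. For simplicity the sketch treats each argument as a local file path rather than a host name.

```python
#!/usr/bin/env python3
"""file_exists.monitor: hypothetical custom monitor (name is made up).

Assumption: the items to check arrive as command-line arguments.
Each argument is treated as a local file path in this sketch.
"""
import os
import sys


def missing_paths(paths):
    """Return the paths that do not exist, keeping the input order."""
    return [p for p in paths if not os.path.exists(p)]


def main(argv):
    missing = missing_paths(argv)
    if missing:
        # Print what failed so a human can see it in the alert.
        print(" ".join(missing))
        return 1  # any non-zero exit code marks the check as failed
    return 0  # zero means the check passed


# With no arguments there is nothing to check, so nothing is run.
if __name__ == "__main__" and sys.argv[1:]:
    sys.exit(main(sys.argv[1:]))
```

Drop something like this into your mondir, make it executable, and refer to it by file name on a monitor line.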

The alertdir stores the alert handling scripts. For example, the alert that sends out the email can be found here. You can customize these or add your own alert handlers here.
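A custom alert is just another script. The sketch below appends one line per alert to a log file. It rests on assumptions you should check against man mon: that alerts are called with -s (service), -g (group), -h (hosts), -t (timestamp) and -u (for an upalert), and that the first line on standard input is a summary. The script name and LOG_PATH are hypothetical.

```python
#!/usr/bin/env python3
"""logfile.alert: hypothetical custom alert (name and log path are made up)."""
import argparse
import sys

LOG_PATH = "/var/log/mon-alerts.log"  # hypothetical location


def parse_alert_args(argv):
    # add_help=False frees -h, which is assumed to carry the host list.
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("-s", dest="service", default="")
    parser.add_argument("-g", dest="group", default="")
    parser.add_argument("-h", dest="hosts", default="")
    parser.add_argument("-t", dest="timestamp", default="")
    parser.add_argument("-u", dest="upalert", action="store_true")
    # Anything this sketch does not know about is ignored.
    args, _unknown = parser.parse_known_args(argv)
    return args


def format_entry(args, summary):
    state = "UP" if args.upalert else "DOWN"
    return (f"{args.timestamp} {state} {args.group}/{args.service} "
            f"hosts={args.hosts} summary={summary}")


def main(argv):
    args = parse_alert_args(argv)
    summary = sys.stdin.readline().strip()  # assumed: first line is a summary
    with open(LOG_PATH, "a") as log:
        log.write(format_entry(args, summary) + "\n")
    return 0


# Only run when called with arguments, as the daemon would call it.
if __name__ == "__main__" and sys.argv[1:]:
    sys.exit(main(sys.argv[1:]))
```

Because the same script handles both directions, it can be named on an alert line and on an upalert line alike.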

Let's start setting up some monitoring.

Group Definition

Before you can define monitors you need to set up watch-groups. Watch-groups are just a way of grouping multiple servers of the same type together, and you can define them as you wish. A group is defined as: hostgroup <name> <host> [<host> ...]. For example:

hostgroup router router.localhost
hostgroup webservers web1.shahmirj.com web2.shahmirj.com

In our case we have a server sitting in an isolated network that checks whether web1.shahmirj.com and web2.shahmirj.com are responding. We first want to ensure we have an internet connection, so we set up a watch-group called router, which monitors our server's access to the internet, and a second, external watch-group called webservers, which monitors the servers that are dear to us. Let's look at the router watch-group first:

Watch Definition

watch router
    service ping
        interval 1m
        monitor ping.monitor
        period wd {Mon-Sun}

watch router names the hostgroup you would like to monitor.

service ping gives the service a name. A watch can contain several service blocks, each with its own monitor.

interval 1m specifies how often to run the service, in this case every minute. A plain integer is taken as seconds, whereas an integer followed by m or h denotes minutes or hours.
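If the suffix rule is easier to read as code, here is a small illustration of it. This is not mon's own parser, only a sketch of the rule as described above, and interval_to_seconds is a made-up name. It handles whole numbers only.

```python
# Seconds per unit suffix. "s" appears in the randstart setting above.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def interval_to_seconds(value):
    """Convert an interval such as '1m' or '30' to a number of seconds."""
    value = value.strip()
    if value and value[-1] in UNIT_SECONDS:
        return int(value[:-1]) * UNIT_SECONDS[value[-1]]
    return int(value)  # no suffix: already in seconds


print(interval_to_seconds("1m"))   # 60
print(interval_to_seconds("15m"))  # 900
print(interval_to_seconds("30"))   # 30
```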

monitor ping.monitor is the actual monitor script that is run. This script is included with the mon daemon and can be found in /usr/lib64/mon/mon.d/. You can change this directory using the mondir config variable mentioned above. If you browse through the directory you will see a whole host of other scripts that can be handy.

period wd {Mon-Sun} lets you set when to monitor the watch-group router. In our example it's set to every day of the week.

Underneath our watch router we now define the watch for our webservers, so our definition will look as follows:

watch webservers
    service ping
        interval 15m
        failure_interval 1m
        monitor ping.monitor
        depend router:ping
        period wd {Mon-Sun}
            alertafter 1
            alert mail.alert -f mon@shahmirj.com me@shahmirj.com
            alertevery 1h
            upalert mail.alert -f mon@shahmirj.com -S "Ping is responding" me@shahmirj.com

failure_interval 1m sets the monitoring interval to use while the monitor is failing, so a failed check is retried every minute instead of every 15 minutes.

depend router:ping lets you use dependencies in monitoring. The format is depend <watchgroup>:<service>. In our case we make sure that the router is pinging before continuing to monitor the web servers.

alertafter 1 sets how many times a monitor should fail in succession before an alert is sent out.

alert defines how an alert should be raised. In our case we send an email to me@shahmirj.com. The -f option sets the From field of the email. No alert is sent unless one is specified.

upalert is a helpful trigger that fires when a monitor that previously failed starts succeeding again. It follows the same rules as alert.
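To see how alertafter and upalert play together, here is a toy model of the rules as I have described them. It is my own simplification and not mon's code: it ignores alertevery (one alert is sent per outage) and only sends an upalert if an alert went out first. simulate_alerts is a made-up name.

```python
def simulate_alerts(results, alertafter=1):
    """Return (index, event) pairs for a run of check results.

    results: iterable of booleans, True meaning the check passed.
    """
    events = []
    consecutive_failures = 0
    alerted = False
    for i, ok in enumerate(results):
        if ok:
            if alerted:
                # Recovery after an alert was sent: raise the upalert.
                events.append((i, "upalert"))
            consecutive_failures = 0
            alerted = False
        else:
            consecutive_failures += 1
            if consecutive_failures >= alertafter and not alerted:
                events.append((i, "alert"))
                alerted = True
    return events


checks = [True, False, False, True]
print(simulate_alerts(checks, alertafter=1))  # [(1, 'alert'), (3, 'upalert')]
print(simulate_alerts(checks, alertafter=2))  # [(2, 'alert'), (3, 'upalert')]
print(simulate_alerts(checks, alertafter=3))  # []
```

With alertafter 3 the two failures never reach the threshold, so in this model neither an alert nor an upalert is sent.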

Debugging and Deployment

To test that everything is working I usually use the following command, and manually break a few things to check that the alerts fire. If you are sending email alerts, make sure the alert email is getting through your spam filters.

$ mon -d
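Since the daemon only looks at the exit code, a monitor can also be exercised on its own, without mon in the picture. The helper below is a sketch of that idea: run_monitor is a made-up name, and treating the first line of output as a summary is my assumption.

```python
#!/usr/bin/env python3
"""Run a monitor script by hand and show what the daemon would see."""
import subprocess
import sys


def run_monitor(command, targets):
    """Run command with targets appended; return (exit code, first output line)."""
    proc = subprocess.run(
        list(command) + list(targets),
        capture_output=True,
        text=True,
    )
    lines = proc.stdout.splitlines()
    summary = lines[0] if lines else ""
    return proc.returncode, summary


# Usage: python3 run_monitor.py /path/to/some.monitor target [target ...]
if __name__ == "__main__" and sys.argv[1:]:
    code, summary = run_monitor([sys.argv[1]], sys.argv[2:])
    print("exit code:", code)
    print("summary:", summary)
```

For example, pointing it at ping.monitor in the mondir with router.localhost as the target shows whether that check passes before you wire it into mon.cf.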

Once all is well, start the daemon and set it to start on boot.

$ chkconfig mon on
$ service mon start

In conclusion

The mon daemon is one of the most powerful monitoring tools out there. From the above you can see how a simple monitor can be tuned to your exact needs and deliver peace of mind about your systems.

I recommend reading the following to get the full power of the mon daemon

Any questions, just ask.

By

20th Sep 2012
© 2011 Shahmir Javaid - http://shahmirj.com/blog/29

Nash

28th Oct 2012

God help me, I put aside a whole afternoon to figure this out.


