Plugin: check_mem - Linux Memory Usage

A plugin written in Perl that monitors memory and swap usage and checks them against thresholds, based on the output of the 'free -mt' command.
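
As a rough illustration of the idea (not the plugin's actual Perl code), the shell sketch below reads 'free -mt' style output and derives usage percentages.  The function name mem_swap_pct is hypothetical, and the column positions assume the usual layout in which the second field is the total and the third is the used amount.

```shell
# Hypothetical helper: read `free -mt` style text on stdin and print
# "MemUsed=<pct>% SwapUsed=<pct>%".
# Assumption: field 2 = total, field 3 = used on the Mem: and Swap: lines.
mem_swap_pct() {
    awk '
        /^Mem:/  { mem  = ($2 > 0) ? int($3 * 100 / $2) : 0 }
        /^Swap:/ { swap = ($2 > 0) ? int($3 * 100 / $2) : 0 }
        END      { printf "MemUsed=%d%% SwapUsed=%d%%\n", mem, swap }
    '
}

# Usage on a live system: free -mt | mem_swap_pct
```

The guard on the total avoids a division by zero on hosts that have no swap configured.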

Ref: Nagios Exchange - check_mem page

Installation

  1. Copy the file to the /usr/local/nagios/libexec directory on the host you are monitoring.
  2. Set the file mode to 755.
  3. Add the following line to your nrpe.cfg file and restart the NRPE service:
       command[check_mem]=/usr/local/nagios/libexec/check_mem -w 80,20 -c 95,50
  4. Add the NRPE check to the appropriate configuration file on your Nagios server, for example:
define service{
        use                     servicetemplate2
        hostgroup_name          linux-servers
        service_description     mem
        check_command           check_nrpe!check_mem
}
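
The NRPE side of the installation can be sketched as a small shell helper.  The function name add_nrpe_check is hypothetical, and the nrpe.cfg location and the way the NRPE service is restarted differ between distributions, so both are left to the caller.

```shell
# Hypothetical helper: append the check_mem command definition to an
# nrpe.cfg file unless a check_mem definition is already present.
add_nrpe_check() {
    cfg=$1
    line='command[check_mem]=/usr/local/nagios/libexec/check_mem -w 80,20 -c 95,50'
    # -F treats the pattern as a fixed string, so the brackets are literal.
    if ! grep -qF 'command[check_mem]=' "$cfg" 2>/dev/null; then
        printf '%s\n' "$line" >> "$cfg"
    fi
}

# Example (paths are assumptions; adjust them to your install):
#   install -m 755 check_mem /usr/local/nagios/libexec/check_mem
#   add_nrpe_check /usr/local/nagios/etc/nrpe.cfg
#   ...then restart the NRPE service.
```

Checking for an existing definition first keeps the helper safe to run more than once.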

Command Line Syntax

# /usr/local/nagios/libexec/check_mem -w 50,20 -c 80,50
<b>WARNING: Memory Usage (W> 50, C> 80): 72% <br>Swap Usage (W> 20, C> 50): 0%</b> \
|MemUsed=72%;50;80 SwapUsed=0%;20;50
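
The sample output suggests how the thresholds are applied: each -w and -c argument is a "memory,swap" pair, and the overall state appears to be the worse of the two comparisons.  The sketch below illustrates that reading.  The name nagios_state is hypothetical, and the strictly-greater-than comparison is an assumption taken from the "W>" and "C>" labels, not from the plugin source.

```shell
# Hypothetical helper: nagios_state MEM_PCT SWAP_PCT WARN_PAIR CRIT_PAIR
# Pairs are "mem,swap" as passed to -w and -c.  Prints the state name and
# returns the conventional Nagios exit code (0 OK, 1 WARNING, 2 CRITICAL).
nagios_state() {
    mem=$1; swap=$2
    wmem=${3%,*}; wswap=${3#*,}     # split "50,20" into 50 and 20
    cmem=${4%,*}; cswap=${4#*,}
    if [ "$mem" -gt "$cmem" ] || [ "$swap" -gt "$cswap" ]; then
        echo CRITICAL; return 2
    elif [ "$mem" -gt "$wmem" ] || [ "$swap" -gt "$wswap" ]; then
        echo WARNING; return 1
    fi
    echo OK; return 0
}
```

With the values from the example, 'nagios_state 72 0 50,20 80,50' prints WARNING, because 72 exceeds the memory warning threshold of 50 but not the critical threshold of 80.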

Display the Data

To graph this data with NagiosGraph, append the following to your "map" file:

# Service type: check_mem
#   check command: check_nrpe!check_mem (check_mem -w 80,20 -c 95,50 on the monitored host)
#   output: <b>CRITICAL: Memory Usage (W> 80, C> 95): 100% <br>Swap Usage (W> 20, C> 50): 0%</b>
#   perfdata: MemUsed=100%;80;95 SwapUsed=0%;20;50
/perfdata:.*MemUsed=(\d+)%;(\d+);(\d+).*?SwapUsed=(\d+)%;(\d+);(\d+)/
and push @s, [ memory,
       [ ramuse, GAUGE, $1 ],
       [ swapuse, GAUGE, $4 ] ];
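
To sanity-check the pattern before editing the map file, the same capture groups can be exercised from the shell.  This is only a sketch: extract_mem_perf is a hypothetical name, and 'sed -E' with [0-9] stands in for the Perl regular expression that NagiosGraph itself evaluates.

```shell
# Hypothetical helper: pull the MemUsed and SwapUsed values out of a
# perfdata string, mirroring capture groups $1 and $4 of the map entry.
# Prints nothing if the string does not match.
extract_mem_perf() {
    printf '%s\n' "$1" |
        sed -n -E 's/.*MemUsed=([0-9]+)%;([0-9]+);([0-9]+).*SwapUsed=([0-9]+)%;([0-9]+);([0-9]+).*/ramuse=\1 swapuse=\4/p'
}

# Usage: extract_mem_perf 'perfdata:MemUsed=100%;80;95 SwapUsed=0%;20;50'
```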

Dumping Linux Buffer Cache

[Screenshot: top output]

A useful command for checking Linux system resources is "top".  Because of the buffer cache, however, it may show almost no free memory (see the screenshot).  But how can that be?  All you have running may be a Java app, Apache, and a few other services.  There is no way those should be using ALL of that RAM.  The explanation is that the kernel uses otherwise idle memory to cache disk data and hands it back when applications need it.  In my case, the Nagios "check_mem" plugin queries for available memory and was throwing alerts quite regularly.

The "free -mt" command can show you how much memory is cached.  That eases my mind a bit.  While googling about buffer cache, I stumbled upon this article:

http://devcs.blogspot.com/2007/12/linux-buffer-cache-how-to-disable-it.html

This article gives a good rundown of what the buffer cache is all about.  It also mentions a nice little trick to dump the cache (run it as root):

echo 1 > /proc/sys/vm/drop_caches

VOILA!  Cache dumped and Nagios is happy.  Dumping the cache shouldn't be taken lightly, though: depending on your server it MAY hurt performance for a while, because data that was cached has to be read from disk again.  The cache will slowly build back up afterwards.
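
A slightly more careful version of that command, as a sketch: it runs 'sync' first, since only clean cache pages can be dropped.  The name drop_page_cache and its optional path argument are hypothetical (the argument exists only so the function can be tried against a scratch file); against the real control file it has to run as root.

```shell
# Hypothetical helper: flush dirty pages, then ask the kernel to drop the
# page cache.  The target path defaults to the real control file.
drop_page_cache() {
    target=${1:-/proc/sys/vm/drop_caches}
    sync                    # write dirty pages to disk first
    echo 1 > "$target"      # 1 = page cache only
}

# Usage (as root): drop_page_cache
```

Writing 1 frees only the page cache; the kernel also accepts 2 (dentries and inodes) and 3 (both).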

The check_mem script itself can also exclude buffers from its calculation of free memory.  Find this line in the script and make sure the value is "1":

my $DONT_INCLUDE_BUFFERS = 1;
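
The effect of that setting can be illustrated with the numbers 'free -m' reports.  This is a sketch of the arithmetic only, not the plugin's code: used_pct is a hypothetical name, and treating "used minus buffers minus cached" as the real usage is an assumption about what the flag does.

```shell
# Hypothetical helper: used_pct TOTAL USED BUFFERS CACHED [EXCLUDE]
# Values are in MB as printed by `free -m`; TOTAL must be greater than 0.
# With EXCLUDE=1, buffers and cache are subtracted from "used" before the
# percentage is computed (integer division, rounded down).
used_pct() {
    total=$1; used=$2; buffers=$3; cached=$4; exclude=${5:-0}
    if [ "$exclude" -eq 1 ]; then
        used=$((used - buffers - cached))
    fi
    echo $((used * 100 / total))
}
```

For a host with 2000 MB total, 1440 MB used, 100 MB of buffers and 900 MB cached, 'used_pct 2000 1440 100 900 0' prints 72, while 'used_pct 2000 1440 100 900 1' prints 22.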