Groovy process poller
From OpenNMS
I was tasked with monitoring when a process was down and also detecting when too many instances of a process were running. I wrote a Perl script to do this, but it read in a file that held the names and IPs of the hosts to poll. I wanted to remove the dependency on files and make it more scalable, so I wrote a Groovy script to do it.
The Groovy script will SNMP poll devices that have Net-SNMP loaded on them and whose snmpd.conf is configured to monitor one or more processes. If there is an error condition (i.e. the process is not running or there are too many of them running), the script will generate an event in ONMS with the nodeid and a description of the error. Or, if you prefer, it will generate a syslog message at the priority level of "local0.warning" (default). Just comment out (with "//") or remove the section you do not want to use.
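For the syslog option, here is a minimal sketch of building the logger call as an argument list rather than one long string. The helper name loggerCommand and the values "server1" and "No ntp process running" are made up for illustration; only the logger options already used in the script are assumed.

```groovy
// Hypothetical helper (not part of ONMS): build the logger command as a list of arguments.
// Each list element is passed to logger as one argument, so the message stays in one piece.
def loggerCommand(String nodename, String message, String priority = 'local0.warning') {
    ['logger', '-p', priority, '-t', 'opennms',
     "PROCESSERROR ${nodename} has an error of ${message}".toString()]
}

// Example values only
def cmd = loggerCommand('server1', 'No ntp process running')
println cmd.join(' ')
// cmd.execute()   // uncomment on a host that has the logger utility installed
```

The priority is a parameter here so a different facility or level can be tried without editing the message itself.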
An example configuration for Net-SNMP to monitor the "ntp" and "cron" processes would be:
proc ntp 1 1
proc cron 6 1
The format is "proc NAME MAX MIN", so this means there should be at least one ntp process running and a max of 1. The cron process should have at least 1 running with a max of 6. The info needed will be in the "prTable", specifically the "prErrMessage" for each process configured.
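To make the MAX/MIN bounds concrete, here is a rough sketch of the check that each "proc" line implies. It is only an illustration: the helper names and the message strings are invented here, the real prErrMessage text comes from the Net-SNMP agent, and the parser assumes both MAX and MIN are present on the line.

```groovy
// Hypothetical helper, not part of Net-SNMP or ONMS: parse a "proc NAME MAX MIN" line.
// Assumes both MAX and MIN are given.
def parseProcLine(String line) {
    def parts = line.trim().split(/\s+/)
    assert parts.size() == 4 && parts[0] == 'proc'
    [name: parts[1], max: parts[2] as int, min: parts[3] as int]
}

// Returns an illustrative error text, or an empty string when the count is within bounds
def checkCount(Map rule, int running) {
    if (running < rule.min) return "too few ${rule.name} running (${running})".toString()
    if (running > rule.max) return "too many ${rule.name} running (${running})".toString()
    return ''
}

def cron = parseProcLine('proc cron 6 1')
println checkCount(cron, 0)   // below MIN
println checkCount(cron, 7)   // above MAX
println checkCount(cron, 3)   // within bounds, prints an empty line
```

An empty result corresponds to the case the poller ignores: the agent only fills in prErrMessage when a count is out of bounds.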
There are a couple of things to do to set ONMS up.
1. Add 'NETSNMPPROCMON' to the categories table.
Log into the PostgreSQL database and run the SQL statement below. Remember to change the categoryid number to one that is not already being used.
insert INTO categories VALUES (10,'NETSNMPPROCMON','used to id the servers that have Net-SNMP locally configured to monitor a process');
The values to be inserted (in order) are categoryid, categoryname and categorydescription.
Or the category can be added using the GUI. Go to "Admin", "Manage Categories" and add it there.
2. Once the new category is in that table, it will then need to be added to the "category_node" table.
This table tags a node to a category by attaching the nodeid to a categoryid. Run the SQL statement below, keeping the same categoryid number you used in the SQL above.
insert INTO category_node VALUES (10,345);
The values to be inserted (in order) are categoryid and nodeid. This command will have to be run for each server that will be polled.
Or all of the devices can be added to the category using the GUI. Go to "Admin", "Manage Categories" and "Edit" the category.
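Since the category_node insert has to be repeated for every server, a small sketch like the one below can print the statements for a list of nodes. The helper name categoryNodeInserts and the ids 10, 345, 346 and 347 are example values only; use the categoryid and nodeids from your own database.

```groovy
// Hypothetical helper: build one category_node INSERT statement per node id.
def categoryNodeInserts(int categoryId, List<Integer> nodeIds) {
    nodeIds.collect { nodeId ->
        "insert INTO category_node VALUES (${categoryId},${nodeId});".toString()
    }
}

// Example values only: category 10 and three node ids
categoryNodeInserts(10, [345, 346, 347]).each { println it }
```

The printed statements can then be run in the same database session as the statements above.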
The Groovy script is below.
#!/usr/bin/env groovy
import org.opennms.netmgt.snmp.*;
import org.opennms.netmgt.config.*;
import groovy.sql.*;
import groovy.xml.MarkupBuilder;

/////////
// ColumnTracker that passes every value it collects to a closure
class MyTracker extends ColumnTracker {
    Closure processor;

    public MyTracker(SnmpObjId base, Closure c) {
        super(base)
        processor = c;
    }

    protected void storeResult(SnmpObjId base, SnmpInstId inst, SnmpValue val) {
        processor.call(base, inst, val)
    }
}

// System.setProperty("org.opennms.snmp.strategyClass","org.opennms.netmgt.snmp.joesnmp.JoeSnmpStrategy");
SnmpPeerFactory.init()

// define connection to the ONMS database (assumes the default database name "opennms")
def sql = Sql.newInstance("jdbc:postgresql://127.0.0.1:5432/opennms", "opennms", "opennms", "org.postgresql.Driver")

// the query spans several lines, so it needs a triple-quoted string
List ips = sql.rows("""select DISTINCT b.nodeid, nodelabel, a.ipaddr from node AS b CROSS JOIN ipinterface AS a CROSS JOIN category_node AS c
CROSS JOIN categories AS d where b.nodeid = a.nodeid and issnmpprimary = 'P' and c.nodeid = a.nodeid and c.categoryid = d.categoryid
and d.categoryname = 'NETSNMPPROCMON'""")

// Let's go through the list of data and do something with each
ips.each {
    // keep these as local variables: "it" means something else inside the nested closures
    def nodename = it.nodelabel
    def nodeId = it.nodeid
    def config = SnmpPeerFactory.getInstance().getAgentConfig(InetAddress.getByName(it.ipaddr));
    // the oid below polls the net-snmp 'prErrMessage' info in the process entry table
    SnmpObjId system = SnmpObjId.get(".1.3.6.1.4.1.2021.2.1.101")
    // the closure is passed inside the parentheses so that newer Groovy versions
    // do not read it as an anonymous class body
    ColumnTracker tracker = new MyTracker(system, { base, inst, val ->
        if (val =~ /running/) {
            // the next 2 lines generate a syslog msg
            // def command = "logger -p local0.warning -t opennms PROCESSERROR $nodename has an error of $val" // Create the String
            // def proc = command.execute() // Call *execute* on the string
            /////////////
            // the next section will generate an event in ONMS
            //////////////
            Socket socket = new Socket("127.0.0.1", 5817);
            socket.outputStream.withWriter { out ->
            // System.out.withWriter { out ->
                def xml = new MarkupBuilder(out);
                xml.log {
                    events {
                        event {
                            uei("uei.opennms.org/nodes/processError")
                            source("sendEvent-groovy")
                            time(new Date())
                            // nodeid of the node being polled (was hard-coded to "3")
                            nodeid(nodeId)
                            parms {
                                parm {
                                    parmName("nodelabel")
                                    value(encoding:'text', "$nodename")
                                }
                                parm {
                                    parmName("procerror")
                                    value(encoding:'text', "$val")
                                }
                            }
                        }
                    }
                }
            }
        } else {
            // do nothing because it's empty
        }
    })
    def walker = SnmpUtils.createWalker(config, "system", tracker)
    walker.start();
    walker.waitFor();
}
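To look at the event XML before pointing the script at port 5817, a sketch like the following renders the same structure into a string. The helper name buildEventXml and the values 345, "server1" and "No ntp process running" are assumptions for illustration; the element names are the ones used in the script.

```groovy
import groovy.xml.MarkupBuilder

// Hypothetical helper: render the event XML into a String instead of writing to a socket.
def buildEventXml(nodeId, String nodeLabel, String procError) {
    def out = new StringWriter()
    def xml = new MarkupBuilder(out)
    xml.log {
        events {
            event {
                uei("uei.opennms.org/nodes/processError")
                source("sendEvent-groovy")
                time(new Date())
                nodeid(nodeId)
                parms {
                    parm {
                        parmName("nodelabel")
                        value(encoding: 'text', nodeLabel)
                    }
                    parm {
                        parmName("procerror")
                        value(encoding: 'text', procError)
                    }
                }
            }
        }
    }
    out.toString()
}

// Example values only
println buildEventXml(345, 'server1', 'No ntp process running')
```

This is the same idea as the commented-out "System.out.withWriter" line in the script, just wrapped in a function so the output can be inspected or compared.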
The Groovy script just needs to be launched via cron every 30 minutes (for example) to check for processes that are not running or that have too many instances running. I haven't tried having ONMS launch it (automation?).
Again, the dependency is on Net-SNMP being installed and configured to monitor processes on a server.
This is based on one of Matt B's Groovy scripts, so kudos to him for sharing. The next adaptation of this will be to have ONMS hold all the data and not rely on Net-SNMP being configured to monitor processes. ONMS will have the process name and the min/max number of those processes that can be running for each server.
Comments welcome.








