JVM Monitoring using SNMP

Until we get JMX monitoring working for OpenMBeans, here are some quick notes on how to collect and graph memory management data from a Sun Java 5 VM using SNMP. This assumes that you already have a working SNMP agent on the machine. There is more than one way to do this, but this one works for us.

Configure the JVM and SNMP agent to expose the JVM Management MIB

Configure SNMP support on some non-standard port on the JVM

Add:

-Dcom.sun.management.snmp.port=9004

to your JVM Options.
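
Where exactly the option goes depends on how your application is started. The following is a minimal sketch, assuming your start script passes a JAVA_OPTS variable to the java command (some containers use their own variable name instead); app.jar is only a placeholder.

```shell
# Port used throughout these notes; any free, non-privileged port should do.
JVM_SNMP_PORT=9004

# Append the SNMP option to whatever JAVA_OPTS already contains.
JAVA_OPTS="${JAVA_OPTS:+$JAVA_OPTS }-Dcom.sun.management.snmp.port=${JVM_SNMP_PORT}"
export JAVA_OPTS

# Then start the application as usual, for example (app.jar is a placeholder):
#   java $JAVA_OPTS -jar app.jar
```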

Configure access control for the JVM

cp $JAVA_HOME/jre/lib/management/snmp.acl.template $JAVA_HOME/jre/lib/management/snmp.acl

Edit $JAVA_HOME/jre/lib/management/snmp.acl to enable:

   acl = {
     {
       communities = public
       access = read-only
       managers = localhost
     }
   }

chown/chmod $JAVA_HOME/jre/lib/management/snmp.acl so that it is owned by the user that runs the JVM and is readable only by that user.
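
The copy and permission steps can be sketched as a small shell function. This is only an illustration: install_snmp_acl is a made-up helper name, the optional second argument is whichever account runs your JVM, and mode 600 reflects my understanding that the Sun JVM refuses to start the SNMP agent when the ACL file is readable by anyone but its owner.

```shell
# Hypothetical helper: install the ACL file and restrict its permissions.
# $1 = Java home directory, $2 = (optional) user that runs the JVM.
install_snmp_acl() {
    mgmt_dir="$1/jre/lib/management"

    # Start from the template shipped with the JRE.
    cp "$mgmt_dir/snmp.acl.template" "$mgmt_dir/snmp.acl" || return 1

    # Changing the owner normally requires root, so only do it when asked.
    if [ -n "$2" ]; then
        chown "$2" "$mgmt_dir/snmp.acl" || return 1
    fi

    # Readable and writable by the owner only.
    chmod 600 "$mgmt_dir/snmp.acl"
}
```

Typical usage would be install_snmp_acl "$JAVA_HOME" followed by editing the acl block as shown above.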

Configure the existing agent to proxy requests for the JVM-MANAGEMENT-MIB

Configure the existing SNMP agent to proxy requests for the JVM Management MIB to the JVM agent. Edit snmpd.conf (or snmpd.local.conf, depending on how the SNMP agent is configured) and add:

proxy -v 2c -c public localhost:9004 .1.3.6.1.4.1.42

Bounce the JVM and the SNMP agent and you should be in business.
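
A quick way to check the setup is to walk the JVM memory subtree, first through the proxying agent and then directly against the JVM agent. The sketch below assumes the Net-SNMP command line tools are installed and uses the community string and port from these notes; the function names are made up for illustration.

```shell
# jvmMemory subtree of the JVM-MANAGEMENT-MIB (same prefix as the heap OIDs
# collected further down in these notes).
JVM_MEMORY_OID=.1.3.6.1.4.1.42.2.145.3.163.1.1.2

# Walk the subtree via the main agent on the standard port (tests the proxy).
check_jvm_via_proxy() {
    snmpwalk -v 2c -c public "${1:-localhost}" "$JVM_MEMORY_OID"
}

# Walk the subtree directly on the JVM agent (tests the JVM side only).
check_jvm_direct() {
    snmpwalk -v 2c -c public "localhost:${1:-9004}" "$JVM_MEMORY_OID"
}
```

If check_jvm_direct returns values but check_jvm_via_proxy does not, the problem is most likely in the proxy line rather than in the JVM configuration.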

Configure Data Collection

Edit datacollection-config.xml thus:


Add custom resource types for JVM memory managers and memory pools

   <resourceType name="jvmMemManagerIndex" label="JVM GC Stats" resourceLabel="${jvmMemManagerName} (index ${index})">
     <persistenceSelectorStrategy class="org.opennms.netmgt.collectd.PersistAllSelectorStrategy"/>
     <storageStrategy class="org.opennms.netmgt.dao.support.IndexStorageStrategy"/>
   </resourceType>
   <resourceType name="jvmMemPoolIndex" label="JVM Memory Pool Stats" resourceLabel="Memory Pool - ${jvmMemPoolName} (index ${index})">
     <persistenceSelectorStrategy class="org.opennms.netmgt.collectd.PersistAllSelectorStrategy"/>
     <storageStrategy class="org.opennms.netmgt.dao.support.IndexStorageStrategy"/>
   </resourceType>

Add a group to collect JVM stats

   <group name="jvm" ifType="ignore">
     <mibObj oid=".1.3.6.1.4.1.42.2.145.3.163.1.1.2.11" instance="0" alias="jvmHeapUsed" type="Gauge64" />
     <mibObj oid=".1.3.6.1.4.1.42.2.145.3.163.1.1.2.12" instance="0" alias="jvmHeapCommitted" type="Gauge64" />
     <mibObj oid=".1.3.6.1.4.1.42.2.145.3.163.1.1.2.13" instance="0" alias="jvmHeapMax" type="Gauge64" />
     <mibObj oid=".1.3.6.1.4.1.42.2.145.3.163.1.1.2.21" instance="0" alias="jvmNonHeapUsed" type="Gauge64" />
     <mibObj oid=".1.3.6.1.4.1.42.2.145.3.163.1.1.2.22" instance="0" alias="jvmNonHeapCommitted" type="Gauge64" />
     <mibObj oid=".1.3.6.1.4.1.42.2.145.3.163.1.1.2.23" instance="0" alias="jvmNonHeapMax" type="Gauge64" />
     <mibObj oid=".1.3.6.1.4.1.42.2.145.3.163.1.1.3.1" instance="0" alias="jvmThreadCount" type="Gauge64" />
     <mibObj oid=".1.3.6.1.4.1.42.2.145.3.163.1.1.4.11" instance="0" alias="jvmRTUptimeMs" type="Gauge64" />
     <mibObj oid=".1.3.6.1.4.1.42.2.145.3.163.1.1.2.100.1.2" instance="jvmMemManagerIndex" alias="jvmMemManagerName" type="string" />
     <mibObj oid=".1.3.6.1.4.1.42.2.145.3.163.1.1.2.101.1.2" instance="jvmMemManagerIndex" alias="jvmMemGCCount" type="Counter64" />
     <mibObj oid=".1.3.6.1.4.1.42.2.145.3.163.1.1.2.101.1.3" instance="jvmMemManagerIndex" alias="jvmMemGCTimeMs" type="Counter64" />
     <mibObj oid=".1.3.6.1.4.1.42.2.145.3.163.1.1.2.110.1.2" instance="jvmMemPoolIndex" alias="jvmMemPoolName" type="string" />
     <mibObj oid=".1.3.6.1.4.1.42.2.145.3.163.1.1.2.110.1.10" instance="jvmMemPoolIndex" alias="jvmMemPoolInitSize" type="Gauge64" />
     <mibObj oid=".1.3.6.1.4.1.42.2.145.3.163.1.1.2.110.1.11" instance="jvmMemPoolIndex" alias="jvmMemPoolUsed" type="Gauge64" />
     <mibObj oid=".1.3.6.1.4.1.42.2.145.3.163.1.1.2.110.1.12" instance="jvmMemPoolIndex" alias="jvmMemPoolCommit" type="Gauge64" />
     <mibObj oid=".1.3.6.1.4.1.42.2.145.3.163.1.1.2.110.1.13" instance="jvmMemPoolIndex" alias="jvmMemPoolMaxSize" type="Gauge64" />
   </group>

Include this group in a systemdef

We put it in our Net-SNMP and UCD-SNMP collections, e.g.:

     <systemDef name="Net-SNMP">
       <sysoidMask>.1.3.6.1.4.1.8072.3.</sysoidMask>
       <collect>
         <includeGroup>mib2-host-resources-system</includeGroup>
         <includeGroup>mib2-host-resources-memory</includeGroup>
         <includeGroup>net-snmp-disk</includeGroup>
         <includeGroup>net-snmp-proc</includeGroup>
         <includeGroup>ucd-loadavg</includeGroup>
         <includeGroup>ucd-memory</includeGroup>
         <includeGroup>ucd-sysstat</includeGroup>
         <includeGroup>jvm</includeGroup>
       </collect>
     </systemDef>

Restart OpenNMS to start collecting.

Configure Graphing

Edit snmp-graph.properties and add the new report names to the reports list:

jvm.heap, jvm.nonheap, jvm.threads, jvm.uptime, jvm.gc, jvm.mempool \

Then, further down, define the graphs:

report.jvm.heap.name=JVM Heap Memory
report.jvm.heap.columns=jvmHeapUsed, jvmHeapCommitted, jvmHeapMax
report.jvm.heap.type=nodeSnmp
report.jvm.heap.command=--title="JVM Heap Memory" \
 DEF:used={rrd1}:jvmHeapUsed:AVERAGE \
 DEF:comm={rrd2}:jvmHeapCommitted:AVERAGE \
 DEF:max={rrd3}:jvmHeapMax:AVERAGE \
 AREA:used#0000ff:"Used     " \
 GPRINT:used:AVERAGE:" Avg  \\: %5.2lf %s" \
 GPRINT:used:MIN:"Min  \\: %5.2lf %s" \
 GPRINT:used:MAX:"Max  \\: %5.2lf %s\\n" \
 LINE2:comm#00ff00:"Committed" \
 GPRINT:comm:AVERAGE:" Avg  \\: %5.2lf %s" \
 GPRINT:comm:MIN:"Min  \\: %5.2lf %s" \
 GPRINT:comm:MAX:"Max  \\: %5.2lf %s\\n" \
 LINE2:max#ff0000:"Max           " \
 GPRINT:max:AVERAGE:" Avg  \\: %5.2lf %s" \
 GPRINT:max:MIN:"Min  \\: %5.2lf %s" \
 GPRINT:max:MAX:"Max  \\: %5.2lf %s\\n"

report.jvm.nonheap.name=JVM Non-Heap Memory
report.jvm.nonheap.columns=jvmNonHeapUsed, jvmNonHeapCommitted, jvmNonHeapMax
report.jvm.nonheap.type=nodeSnmp
report.jvm.nonheap.command=--title="JVM Non-Heap Memory" \
 DEF:used={rrd1}:jvmNonHeapUsed:AVERAGE \
 DEF:comm={rrd2}:jvmNonHeapCommitted:AVERAGE \
 DEF:max={rrd3}:jvmNonHeapMax:AVERAGE \
 AREA:used#0000ff:"Used     " \
 GPRINT:used:AVERAGE:" Avg  \\: %5.2lf %s" \
 GPRINT:used:MIN:"Min  \\: %5.2lf %s" \
 GPRINT:used:MAX:"Max  \\: %5.2lf %s\\n" \
 LINE2:comm#00ff00:"Committed" \
 GPRINT:comm:AVERAGE:" Avg  \\: %5.2lf %s" \
 GPRINT:comm:MIN:"Min  \\: %5.2lf %s" \
 GPRINT:comm:MAX:"Max  \\: %5.2lf %s\\n" \
 LINE2:max#ff0000:"Max          " \
 GPRINT:max:AVERAGE:" Avg  \\: %5.2lf %s" \
 GPRINT:max:MIN:"Min  \\: %5.2lf %s" \
 GPRINT:max:MAX:"Max  \\: %5.2lf %s\\n"
 
report.jvm.threads.name=JVM Threads
report.jvm.threads.columns=jvmThreadCount
report.jvm.threads.type=nodeSnmp
report.jvm.threads.command=--title="JVM Thread Count" \
 DEF:threads={rrd1}:jvmThreadCount:AVERAGE \
 LINE2:threads#0000ff:"Threads" \
 GPRINT:threads:AVERAGE:" Avg \\: %8.2lf %s" \
 GPRINT:threads:MIN:"Min  \\: %8.2lf %s" \
 GPRINT:threads:MAX:"Max  \\: %8.2lf %s\\n"
 
report.jvm.uptime.name=JVM Uptime
report.jvm.uptime.columns=jvmRTUptimeMs
report.jvm.uptime.type=nodeSnmp
report.jvm.uptime.command=--title="JVM Uptime" \
 --vertical-label Hours \
 DEF:time={rrd1}:jvmRTUptimeMs:AVERAGE \
 CDEF:hours=time,3600000,/ \
 LINE2:hours#0000ff:"JVM Uptime (Hours)" \
 GPRINT:hours:AVERAGE:"Avg  \\: %8.1lf %s" \
 GPRINT:hours:MIN:"Min  \\: %8.1lf %s" \
 GPRINT:hours:MAX:"Max  \\: %8.1lf %s\\n"

report.jvm.gc.name=JVM GC Time
report.jvm.gc.columns=jvmMemGCCount, jvmMemGCTimeMs
report.jvm.gc.type=jvmMemManagerIndex
report.jvm.gc.propertiesValues=jvmMemManagerName
report.jvm.gc.command=--title="JVM GC Time {jvmMemManagerName}" \
 DEF:gccount={rrd1}:jvmMemGCCount:LAST \
 DEF:gctimems={rrd2}:jvmMemGCTimeMs:LAST \
 CDEF:gctimes=gctimems,1000,/ \
 LINE2:gctimes#ff0000:"Time (s) :" \
 GPRINT:gctimes:LAST:" Current  \\: %8.2lf %s" \
 GPRINT:gctimes:MIN:"Min  \\: %8.2lf %s" \
 GPRINT:gctimes:MAX:"Max  \\: %8.2lf %s\\n"

report.jvm.mempool.name=JVM Memory Pool
report.jvm.mempool.columns=jvmMemPoolInitSize, jvmMemPoolUsed, jvmMemPoolMaxSize, jvmMemPoolCommit
report.jvm.mempool.type=jvmMemPoolIndex
report.jvm.mempool.propertiesValues=jvmMemPoolName
report.jvm.mempool.command=--title="JVM Memory Pool - {jvmMemPoolName}" \
 --vertical-label="Bytes" \
 --base=1024 \
 DEF:used={rrd1}:jvmMemPoolUsed:AVERAGE \
 DEF:max={rrd2}:jvmMemPoolMaxSize:AVERAGE \
 DEF:init={rrd3}:jvmMemPoolInitSize:AVERAGE \
 DEF:commit={rrd4}:jvmMemPoolCommit:AVERAGE \
 AREA:used#00a876:"Used     " \
 GPRINT:used:AVERAGE:" Average  \\: %5.2lf %s" \
 GPRINT:used:MIN:"Min  \\: %5.2lf %s" \
 GPRINT:used:MAX:"Max  \\: %5.2lf %s\\n" \
 LINE2:max#FF5900:"Max      " \
 GPRINT:max:AVERAGE:" Average  \\: %5.2lf %s" \
 GPRINT:max:MIN:"Min  \\: %5.2lf %s" \
 GPRINT:max:MAX:"Max  \\: %5.2lf %s\\n" \
 LINE2:commit#1047a9:"Committed" \
 GPRINT:commit:AVERAGE:" Average  \\: %5.2lf %s" \
 GPRINT:commit:MIN:"Min  \\: %5.2lf %s" \
 GPRINT:commit:MAX:"Max  \\: %5.2lf %s\\n" \
 LINE2:init#000000:"Initial  " \
 GPRINT:init:AVERAGE:" Average  \\: %5.2lf %s" \
 GPRINT:init:MIN:"Min  \\: %5.2lf %s" \
 GPRINT:init:MAX:"Max  \\: %5.2lf %s\\n"

The "JVM GC Time" entry above may not work as written. On OpenNMS 1.7, I had to change the consolidation function in the DEF lines from LAST to AVERAGE, so that they look like this:

DEF:gccount={rrd1}:jvmMemGCCount:AVERAGE \
DEF:gctimems={rrd2}:jvmMemGCTimeMs:AVERAGE \

Supported Versions

Confirmed to work with 1.6.1