Monitoring Asterisk with OpenNMS

Yesterday, in Matt Riddell’s daily Asterisk news, I saw this article, which lays out a recipe for basic Asterisk monitoring using Munin. OpenNMS has included out-of-the-box support for Asterisk management for some time now, but the functionality is often hidden for reasons I’ll get to in a moment. This article provides a set of steps for switching it on.

You’ll need to be running Asterisk 1.4 or 1.6 (more statistics are available via SNMP in 1.6) and OpenNMS 1.6.8 or later (earlier versions have an SNMP data collection bug; there’s a workaround, but it’s better just to upgrade).

The Munin article uses a set of Perl scripts on the Asterisk server that execute commands directly on the Asterisk CLI, parse the output, and ship the resulting data back to Munin. We tend to do things a bit differently in OpenNMS; to borrow a quote from Bobby, one of our prime directives is “thou shalt not fork.” We also tend to prefer standards-track protocols (particularly SNMP) over ad-hoc ones. The first step, then, is to get Asterisk speaking SNMP. Many people are unaware that Asterisk has included a baked-in SNMP (sub)agent since the 1.4 releases. It’s not the easiest bit of Asterisk to get working properly, and I won’t even touch that topic here since somebody else has already done a good write-up on the process. The one point on which I will diverge from that write-up is its recommendation of SNMPv3 with authentication and privacy enabled: while that is great for security and a very good idea in production, SNMPv2c is much more straightforward to troubleshoot while you’re getting the solution going.

If you’re able, from the OpenNMS server, to read values from the Asterisk MIB via the SNMP agent on the Asterisk server, then you’ve done everything right on the Asterisk server. You can check this using the snmpwalk command from the Net-SNMP tools:

opennms$ snmpwalk -v2c -c public 192.168.2.5 .1.3.6.1.4.1.22736.1.1
SNMPv2-SMI::enterprises.22736.1.1.1.0 = STRING: "1.6.0.6"
SNMPv2-SMI::enterprises.22736.1.1.2.0 = Gauge32: 999999

Or with the pure-Java SNMP4J library that OpenNMS uses:

opennms$ java -cp /opt/opennms/lib/snmp4j-1.9.3c.jar \
  org.snmp4j.tools.console.SnmpRequest \
  -v 2c -c public -Ow 192.168.2.5/161 .1.3.6.1.4.1.22736.1.1
Jan 7, 2010 2:17:34 PM org.snmp4j.log.JavaLogAdapter log
INFO: UDP receive buffer size for socket 192.168.23.73/0 is set to: 65507
1.3.6.1.4.1.22736.1.1.1.0 = 1.6.0.6
1.3.6.1.4.1.22736.1.1.2.0 = 999999

Total requests sent:    1
Total objects received: 2
Total walk time:        34 milliseconds
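
If you'd rather script this check than eyeball it, the snmpwalk output is easy to parse. What follows is a minimal sketch, assuming Net-SNMP's default output format as shown above; the function name asterisk_version is my own and not part of any tool. It pulls the version string out of the first row of the walk. If you have the Asterisk MIB loaded, the row will carry a symbolic name instead and this pattern won't match.

```python
import re

# Matches the version row whether the OID is printed with the
# "enterprises" prefix or fully numeric, e.g.
#   SNMPv2-SMI::enterprises.22736.1.1.1.0 = STRING: "1.6.0.6"
VERSION_ROW = re.compile(
    r'(?:enterprises|\.1\.3\.6\.1\.4\.1)\.22736\.1\.1\.1\.0 = STRING: "([^"]*)"'
)

def asterisk_version(snmpwalk_output):
    """Return the Asterisk version string found in snmpwalk output, or None."""
    for line in snmpwalk_output.splitlines():
        match = VERSION_ROW.search(line)
        if match:
            return match.group(1)
    return None

sample = '''SNMPv2-SMI::enterprises.22736.1.1.1.0 = STRING: "1.6.0.6"
SNMPv2-SMI::enterprises.22736.1.1.2.0 = Gauge32: 999999'''
print(asterisk_version(sample))  # prints 1.6.0.6
```

A result of None means the agent answered without the Asterisk subtree (or didn't answer at all), which is the case to sort out before touching any OpenNMS configuration.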

Add the Asterisk server to OpenNMS and verify in the OpenNMS web UI that the SNMP service shows up on one or more of the Asterisk server’s interfaces. If you see a populated SNMP Attributes box on the node’s detail page:

And if resource graphs for node-level and interface-level performance data are available:

Then you’re ready to proceed with configuring OpenNMS. We’ll be editing three files (that number will drop to two in OpenNMS 1.8). Of course you’ll be making backups of each file before you begin. Start with capsd-configuration.xml. The part you’ll be adding is the Asterisk_SNMP protocol-plugin element; the Windows-Task-Scheduler element before it is existing context. If your file isn’t exactly like this one, just add the new element above the </capsd-configuration> at the bottom of the file.

  <protocol-plugin protocol="Windows-Task-Scheduler" class-name="org.opennms.netmgt.capsd.plugins.Win32ServicePlugin" scan="on">
    <property key="timeout" value="2000" />
    <property key="retry" value="1" />
    <property key="service-name" value="Task Scheduler" />
  </protocol-plugin>

  <protocol-plugin protocol="Asterisk_SNMP" class-name="org.opennms.netmgt.capsd.plugins.SnmpPlugin" scan="on">
    <property key="vbname" value=".1.3.6.1.4.1.22736.1.1.1.0" />
    <property key="timeout" value="2000" />
    <property key="retry" value="1" />
  </protocol-plugin>

</capsd-configuration>

The new protocol-plugin tells the OpenNMS capabilities scanner daemon, or Capsd, how to find a service called Asterisk_SNMP. We’ll use this as a marker service to get around the fact that most Asterisk servers return whatever sysObjectID their SNMP agent uses by default, thus identifying themselves as plain old Linux or FreeBSD or whatever kind of server they are but giving no immediate clue to their Asteriskness. This is the reason why the Asterisk management functionality is usually hidden.

Now we need to make an addition to collectd-configuration.xml. Again, add only the new lines, in this case the asterisk-servers package; if your file looks a bit different, add them between the last </package> line and the first <collector> line.

  <package name="example1">
    <filter>IPADDR != '0.0.0.0'</filter>
    <include-range begin="1.1.1.1" end="254.254.254.254"/>

    <service name="SNMP" interval="300000" user-defined="false" status="on">
      <parameter key="collection" value="default"/>
      <parameter key="thresholding-enabled" value="true"/>
    </service>
  </package>

  <package name="asterisk-servers">
    <filter>IPADDR != '0.0.0.0' &amp; isAsterisk_SNMP</filter>
    <include-range begin="1.1.1.1" end="254.254.254.254"/>

    <service name="SNMP" interval="300000" user-defined="false" status="on">
      <parameter key="collection" value="asterisk"/>
      <parameter key="thresholding-enabled" value="true"/>
    </service>
  </package>

  <collector service="SNMP" class-name="org.opennms.netmgt.collectd.SnmpCollector"/>

</collectd-configuration>

This new package tells OpenNMS’ SNMP collector that it will be collecting an additional set of metrics from all nodes that have the Asterisk_SNMP marker service on one of their interfaces.

Finally, we’ll define that extra set of Asterisk-specific metrics in datacollection-config.xml. Once again we’re adding only one block, the snmp-collection named asterisk. If your file differs from the context shown below, just insert the new lines between the last </snmp-collection> line and the </datacollection-config> line at the bottom of the file.

      <systemDef name="Riverbed Steelhead WAN Accelerators">
        <sysoid>.1.3.6.1.4.1.17163.1.1</sysoid>
        <collect>
          <includeGroup>mib2-X-interfaces</includeGroup>
          <includeGroup>riverbed-steelhead-scalars</includeGroup>
          <includeGroup>riverbed-steelhead-cpu-stats</includeGroup>
          <includeGroup>riverbed-steelhead-port-bandwidth</includeGroup>
        </collect>
      </systemDef>

    </systems>
  </snmp-collection>


  <snmp-collection name="asterisk" snmpStorageFlag="select">
    <rrd step="300">
      <rra>RRA:AVERAGE:0.5:1:2016</rra>
      <rra>RRA:AVERAGE:0.5:12:1488</rra>
      <rra>RRA:AVERAGE:0.5:288:366</rra>
      <rra>RRA:MAX:0.5:288:366</rra>
      <rra>RRA:MIN:0.5:288:366</rra>
    </rrd>
    <groups>
      <!-- Asterisk (Digium) MIBs -->
      <group name="asterisk-scalars" ifType="ignore">
        <mibObj oid=".1.3.6.1.4.1.22736.1.5.1"   instance="0" alias="astNumChannels"      type="gauge" />
        <mibObj oid=".1.3.6.1.4.1.22736.1.5.5.1" instance="0" alias="astNumChanBridge"    type="gauge" />
        <mibObj oid=".1.3.6.1.4.1.22736.1.2.5"   instance="0" alias="astCfgCallsActive"   type="gauge" />
        <mibObj oid=".1.3.6.1.4.1.22736.1.2.6"   instance="0" alias="astCfgCallsPrcessed" type="counter" />
      </group>
      <group name="asterisk-chantype" ifType="all">
        <mibObj oid=".1.3.6.1.4.1.22736.1.5.4.1.2" instance="astChanType" alias="astChanTypeName"     type="string" />
        <mibObj oid=".1.3.6.1.4.1.22736.1.5.4.1.7" instance="astChanType" alias="astChanTypeChannels" type="gauge" />
      </group>
    </groups>
    <systems>
      <systemDef name="Enterprise">
        <sysoidMask>.1.3.6.1.4.1.</sysoidMask>
        <collect>
          <includeGroup>asterisk-scalars</includeGroup>
          <includeGroup>asterisk-chantype</includeGroup>
        </collect>
      </systemDef>
    </systems>
  </snmp-collection>

</datacollection-config>
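
The <rrd> element in the new collection controls how long the collected data is kept. As a quick sanity check on those numbers, here is a small sketch that works out the retention of each archive from the 300-second step. The helper name retention_days is mine; the arithmetic is just the usual RRD reading of an RRA definition (steps per row, times rows, times step length).

```python
STEP_SECONDS = 300  # from <rrd step="300">

def retention_days(rra, step=STEP_SECONDS):
    """Days of history held by one RRA line such as 'RRA:AVERAGE:0.5:1:2016'."""
    # Fields: literal "RRA", consolidation function, xff, steps per row, rows
    _, _cf, _xff, steps_per_row, rows = rra.split(":")
    return int(steps_per_row) * int(rows) * step / 86400

for rra in ("RRA:AVERAGE:0.5:1:2016",
            "RRA:AVERAGE:0.5:12:1488",
            "RRA:AVERAGE:0.5:288:366"):
    print(rra, "->", retention_days(rra), "days")
# RRA:AVERAGE:0.5:1:2016 -> 7.0 days
# RRA:AVERAGE:0.5:12:1488 -> 62.0 days
# RRA:AVERAGE:0.5:288:366 -> 366.0 days
```

In other words: five-minute samples for a week, hourly averages for about two months, and daily average, minimum, and maximum values for a year.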

You’ll need to restart OpenNMS, but first it’s a good idea to validate that you didn’t make any XML-borking mistakes. One good way to do this is to run the edited files through the xmllint utility, which is part of the libxml2 (Red Hat, Fedora, or CentOS) or libxml2-utils (Debian and Ubuntu) package.

opennms$ xmllint --noout capsd-configuration.xml collectd-configuration.xml datacollection-config.xml

If the command produces no output, you’re in good shape. If it finds problems, you’ll need to correct them before you can expect OpenNMS to restart successfully.
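
Keep in mind that xmllint only confirms the files are well-formed. It won't notice if the collection name in collectd-configuration.xml doesn't match any snmp-collection in datacollection-config.xml. The sketch below is one way to catch that kind of slip, assuming the element and attribute names used in the snippets above; the function names are my own, and the tag matching deliberately ignores XML namespaces in case your files declare one.

```python
import xml.etree.ElementTree as ET

def local_name(element):
    """Tag name without any '{namespace}' prefix ElementTree may have added."""
    return element.tag.rsplit("}", 1)[-1]

def referenced_collections(collectd_root):
    """Values of every <parameter key="collection"> under the Collectd config."""
    return {el.get("value") for el in collectd_root.iter()
            if local_name(el) == "parameter" and el.get("key") == "collection"}

def defined_collections(datacollection_root):
    """Names of every <snmp-collection> under the data collection config."""
    return {el.get("name") for el in datacollection_root.iter()
            if local_name(el) == "snmp-collection"}

def missing_collections(collectd_root, datacollection_root):
    """Collections that Collectd refers to but that are never defined."""
    return referenced_collections(collectd_root) - defined_collections(datacollection_root)

# Tiny inline stand-ins for the two files; the second one "forgot" the new block.
collectd = ET.fromstring(
    '<collectd-configuration><package name="asterisk-servers">'
    '<service name="SNMP"><parameter key="collection" value="asterisk"/></service>'
    '</package></collectd-configuration>')
datacollection = ET.fromstring(
    '<datacollection-config><snmp-collection name="default"/></datacollection-config>')
print(missing_collections(collectd, datacollection))  # prints {'asterisk'}
```

Against real files you would pass ET.parse(path).getroot() for each one; an empty set means every referenced collection exists.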

Once OpenNMS has restarted, log in to the OpenNMS web UI as a user with administrator privileges. Navigate to the Asterisk server’s node detail page, click on the Rescan link, and confirm that you wish to rescan the node. Give the rescan a few minutes to complete, then reload the node detail page. You should now see the Asterisk_SNMP service on one of the node’s interfaces:

If you see the new service (it’s OK and expected that its status is Unmonitored), then you’re just one more restart away from having Asterisk performance data in OpenNMS. The restarts, by the way, are another step that will go away in OpenNMS 1.8, which will include the ability to reload the configurations of individual daemons (in this case Capsd and Collectd) while the system is running. After the restart, log in again to the web UI and navigate to the resource graphs workflow for the Asterisk server node. You should see a new pick-list labeled Asterisk Channel Type containing an item for each type of channel technology supported by your Asterisk server:

Select Node-Level Performance Data and at least one channel type that you know is in use on your Asterisk server and click Submit. Amid the other node-level data, you should have four new resource graphs: Active Channels, Active and Bridged Channels (not present in Asterisk 1.4), Calls Active, and Calls Processed. Here’s how they look on a fairly slow day in our office:

After the node-level resource graphs will come one section per channel type that you selected. We’ve been using both SIP and Skype channels today, so those are the two I selected for this post:

Everybody likes resource graphs, but eventually everybody tires of looking at them and wishes that OpenNMS could notify the Asterisk admins when the data goes outside the comfort zone. That’s where thresholds come in. If you would like to be notified when a given Asterisk server has more than ten thousand bridged SIP channels (so that you can take up John Todd on his steak dinner offer), there’s a threshold for that:
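
A high threshold in OpenNMS boils down to three numbers: the value to exceed, a rearm level, and a trigger count. To make those moving parts concrete, here is a toy model of the behavior. It is my own illustration and not OpenNMS code; the rearm level of 9000 and trigger of 2 are made-up numbers, and the exact comparison operators OpenNMS uses may differ.

```python
class HighThreshold:
    """Toy high threshold: fire after `trigger` consecutive samples above
    `value`, then stay quiet until a sample drops to `rearm` or below."""

    def __init__(self, value, rearm, trigger):
        self.value, self.rearm, self.trigger = value, rearm, trigger
        self.exceeded_count = 0
        self.armed = True

    def evaluate(self, sample):
        """Return 'exceeded', 'rearmed', or None for one polled sample."""
        if self.armed:
            if sample > self.value:
                self.exceeded_count += 1
                if self.exceeded_count >= self.trigger:
                    self.armed = False
                    self.exceeded_count = 0
                    return "exceeded"
            else:
                # A sample back under the value resets the consecutive count
                self.exceeded_count = 0
        elif sample <= self.rearm:
            self.armed = True
            return "rearmed"
        return None

# Hypothetical bridged-SIP-channel samples, one per five-minute poll
threshold = HighThreshold(value=10000, rearm=9000, trigger=2)
samples = [9500, 10001, 10250, 10400, 8900, 10100]
print([threshold.evaluate(s) for s in samples])
# [None, None, 'exceeded', None, 'rearmed', None]
```

The trigger count keeps a single spiky sample from paging anyone, and the gap between the value and the rearm level keeps a metric hovering around ten thousand from generating a stream of alternating events.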

There’s much more that can be done than I’ve shown here, particularly if you’re handy with Net-SNMP’s extensibility facilities. There are also other ways to switch on Asterisk data collection in OpenNMS, such as changing the sysObjectID of your Asterisk servers to .1.3.6.1.4.1.22736.1; the method presented here is calculated to minimize the potential disruption to other data collection already happening in an environment.
