Remote Monitoring

From OpenNMS

Overview

2006, the year of OpenNMS, has witnessed a sea change in this vintage open source project (project number 4,141 of 132,384 registered projects), not the least of which is the most recent improvement, Distributed Monitoring. The OpenNMS developers provide the following terminology to help users better visualize this latest feature:

Distributed Polling 
Distributed polling is the act of distributing the workload of the OpenNMS system's polling requirements. A distributed poller is a fully functional OpenNMS poller that maintains the status of its own set of entities by monitoring their services and correlating outages, which may or may not be reported back to a central OpenNMS server. This functionality has *NOT* been added to OpenNMS.
Distributed Monitoring 
Distributed Monitoring is the act of distributing the visibility of the OpenNMS system's polling requirements. It is accomplished by using a light-weight poller that simply monitors the services of entities defined in the OpenNMS server from a remote location. This provides visibility of the status and responsiveness of services from one or more remote locations. This functionality describes the behavior of this feature: Remote Location Monitoring.
Monitoring Area 
A geographical area that can be assigned to a set of monitoring locations used for summarization of the status of services from several monitoring locations.
Monitoring Location 
A geographical area used to define a polling configuration (polling package) that is assigned to a set of location monitors.
Location Monitor 
A light-weight poller that receives a polling package from the OpenNMS server and simply checks and reports the status and responsiveness of each service in that polling package.
Polling Package 
A legacy OpenNMS configuration element that organizes the complexities of polling schedules, the set of services to be monitored, outage calendars, etc.

IT Executive Summary

OpenNMS has undergone vast changes this past year, both architecturally and functionally. Architecturally, fundamental design changes have been made, such as the implementation of a new domain model, DAOs, an ORM (Hibernate), and a Java/J2EE application framework (Spring Framework). These architectural changes have made it possible for the OpenNMS developers to deliver the latest functional enhancement, Distributed Monitoring. This new feature, coming with the 1.3.2 release scheduled for November 1, 2006, gives OpenNMS visibility of monitored services from remote locations, including services that may not have been visible to centrally located OpenNMS instances. Often, network management servers are located in data centers adjacent to the services that are being monitored. Now, centrally located OpenNMS management stations can have visibility of an entire network from multiple points of view.

The benefit is that surveillance teams in the network operations center (NOC) can now view the status and responsiveness of the enterprise's services from multiple locations, as seen by various strategically placed location monitors. These location monitors can be installed statically/permanently by system administrators in various locations throughout your enterprise, or they can be installed dynamically, "on-demand" so to speak, on any user's PC, by the users themselves, using Java WebStart technology. Dynamically instantiated location monitors provide instant user-perspective feedback to the NOC staff. This saves time by reducing the complexity of troubleshooting the outage and by increasing the visibility of your enterprise's services.

Configuration

The OpenNMS system administrator defines the remote polling packages in the poller-configuration.xml file and assigns these packages, by name, to monitoring locations defined in the monitoring-locations.xml file. Polling packages have a new optional attribute called "remote" that defaults to false. Set this attribute to true if you want this package scheduled for polling only by remote monitors.

monitoring-locations.xml

The monitoring-locations.xml file defines the different locations from which remote poller monitoring instances will be running. Inside the <locations> tag, create one or more <location-def> entries, each with a set of attributes that uniquely identify it:

location-name 
The short name of the location, used on the remote-poller startup command-line.
monitoring-area 
Used to group multiple locations together.
polling-package-name 
The package in poller-configuration.xml that the monitor will use to determine the services to poll.
geolocation 
(As of OpenNMS 1.7.11) The geographical location of the monitor. This should be a street address or similar. If none is specified or Google can't resolve the address to a latitude and longitude, the marker will be placed on the map at OpenNMS World HQ in Pittsboro, NC.  :)
coordinates 
(As of OpenNMS 1.7.11) The geographical location of the monitor in the format "latitude,longitude".
priority 
(As of OpenNMS 1.7.11) The sort priority of this location for the UI (1 is lowest, 100 is highest).

A location can also optionally be associated with zero or more tags. Generally these will be arbitrary metadata associated with that monitoring location.

<?xml version="1.0" encoding="UTF-8"?>
<monitoring-locations-configuration
  xmlns="http://www.opennms.org/xsd/config/monitoring-locations"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://xmlns.opennms.org/xsd/config/monitoring-locations http://www.opennms.org/xsd/config/monitoring-locations.xsd ">
  <locations>
    <location-def location-name="RDU" monitoring-area="raleigh" polling-package-name="raleigh" geolocation="The OpenNMS Group, Pittsboro, NC" coordinates="35.7174,-79.1619" priority="50">
      <tags>
        <tag name="store" />
        <tag name="production" />
      </tags>
    </location-def>
  </locations>
</monitoring-locations-configuration>
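
As a rough illustration of how these attributes fit together, the following Python sketch reads a monitoring-locations document and groups location names by monitoring area. It is not part of OpenNMS: the function names (load_locations, group_by_area) and the trimmed sample document are assumptions made for this example, and the sketch ignores XML namespaces rather than validating against the schema.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Trimmed copy of the example above (XML declaration and schema hints omitted).
SAMPLE = """<monitoring-locations-configuration
  xmlns="http://www.opennms.org/xsd/config/monitoring-locations">
  <locations>
    <location-def location-name="RDU" monitoring-area="raleigh"
                  polling-package-name="raleigh"
                  coordinates="35.7174,-79.1619" priority="50">
      <tags><tag name="store"/><tag name="production"/></tags>
    </location-def>
  </locations>
</monitoring-locations-configuration>
"""


def local(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def load_locations(xml_text):
    """Return one dict per <location-def> element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    locations = []
    for elem in root.iter():
        if local(elem.tag) != "location-def":
            continue
        coords = elem.get("coordinates")
        if coords:
            # Documented format is "latitude,longitude".
            lat_text, lon_text = coords.split(",")
            coords = (float(lat_text.strip()), float(lon_text.strip()))
        priority = elem.get("priority")
        locations.append({
            "name": elem.get("location-name"),
            "area": elem.get("monitoring-area"),
            "package": elem.get("polling-package-name"),
            "coordinates": coords,
            "priority": int(priority) if priority else None,
            "tags": [t.get("name") for t in elem.iter() if local(t.tag) == "tag"],
        })
    return locations


def group_by_area(locations):
    """Map each monitoring-area to the location names assigned to it."""
    areas = defaultdict(list)
    for loc in locations:
        areas[loc["area"]].append(loc["name"])
    return dict(areas)
```

Calling group_by_area(load_locations(SAMPLE)) on the sample gives a single area, "raleigh", containing the one location "RDU".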

poller-configuration.xml

In the poller-configuration.xml file, define one or more packages which will be associated with polling-package-name in monitoring-locations.xml. To do so, create a <package> entry with the remote="true" attribute set, and configure it like any other poller package. See the polling configuration HOWTO for details on configuration.

<package name="raleigh" remote="true">
  <filter>IPADDR IPLIKE *.*.*.*</filter>
  <include-range begin="1.1.1.1" end="254.254.254.254"/>
  <rrd step="300">
    <rra>RRA:AVERAGE:0.5:1:2016</rra>
    <rra>RRA:AVERAGE:0.5:12:4464</rra>
    <rra>RRA:MIN:0.5:12:4464</rra>
    <rra>RRA:MAX:0.5:12:4464</rra>
  </rrd>
  <service name="HTTP" interval="30000" user-defined="false" status="on">
    <parameter key="retry" value="1"/>
    <parameter key="timeout" value="3000"/>
    <parameter key="port" value="80"/>
    <parameter key="url" value="/"/>
    <parameter key="rrd-repository" value="/var/log/opennms/rrd/response"/>
    <parameter key="ds-name" value="http"/>
  </service>
  <outage-calendar>zzz from poll-outages.xml zzz</outage-calendar>

  <downtime interval="30000" begin="0" end="300000"/>             <!-- 30s, 0, 5m -->
  <downtime interval="300000" begin="300000" end="43200000"/>     <!-- 5m, 5m, 12h -->
  <downtime interval="600000" begin="43200000" end="432000000"/>  <!-- 10m, 12h, 5d -->
  <downtime begin="432000000" delete="true"/>                     <!-- anything after 5 days delete -->
</package>
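
The <downtime> entries above describe how often a service is re-polled while it is down. The Python sketch below restates that table as data and looks up the interval for a given outage duration. It is only a model of the comments in the example, not OpenNMS code: the function name next_poll_interval is hypothetical, and treating begin as inclusive and end as exclusive is an assumption.

```python
# The downtime model from the example package, in milliseconds.
DOWNTIME_MODEL = [
    {"begin": 0, "end": 300_000, "interval": 30_000},               # 30s, 0, 5m
    {"begin": 300_000, "end": 43_200_000, "interval": 300_000},     # 5m, 5m, 12h
    {"begin": 43_200_000, "end": 432_000_000, "interval": 600_000}, # 10m, 12h, 5d
    {"begin": 432_000_000, "delete": True},                         # after 5 days
]


def next_poll_interval(ms_down, model=DOWNTIME_MODEL):
    """Return the polling interval (ms) for a service that has been down
    for ms_down milliseconds, or the string "delete" once the final
    delete="true" entry applies. Returns None if no entry matches.

    Assumption: begin is inclusive and end is exclusive.
    """
    for entry in model:
        end = entry.get("end")
        if ms_down >= entry["begin"] and (end is None or ms_down < end):
            return "delete" if entry.get("delete") else entry["interval"]
    return None
```

For example, a service that has been down for one hour (3,600,000 ms) falls into the second entry and is polled every 300,000 ms (5 minutes).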

Note: some poller services are not supported by the remote poller. If you add these services to a remote poller package, the remote poller may not work correctly. In particular, ICMP is NOT supported. See this email exchange for more info: http://opennms.530661.n2.nabble.com/Opennms-Remote-Poller-Solution-td760449.html
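
One way to catch this early is to scan poller-configuration.xml for remote packages that contain an unsupported service. The Python sketch below does that under stated assumptions: the set of unsupported services contains only ICMP because that is the only one named here, the function name unsupported_remote_services is hypothetical, and the sample document is reduced to the elements the check needs.

```python
import xml.etree.ElementTree as ET

# Only ICMP is named as unsupported in the note above; extend as needed.
UNSUPPORTED_REMOTE_SERVICES = {"ICMP"}

# Reduced sample: one central package and one remote package.
SAMPLE = """
<poller-configuration>
  <package name="example1">
    <service name="ICMP" interval="300000"/>
  </package>
  <package name="raleigh" remote="true">
    <service name="HTTP" interval="30000"/>
    <service name="ICMP" interval="30000"/>
  </package>
</poller-configuration>
"""


def local(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def unsupported_remote_services(xml_text):
    """Return (package name, service name) pairs for unsupported services
    found in packages marked remote="true"."""
    root = ET.fromstring(xml_text)
    problems = []
    for pkg in root.iter():
        # The "remote" attribute defaults to false when absent.
        if local(pkg.tag) != "package" or pkg.get("remote", "false") != "true":
            continue
        for svc in pkg.iter():
            if local(svc.tag) == "service" and svc.get("name") in UNSUPPORTED_REMOTE_SERVICES:
                problems.append((pkg.get("name"), svc.get("name")))
    return problems
```

On the sample, only the ICMP service in the remote "raleigh" package is reported; the same service in the central "example1" package is ignored.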

service-configuration.xml

The PollerBackEnd section of the service-configuration.xml file must also be uncommented. It is believed to be uncommented by default.

<service>
	<name>OpenNMS:Name=PollerBackEnd</name>
	<class-name>org.opennms.netmgt.poller.jmx.RemotePollerBackEnd</class-name>
	<invoke at="start" pass="0" method="init"/>
	<invoke at="start" pass="1" method="start"/>
	<invoke at="status" pass="0" method="status"/>
	<invoke at="stop" pass="0" method="stop"/>
</service>

opennms.properties

If you want to be able to use the Google Maps based monitoring UI (available in OpenNMS 1.7.11 and higher), you will need to get an API key from Google for the host or domain name that your OpenNMS instance is running on.

To do so, go to the Google Maps API Key Signup, and enter your domain or host name.

Then, edit your opennms.properties file, and change the gwt.apikey entry to use the key you received from Google.

 gwt.apikey=ABQIAAAAo3Ut2fYegRgs1h9qjcxS1hQJ4VJnazzSzZa7E_xphHimSSS1jRTkb8XiWKaPjA3HehBg5oRrrWLSDw

Defining an Application

The next step in configuring remote polling is to define, from the Admin menu, the Applications whose services the remote pollers will verify and report status for.

Starting a Remote Monitor

A remote poller can be started either from the command line or via WebStart from a URL in the OpenNMS WebUI. If this is the first time the monitor has been run, it attempts to register itself as a location monitor with the OpenNMS server. Communication is established by specifying the RMI URL of the OpenNMS server and the monitoring location-name defined by the OpenNMS administrator in the monitoring-locations.xml file, as previously described.

Command-Line

OpenNMS 1.7.10 and Later

To start the remote monitor from the command-line, run:

 # GUI
 $OPENNMS_HOME/bin/remote-poller.sh -g -u rmi://my-opennms -l RDU
 # Headless
 $OPENNMS_HOME/bin/remote-poller.sh -u rmi://my-opennms -l RDU

As of 1.7.10, the remote poller can communicate with the OpenNMS host system through HTTP as well as RMI.

 # GUI
 $OPENNMS_HOME/bin/remote-poller.sh -g -u http://my-opennms:8980/opennms-remoting -l RDU
 # Headless
 $OPENNMS_HOME/bin/remote-poller.sh -u http://my-opennms:8980/opennms-remoting  -l RDU

You can also add "-n <username>" and "-p <password>" for authentication when remoting through HTTP, to avoid the password prompt.

If you need to use an HTTP proxy to communicate with the OpenNMS server, add the http.proxyHost and http.proxyPort options to the java command-line:

 $OPENNMS_HOME/bin/remote-poller.sh -Dhttp.proxyHost=proxy.mydomain.net  -Dhttp.proxyPort=8080 -g -l TEST -u http://my-opennms:8980/opennms-remoting

OpenNMS 1.6.x

 # GUI
 $OPENNMS_HOME/bin/remote-poller.sh -g -u rmi://my-opennms -l RDU
 # Headless
 $OPENNMS_HOME/bin/remote-poller.sh -u rmi://my-opennms -l RDU

Java Web Start

It is also possible to run the remote poller using Java Web Start. To start a distributed monitor in a GUI on a remote workstation, from that workstation browse to:

 # OpenNMS 1.6
 http://<opennms server>:8980/opennms/webstart/app.jnlp
 # OpenNMS 1.7.10+
 http://<opennms server>:8980/opennms-remoting/webstart/app.jnlp

As of OpenNMS 1.5.94, the headless version of the remote poller is also available through web start, although you must first "set it up" at least once with the GUI version. It can be accessed at:

 # OpenNMS 1.6
 http://<opennms server>:8980/opennms/webstart/headless.jnlp
 # OpenNMS 1.7.10+
 http://<opennms server>:8980/opennms-remoting/webstart/headless.jnlp

Remote Monitor UI

OpenNMS 1.6.x through 1.7.10

The remote monitor has a simple table-based UI which displays application and monitor status. It's available at http://your-opennms-host/opennms/distributedStatusSummary.htm.

OpenNMS 1.7.11 and Higher

As of OpenNMS 1.7.11, a new remote monitor UI has been added which uses Google Maps for displaying remote monitor locations, as well as application, service, and monitor status.

It is currently available at http://your-opennms-host/opennms-remote-monitor-ui/ although it will be integrated into the main web application soon.

Design Overview

Function

The new distributed poller functions in much the same way as the central poller: it uses the same polling configuration and the same set of monitors (the same monitor classes). The main differences are that new events are generated for services monitored from remote locations and that new correlation logic must be used to determine outages (new enhanced correlation interfaces will be added with release 1.3.3). These events can still be treated as Alarms and used to send notifications, and new user interface components have been added to visualize remotely polled services.

Several new entities have been added to OpenNMS to support this enhancement, not the least of which is the notion of applications (OnmsApplication). OpenNMS now allows you to aggregate a set of services to represent an application. With respect to distributed monitoring, these services can be monitored from multiple locations.

OnmsApplication 
          |
          |
          ------->> OnmsMonitoredService
                                   |
                                   |
                                   -------->> OnmsLocationMonitor

Location Monitors

The distributed monitors (OnmsLocationMonitor) are configured by location (OnmsMonitoringLocationDefinition) and a polling package is associated with each location. Each location can have multiple distributed monitors using this configuration.

PollingConfig
         |
         |
         ----->> PollingPackage  <---- OnmsMonitoringLocationDefinition
                           |               |
                           |               |
                           -------------------------------------->> OnmsLocationMonitor
         

The design of this new remote monitor includes two main software components, the Remote PollerFrontEnd (distributed poller) and the Remote PollerBackEnd (server side). The original communication protocol is Java RMI; HTTP was added in OpenNMS 1.7.10 for better support through firewalls.

The remote monitor is passed the polling package associated with the specified location name (in this example, "RDU") and establishes a schedule for polling the services defined in this package. The schedule is optimized to reduce the initial load on the monitoring host platform by evenly spreading the service checks across the specified interval. Polling results and response time data are passed back to the OpenNMS server, where status changes are recorded, events are broadcast, and RRD files are updated.
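
The idea of spreading service checks evenly across the interval can be sketched as follows. This is not the scheduler used by the remote poller, whose actual algorithm is not shown here; it is one simple way to compute evenly spaced start offsets, and the function name initial_offsets is hypothetical.

```python
def initial_offsets(service_count, interval_ms):
    """Return a start delay (ms) for each of service_count services so that
    their first polls are spaced evenly across one polling interval,
    instead of all firing at time zero."""
    if service_count <= 0:
        return []
    # Integer arithmetic keeps the offsets as whole milliseconds.
    return [i * interval_ms // service_count for i in range(service_count)]
```

With three services and the 30000 ms interval from the example package, the first polls would start at 0, 10000, and 20000 ms, and each service would then repeat at its normal interval.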

Events


  <event>
    <descr>
      This event is sent when a service fails to respond to a remote
      location monitor poll.
    </descr>
    <uei>uei.opennms.org/remote/nodes/nodeLostService</uei>
    <event-label>A remote location monitor detected a node lost service</event-label>
    <logmsg dest='logndisplay'>
      A remote location monitor detected a node lost service.
    </logmsg>
    <severity>Normal</severity>
  </event>

  <event>
    <descr>
      This event is sent when a service responds after having failed to
      respond to a remote location monitor poll.
    </descr>
    <uei>uei.opennms.org/remote/nodes/nodeRegainedService</uei>
    <event-label>A remote location monitor detected a node regained service</event-label>
    <logmsg dest='logndisplay'>
      A remote location monitor detected a node regained service.
    </logmsg>
    <severity>Normal</severity>
  </event>

  <event>
    <descr>
      This event is sent by the remote location monitor server side API when
      a remote location monitor is created.
    </descr>
    <uei>uei.opennms.org/remote/locationMonitorRegistered</uei>
    <event-label>A remote location monitor has registered</event-label>
    <logmsg dest='logndisplay'>
      A remote location monitor has registered.
    </logmsg>
    <severity>Normal</severity>
  </event>

  <event>
    <descr>
      This event is sent when a registered remote location monitor begins
      monitoring services defined in its configuration.
    </descr>
    <uei>uei.opennms.org/remote/locationMonitorStarted</uei>
    <event-label>A remote location monitor has started polling</event-label>
    <logmsg dest='logndisplay'>
      A remote location monitor has started polling.
    </logmsg>
    <severity>Normal</severity>
  </event>

  <event>
    <descr>
      This event is sent when a registered remote location monitor is configured with
      a polling package containing no services to poll. (experimental)

      The idea here is that a webui administrator can pause a remote location monitor
      and the remote location monitor's configuration is changed to an empty
      polling package.
    </descr>
    <uei>uei.opennms.org/remote/locationMonitorPaused</uei>
    <event-label>A remote location monitor has been paused</event-label>
    <logmsg dest='logndisplay'>
      A remote location monitor has been paused.
    </logmsg>
    <severity>Minor</severity>
  </event>

  <event>
    <descr>
      This event is sent when a registered remote location monitor is cleanly
      shut down by the remote system.
    </descr>
    <uei>uei.opennms.org/remote/locationMonitorStopped</uei>
    <event-label>A remote location monitor has been shutdown.</event-label>
    <logmsg dest='logndisplay'>
      A remote location monitor has been shutdown.
    </logmsg>
    <severity>Normal</severity>
  </event>

  <event>
    <descr>
      This event is sent when a registered remote location fails to report
      status and check for configuration changes at the required interval.
    </descr>
    <uei>uei.opennms.org/remote/locationMonitorDisconnected</uei>
    <event-label>A remote location monitor has disconnected</event-label>
    <logmsg dest='logndisplay'>
      A remote location monitor has disconnected.
    </logmsg>
    <alarm-data reduction-key="%uei%" alarm-type="1" auto-clean="false"/>
    <severity>Warning</severity>
  </event>

  <event>
    <descr>
      This event is sent when a disconnected remote location monitor reconnects
      and reports status changes and checks for configuration changes.
    </descr>
    <uei>uei.opennms.org/remote/locationMonitorReconnected</uei>
    <event-label>A disconnected remote location monitor has reconnected</event-label>
    <logmsg dest='logndisplay'>
      A disconnected remote location monitor has reconnected.
    </logmsg>
    <severity>Normal</severity>
    <alarm-data reduction-key="%uei%" alarm-type="2" clearUei="uei.opennms.org/remote/locationMonitorDisconnected" auto-clean="false"/>
  </event>

  <event>
    <descr>
      This event is sent when a remote location monitor's configuration has been
      changed and was detected by the monitor.
    </descr>
    <uei>uei.opennms.org/remote/configurationChangeDetected</uei>
    <event-label>A remote location monitor's configuration has been changed</event-label>
    <logmsg dest='logndisplay'>
      A remote location monitor's configuration has been changed.
    </logmsg>
    <severity>Normal</severity>
  </event>
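
The locationMonitorDisconnected and locationMonitorReconnected events are the only ones above that carry <alarm-data>. The Python sketch below is a simplified model of how such a pair might interact. It assumes that alarm-type 1 marks a problem and alarm-type 2 a resolution, that events sharing a reduction key are folded into one alarm with a counter, and that a resolution with clearUei clears alarms for that UEI immediately. The real OpenNMS alarm handling is more involved, and the function apply_event and the dict layout are hypothetical.

```python
DISCONNECTED = "uei.opennms.org/remote/locationMonitorDisconnected"
RECONNECTED = "uei.opennms.org/remote/locationMonitorReconnected"

# alarm-data attributes copied from the two event definitions above.
ALARM_DATA = {
    DISCONNECTED: {"reduction_key": "%uei%", "alarm_type": 1},
    RECONNECTED: {"reduction_key": "%uei%", "alarm_type": 2,
                  "clear_uei": DISCONNECTED},
}


def apply_event(alarms, uei):
    """Fold one event into alarms (a dict keyed by reduction key).

    Simplified model: repeated events with the same reduction key bump a
    counter, and a resolution event clears alarms matching its clear_uei.
    """
    data = ALARM_DATA.get(uei)
    if data is None:
        return alarms  # events without alarm-data create no alarm
    key = data["reduction_key"].replace("%uei%", uei)
    alarm = alarms.setdefault(key, {"uei": uei, "count": 0, "cleared": False})
    alarm["count"] += 1
    alarm["cleared"] = False  # a fresh occurrence re-opens the alarm
    clear_uei = data.get("clear_uei")
    if data["alarm_type"] == 2 and clear_uei:
        for other in alarms.values():
            if other["uei"] == clear_uei:
                other["cleared"] = True
    return alarms
```

In this model, two disconnect events followed by a reconnect leave one disconnect alarm with a count of 2 that is marked cleared.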

OpenNMS Server

The OpenNMS server has exported interfaces with Java RMI allowing communication with remote location monitors. The remote location monitors must be able to communicate with the OpenNMS server on TCP ports 1099 and 1199.

opennms.properties

A Java RMI property needs to be defined to allow better communication through firewalls and NAT/PAT configurations. This property defines a hostname that must resolve to:

  • an IP address on the OpenNMS server that will be used for communication with the remote location monitors
  • an IP address resolvable by the remote location monitors that can be used for communicating with the OpenNMS server's RMI port

and must be the same name on the remote monitor as in this property, even if the two sides resolve it to different IP addresses (e.g. through NAT) that end up reaching the SAME interface.

 java.rmi.server.hostname=onms-server-name

Example

Using the following IP network:

monitor  onms server IP         RMI hostname        onms server IP   monitor
"RDU-1"   172.16.1.1   ----------> NMS01 <---------- 192.168.1.1     "RDU-2"
                                 10.1.1.1

/etc/hosts                       /etc/hosts                          /etc/hosts
172.16.1.1   NMS01               10.1.1.1   NMS01                    192.168.1.1   NMS01

In this configuration, each host can resolve NMS01 and each monitor can communicate with the OpenNMS server using an IP address that routes correctly to the 10.1.1.1 interface on the OpenNMS server. In many networks, the configuration is much simpler because each host communicates with the OpenNMS server using the same IP address. This example just illustrates the more complex scenario.

The location monitor initiates RMI communication with the OpenNMS server on TCP port 1099 and the OpenNMS server will direct it to port 1199 for operation.
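
Before starting a remote monitor over RMI, it can help to confirm from the monitor's host that both TCP ports are reachable through any firewalls. The Python sketch below is a plain TCP connect test, not an RMI client: a successful connection shows only that the port is open, not that the RMI service behind it is healthy. The function names can_connect and check_rmi_ports are hypothetical, and NMS01 in the usage note is the hostname from the example above.

```python
import socket

# TCP ports the remote location monitor must reach on the OpenNMS server.
RMI_PORTS = (1099, 1199)


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


def check_rmi_ports(host, ports=RMI_PORTS, timeout=3.0):
    """Return a dict mapping each port to whether it was reachable."""
    return {port: can_connect(host, port, timeout) for port in ports}
```

Running check_rmi_ports("NMS01") on the monitor host returns a dict keyed by port; a False entry for either 1099 or 1199 suggests a firewall, routing, or name resolution problem to investigate before starting the monitor.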


Version History/Availability