Here is an overview of the design of the OpenNMS remote poller.
Function
The new distributed poller functions in much the same way as the central poller: it uses the same polling configuration and the same set of monitors (the same monitor classes). The main differences are that new events are generated for services monitored from remote locations and that new correlation logic must be used to determine outages (new enhanced correlation interfaces will be added with release 1.3.3). These events can still be treated as alarms and used to send notifications, and new user interface components have been added to visualize remotely polled services.
Several new entities have been added to OpenNMS to support this enhancement, not the least of which is the notion of applications (OnmsApplication). OpenNMS now allows you to aggregate a set of services to represent an application. With respect to distributed monitoring, these services can be monitored from multiple locations.
OnmsApplication
      |
      |
      ------->> OnmsMonitoredService
                      |
                      |
                      -------->> OnmsLocationMonitor
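The one-to-many relationships in the diagram above can be sketched in plain Java. This is only an illustration: the class `ApplicationModelSketch`, the method `pollTargets`, and the service and monitor names are hypothetical and are not the OpenNMS model classes. It shows that every service in an application ends up being checked once per location monitor.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration; these are not the OpenNMS model classes. */
final class ApplicationModelSketch {

    /**
     * Expands the services of one application across the given location
     * monitors, producing one "monitor -> service" entry per pair.
     */
    static List<String> pollTargets(List<String> services, List<String> monitors) {
        List<String> targets = new ArrayList<>();
        for (String monitor : monitors) {
            for (String service : services) {
                targets.add(monitor + " -> " + service);
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        // Assumed example data: an application made of two services,
        // watched by two monitors.
        List<String> services = Arrays.asList("HTTP", "SMTP");
        List<String> monitors = Arrays.asList("RDU-1", "RDU-2");
        for (String target : pollTargets(services, monitors)) {
            System.out.println(target);
        }
    }
}
```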
Location Monitors
The distributed monitors (OnmsLocationMonitor) are configured by location (OnmsMonitoringLocationDefinition), and a polling package is associated with each location. Each location can have multiple distributed monitors using this configuration.
PollingConfig
      |
      |
      ----->> PollingPackage <---- OnmsMonitoringLocationDefinition
                    |                            |
                    |                            |
                    -------------------------------------->> OnmsLocationMonitor
The design of this new remote monitor includes two main software components, the Remote PollerFrontEnd (distributed poller) and the Remote PollerBackEnd (server side). The current communication protocol is Java RMI. HTTP will soon be added for better support through firewalls.
The remote monitor is passed the polling package associated with the specified location name (in this example, "RDU") and establishes a schedule for polling the services defined in this package. The schedule is optimized to reduce the initial load on the monitoring host platform by evenly spreading the service checks across the specified interval. Polling results and response time data are passed back to the OpenNMS server, where status changes are recorded, events are broadcast, and RRD files are updated.
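The even spreading of service checks can be sketched as follows. This is a minimal illustration under the assumption that each service simply gets an initial delay proportional to its position in the list; the class `SpreadSchedule` and the method `initialDelays` are hypothetical and do not describe the actual remote poller scheduler.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; not part of the OpenNMS API. */
final class SpreadSchedule {

    /**
     * Returns one initial delay (in milliseconds) per service so that the
     * first polls are evenly spaced across a single polling interval
     * instead of all firing at once.
     */
    static List<Long> initialDelays(int serviceCount, long intervalMillis) {
        List<Long> delays = new ArrayList<>();
        for (int i = 0; i < serviceCount; i++) {
            delays.add(i * intervalMillis / serviceCount);
        }
        return delays;
    }

    public static void main(String[] args) {
        // Four services polled every 300000 ms (5 minutes).
        System.out.println(initialDelays(4, 300_000L));
        // prints [0, 75000, 150000, 225000]
    }
}
```

After its initial delay, each service would then repeat at the full interval, so the checks stay spread out over time.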
Events
<event>
<descr>
This event is sent when a service fails to respond to a remote
location monitor poll.
</descr>
<uei>uei.opennms.org/remote/nodes/nodeLostService</uei>
<event-label>A remote location monitor detected a node lost service</event-label>
<logmsg dest='logndisplay'>
A remote location monitor detected a node lost service.
</logmsg>
<severity>Normal</severity>
</event>
<event>
<descr>
This event is sent when a service responds after having failed to
respond to a remote location monitor poll.
</descr>
<uei>uei.opennms.org/remote/nodes/nodeRegainedService</uei>
<event-label>A remote location monitor detected a node regained service</event-label>
<logmsg dest='logndisplay'>
A remote location monitor detected a node regained service.
</logmsg>
<severity>Normal</severity>
</event>
<event>
<descr>
This event is sent by the remote location monitor server side API when
a remote location monitor is created.
</descr>
<uei>uei.opennms.org/remote/locationMonitorRegistered</uei>
<event-label>A remote location monitor has registered</event-label>
<logmsg dest='logndisplay'>
A remote location monitor has registered.
</logmsg>
<severity>Normal</severity>
</event>
<event>
<descr>
This event is sent when a registered remote location monitor begins
monitoring services defined in its configuration.
</descr>
<uei>uei.opennms.org/remote/locationMonitorStarted</uei>
<event-label>A remote location monitor has started polling</event-label>
<logmsg dest='logndisplay'>
A remote location monitor has started polling.
</logmsg>
<severity>Normal</severity>
</event>
<event>
<descr>
This event is sent when a registered remote location monitor is configured with
a polling package containing no services to poll. (experimental)
The idea here is that a webui administrator can pause a remote location monitor
and the remote location monitor's configuration is changed to an empty
polling package.
</descr>
<uei>uei.opennms.org/remote/locationMonitorPaused</uei>
<event-label>A remote location monitor has been paused</event-label>
<logmsg dest='logndisplay'>
A remote location monitor has been paused.
</logmsg>
<severity>Minor</severity>
</event>
<event>
<descr>
This event is sent when a registered remote location monitor is cleanly
shut down by the remote system.
</descr>
<uei>uei.opennms.org/remote/locationMonitorStopped</uei>
<event-label>A remote location monitor has been shutdown.</event-label>
<logmsg dest='logndisplay'>
A remote location monitor has been shutdown.
</logmsg>
<severity>Normal</severity>
</event>
<event>
<descr>
This event is sent when a registered remote location monitor fails to report
status and check for configuration changes at the required interval.
</descr>
<uei>uei.opennms.org/remote/locationMonitorDisconnected</uei>
<event-label>A remote location monitor has disconnected</event-label>
<logmsg dest='logndisplay'>
A remote location monitor has disconnected.
</logmsg>
<alarm-data reduction-key="%uei%" alarm-type="1" auto-clean="false"/>
<severity>Warning</severity>
</event>
<event>
<descr>
This event is sent when a disconnected remote location monitor reconnects
and reports status changes and checks for configuration changes.
</descr>
<uei>uei.opennms.org/remote/locationMonitorReconnected</uei>
<event-label>A disconnected remote location monitor has reconnected</event-label>
<logmsg dest='logndisplay'>
A disconnected remote location monitor has reconnected.
</logmsg>
<severity>Normal</severity>
<alarm-data reduction-key="%uei%" alarm-type="2" clearUei="uei.opennms.org/remote/locationMonitorDisconnected" auto-clean="false"/>
</event>
<event>
<descr>
This event is sent when a remote location monitor's configuration has been
changed and was detected by the monitor.
</descr>
<uei>uei.opennms.org/remote/configurationChangeDetected</uei>
<event-label>A remote location monitor's configuration has been changed</event-label>
<logmsg dest='logndisplay'>
A remote location monitor's configuration has been changed.
</logmsg>
<severity>Normal</severity>
</event>
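The alarm-data elements on the locationMonitorDisconnected and locationMonitorReconnected events above pair a problem (alarm-type="1") with its resolution (alarm-type="2", clearUei). The following sketch models that pairing in a simplified way. The class `MonitorAlarmSketch` and its methods are hypothetical, and the real alarm handling in OpenNMS is more involved; this only shows the intended effect of the two definitions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Simplified, hypothetical model of the alarm-data pairing shown above. */
final class MonitorAlarmSketch {
    static final String DISCONNECTED =
            "uei.opennms.org/remote/locationMonitorDisconnected";
    static final String RECONNECTED =
            "uei.opennms.org/remote/locationMonitorReconnected";

    // Open problem alarms: reduction key -> number of events reduced into it.
    private final Map<String, Integer> openAlarms = new LinkedHashMap<>();

    void onEvent(String uei) {
        if (DISCONNECTED.equals(uei)) {
            // alarm-type="1": the reduction key "%uei%" expands to the UEI,
            // so repeated disconnect events collapse into one alarm.
            openAlarms.merge(uei, 1, Integer::sum);
        } else if (RECONNECTED.equals(uei)) {
            // alarm-type="2": clears the alarm named by clearUei.
            openAlarms.remove(DISCONNECTED);
        }
    }

    boolean hasOpenAlarm() {
        return !openAlarms.isEmpty();
    }

    int reducedCount(String reductionKey) {
        return openAlarms.getOrDefault(reductionKey, 0);
    }
}
```

Because the reduction key in these definitions is only the UEI, disconnect events from different monitors would reduce into the same alarm in this model.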
OpenNMS Server
The OpenNMS server has exported interfaces with Java RMI allowing communication with remote location monitors. The remote location monitors must be able to communicate with the OpenNMS server on TCP ports 1099 and 1199.
opennms.properties
A Java RMI property needs to be defined to allow better communication through firewalls and NAT/PAT configurations. This property defines a hostname that must resolve to:
- an IP address on the OpenNMS server that will be used for communication with the remote location monitors
- an IP address resolvable by the remote location monitors that can be used for communicating with the OpenNMS server's RMI port
The name specified on the remote monitor must be the same as the name in this property, even if the name resolves to a different IP address there (e.g. through NAT), provided the connection ends up at the same interface on the OpenNMS server.
java.rmi.server.hostname=onms-server-name
Example
Using the following IP network:
monitor    onms server IP             RMI hostname             onms server IP    monitor
"RDU-1"    172.16.1.1  ---------->       NMS01      <----------  192.168.1.1    "RDU-2"
                                        10.1.1.1
/etc/hosts                             /etc/hosts                               /etc/hosts
172.16.1.1 NMS01                       10.1.1.1 NMS01                           192.168.1.1 NMS01
In this configuration, each host can resolve NMS01, and each monitor can communicate with the OpenNMS server using an IP address that routes correctly to the 10.1.1.1 interface on the OpenNMS server. In many networks the configuration is much simpler, because each host communicates with the OpenNMS server using the same IP address. The example above illustrates the more complex scenario.
The location monitor initiates RMI communication with the OpenNMS server on TCP port 1099 and the OpenNMS server will direct it to port 1199 for operation.
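The two-port handshake can be reproduced with plain Java RMI, which may help when testing firewall rules. The sketch below is not OpenNMS code: the `PingService` interface, the `RmiPortsSketch` class, and the bound name "ping" are hypothetical. It runs the server and client sides in one JVM: a registry listens on one port, a remote object is exported on a second fixed port, and the stub handed out by the registry carries the hostname from java.rmi.server.hostname together with that second port.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

/** Hypothetical remote interface, used only for this illustration. */
interface PingService extends Remote {
    String ping(String monitorName) throws RemoteException;
}

final class RmiPortsSketch {

    /**
     * Starts a registry on registryPort, exports a service on servicePort,
     * then acts as a client: looks the service up and calls it once.
     */
    static String roundTrip(String host, int registryPort, int servicePort)
            throws Exception {
        // Stubs advertise this hostname, so clients must be able to resolve it.
        System.setProperty("java.rmi.server.hostname", host);

        PingService impl = new PingService() {
            public String ping(String monitorName) {
                return "hello " + monitorName;
            }
        };
        // Export on a fixed port (the role of port 1199 in the text).
        PingService stub =
                (PingService) UnicastRemoteObject.exportObject(impl, servicePort);
        // The registry plays the role of port 1099.
        Registry registry = LocateRegistry.createRegistry(registryPort);
        registry.rebind("ping", stub);
        try {
            // Client side: contact the registry, then call the exported object.
            Registry remote = LocateRegistry.getRegistry(host, registryPort);
            PingService client = (PingService) remote.lookup("ping");
            return client.ping("RDU-1");
        } finally {
            registry.unbind("ping");
            UnicastRemoteObject.unexportObject(impl, true);
            UnicastRemoteObject.unexportObject(registry, true);
        }
    }
}
```

Calling `RmiPortsSketch.roundTrip("localhost", 1099, 1199)` on a machine where both ports are free exercises the same port pair described above. If the client cannot resolve the advertised hostname or reach the second port, the lookup succeeds but the call on the stub fails, which is the usual symptom of a firewall or NAT misconfiguration.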






