HOWTO: Monitoring HP Insight Manager using OpenNMS
If you run HP servers with HP Insight Manager installed, it is very helpful to know whether the servers' hardware health is fine. HP Insight Manager uses MIBs from Compaq, so you can find the necessary MIBs, for example, in ByteSphere's OidView.
Prerequisites
To make this work you need HP Insight Manager installed on your server, a running OpenNMS instance, and SNMP configured correctly on both the HP server and OpenNMS. This configuration relies on detecting and monitoring SNMP tables. Detecting SNMP tables with capsd has been possible since OpenNMS 1.6.7, and monitoring SNMP tables with pollerd has been possible since 1.6.3. If you run an OpenNMS version lower than 1.6.7, you can assign the services with provisioning or a LoopPlugin instead of capsd.
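For the provisioning route mentioned above, a requisition can assign the services to an interface directly, so capsd detection is not needed. The following is only a minimal sketch, assuming the model-import requisition format; the foreign source name, node label, foreign ID, and IP address are placeholders and not part of the original setup.

```xml
<!-- Sketch only: foreign-source, node-label, foreign-id and ip-addr are placeholder values -->
<model-import xmlns="http://xmlns.opennms.org/xsd/config/model-import"
              foreign-source="HP-Servers">
  <node node-label="hp-server-01" foreign-id="hp-server-01">
    <interface ip-addr="192.0.2.10" descr="" status="1" snmp-primary="P">
      <monitored-service service-name="ICMP"/>
      <monitored-service service-name="SNMP"/>
      <!-- Service names must match the names used in poller-configuration.xml -->
      <monitored-service service-name="HP-Insight-Drive-Spare"/>
      <monitored-service service-name="HP-Insight-Drive-Logical"/>
      <monitored-service service-name="HP-Insight-Drive-Physical"/>
      <monitored-service service-name="HP-Insight-Temperature"/>
      <monitored-service service-name="HP-Insight-Fan-System"/>
      <monitored-service service-name="HP-Insight-Fan-CPU"/>
      <monitored-service service-name="HP-Insight-Power-Supply"/>
      <monitored-service service-name="HP-Insight-Memory-Module"/>
    </interface>
  </node>
</model-import>
```

Assign only the services that the server actually answers for; a server without spare drives, for example, would otherwise show HP-Insight-Drive-Spare as down.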
This monitor supports the following SNMP OIDs from CPQIDA-MIB and CPQHLTH-MIB:
CPQIDA-MIB
cpqDaSpareStatus .1.3.6.1.4.1.232.3.2.4.1.1.3
cpqDaLogDrvStatus .1.3.6.1.4.1.232.3.2.3.1.1.4
cpqDaPhyDrvStatus .1.3.6.1.4.1.232.3.2.5.1.1.6
CPQHLTH-MIB
cpqHeThermalTempStatus .1.3.6.1.4.1.232.6.2.6.3.0
cpqHeThermalSystemFanStatus .1.3.6.1.4.1.232.6.2.6.4.0
cpqHeThermalCpuFanStatus .1.3.6.1.4.1.232.6.2.6.5.0
cpqHeFltTolPowerSupplyStatus .1.3.6.1.4.1.232.6.2.9.3.1.5
cpqHeResMemModuleCondition .1.3.6.1.4.1.232.6.2.14.11.1.5
Detecting with capsd
First, add a new service for each check to your capsd configuration. We use the generic SNMP plugin (SnmpPlugin) that ships with OpenNMS. To detect the tables, I test for an "ok" or "good" state.
$OPENNMS_HOME/etc/capsd-configuration.xml
<!-- HP Insight Manager -->
<!-- Detecting physical and logical drives -->
<protocol-plugin protocol="HP-Insight-Drive-Spare" class-name="org.opennms.netmgt.capsd.plugins.SnmpPlugin" scan="on">
<property key="vbname" value=".1.3.6.1.4.1.232.3.2.4.1.1.3" />
<property key="table" value="true" />
<property key="vbvalue" value="4" />
<property key="timeout" value="2000" />
<property key="retry" value="1" />
</protocol-plugin>
<protocol-plugin protocol="HP-Insight-Drive-Logical" class-name="org.opennms.netmgt.capsd.plugins.SnmpPlugin" scan="on">
<property key="vbname" value=".1.3.6.1.4.1.232.3.2.3.1.1.4" />
<property key="table" value="true" />
<property key="vbvalue" value="2" />
<property key="timeout" value="2000" />
<property key="retry" value="1" />
</protocol-plugin>
<protocol-plugin protocol="HP-Insight-Drive-Physical" class-name="org.opennms.netmgt.capsd.plugins.SnmpPlugin" scan="on">
<property key="vbname" value=".1.3.6.1.4.1.232.3.2.5.1.1.6" />
<property key="table" value="true" />
<property key="vbvalue" value="2" />
<property key="timeout" value="2000" />
<property key="retry" value="1" />
</protocol-plugin>
<!-- Detecting temperature status -->
<protocol-plugin protocol="HP-Insight-Temperature" class-name="org.opennms.netmgt.capsd.plugins.SnmpPlugin" scan="on">
<property key="vbname" value=".1.3.6.1.4.1.232.6.2.6.3.0" />
<property key="vbvalue" value="2" />
<property key="timeout" value="2000" />
<property key="retry" value="1" />
</protocol-plugin>
<!-- Detecting Fan status -->
<protocol-plugin protocol="HP-Insight-Fan-System" class-name="org.opennms.netmgt.capsd.plugins.SnmpPlugin" scan="on">
<property key="vbname" value=".1.3.6.1.4.1.232.6.2.6.4.0" />
<property key="vbvalue" value="2" />
<property key="timeout" value="2000" />
<property key="retry" value="1" />
</protocol-plugin>
<protocol-plugin protocol="HP-Insight-Fan-CPU" class-name="org.opennms.netmgt.capsd.plugins.SnmpPlugin" scan="on">
<property key="vbname" value=".1.3.6.1.4.1.232.6.2.6.5.0" />
<property key="vbvalue" value="2" />
<property key="timeout" value="2000" />
<property key="retry" value="1" />
</protocol-plugin>
<!-- Detecting Power supply status -->
<protocol-plugin protocol="HP-Insight-Power-Supply" class-name="org.opennms.netmgt.capsd.plugins.SnmpPlugin" scan="on">
<property key="vbname" value=".1.3.6.1.4.1.232.6.2.9.3.1.5" />
<property key="table" value="true" />
<property key="vbvalue" value="1" />
<property key="timeout" value="2000" />
<property key="retry" value="1" />
</protocol-plugin>
<!-- Detecting Memory module condition -->
<protocol-plugin protocol="HP-Insight-Memory-Module" class-name="org.opennms.netmgt.capsd.plugins.SnmpPlugin" scan="on">
<property key="vbname" value=".1.3.6.1.4.1.232.6.2.14.11.1.5" />
<property key="table" value="true" />
<property key="vbvalue" value="2" />
<property key="timeout" value="2000" />
<property key="retry" value="1" />
</protocol-plugin>
Monitoring all these services
Next, create a monitor for each service in your poller configuration. We monitor for the normal state, which is marked UP in the lists below. The following states are possible:
CPQIDA-MIB
cpqDaSpareStatus .1.3.6.1.4.1.232.3.2.4.1.1.3
- other(1)
- invalid(2)
- failed(3)
- inactive(4) - UP
- building(5)
- active(6)
cpqDaLogDrvStatus .1.3.6.1.4.1.232.3.2.3.1.1.4
- other(1)
- ok(2) - UP
- failed(3)
- unconfigured(4)
- recovering(5)
- readyForRebuild(6)
- rebuilding(7)
- wrongDevice(8)
- badConnect(9)
- overheating(10)
- shutdown(11)
- expanding(12)
- notAvailable(13)
- queuedForExpansion(14)
- multipathAccessDegraded(15)
- erasing(16)
cpqDaPhyDrvStatus .1.3.6.1.4.1.232.3.2.5.1.1.6
- other(1)
- ok(2) - UP
- failed(3)
- predictiveFailure(4)
- erasing(5)
- eraseDone(6)
- eraseQueued(7)
CPQHLTH-MIB
cpqHeThermalTempStatus .1.3.6.1.4.1.232.6.2.6.3.0
- other(1)
- ok(2) - UP
- degraded(3)
- failed(4)
cpqHeThermalSystemFanStatus .1.3.6.1.4.1.232.6.2.6.4.0
- other(1)
- ok(2) - UP
- degraded(3)
- failed(4)
cpqHeThermalCpuFanStatus .1.3.6.1.4.1.232.6.2.6.5.0
- other(1)
- ok(2) - UP
- degraded(3)
- failed(4)
cpqHeFltTolPowerSupplyStatus .1.3.6.1.4.1.232.6.2.9.3.1.5
- noError(1) - UP
- generalFailure(2)
- bistFailure(3)
- fanFailure(4)
- tempFailure(5)
- interlockOpen(6)
- epromFailed(7)
- vrefFailed(8)
- dacFailed(9)
- ramTestFailed(10)
- voltageChannelFailed(11)
- orringdiodeFailed(12)
- brownOut(13)
- giveupOnStartup(14)
- nvramInvalid(15)
- calibrationTableInvalid(16)
cpqHeResMemModuleCondition .1.3.6.1.4.1.232.6.2.14.11.1.5
- other(1) - UP (I've seen servers report other(1) while HP Insight Manager said the system was fine)
- ok(2) - UP
- degraded(3)
$OPENNMS_HOME/etc/poller-configuration.xml
<service name="HP-Insight-Drive-Spare" interval="300000"
user-defined="false" status="on">
<parameter key="retry" value="6"/>
<parameter key="timeout" value="4950"/>
<parameter key="port" value="161"/>
<parameter key="oid" value=".1.3.6.1.4.1.232.3.2.4.1.1.3"/>
<parameter key="walk" value="true"/>
<parameter key="operator" value="="/>
<parameter key="operand" value="4"/>
<parameter key="match-all" value="true"/>
<parameter key="reason-template" value="One or more spare drives are not inactive. The state should be
inactive(${operand}); the observed value is ${observedValue}. Please check your HP Insight Manager.
Syntax: other(1), invalid(2), failed(3), inactive(4), building(5), active(6)"/>
</service>
<service name="HP-Insight-Drive-Logical" interval="300000"
user-defined="false" status="on">
<parameter key="retry" value="6"/>
<parameter key="timeout" value="4950"/>
<parameter key="port" value="161"/>
<parameter key="oid" value=".1.3.6.1.4.1.232.3.2.3.1.1.4"/>
<parameter key="walk" value="true"/>
<parameter key="operator" value="="/>
<parameter key="operand" value="2"/>
<parameter key="match-all" value="true"/>
<parameter key="reason-template" value="One or more logical drives are not ok. The state should be
ok(${operand}); the observed value is ${observedValue}. Please check your HP Insight Manager.
Syntax: other(1), ok(2), failed(3), unconfigured(4), recovering(5), readyForRebuild(6), rebuilding(7),
wrongDevice(8), badConnect(9), overheating(10), shutdown(11), expanding(12), notAvailable(13),
queuedForExpansion(14), multipathAccessDegraded(15), erasing(16)"/>
</service>
<service name="HP-Insight-Drive-Physical" interval="300000"
user-defined="false" status="on">
<parameter key="retry" value="6"/>
<parameter key="timeout" value="4950"/>
<parameter key="port" value="161"/>
<parameter key="oid" value=".1.3.6.1.4.1.232.3.2.5.1.1.6"/>
<parameter key="walk" value="true"/>
<parameter key="operator" value="="/>
<parameter key="operand" value="2"/>
<parameter key="match-all" value="true"/>
<parameter key="reason-template" value="One or more physical drives are not ok. The state should be
ok(${operand}); the observed value is ${observedValue}. Please check your HP Insight Manager.
Syntax: other(1), ok(2), failed(3), predictiveFailure(4), erasing(5), eraseDone(6), eraseQueued(7)"/>
</service>
<service name="HP-Insight-Temperature" interval="300000"
user-defined="false" status="on">
<parameter key="retry" value="6"/>
<parameter key="timeout" value="4950"/>
<parameter key="port" value="161"/>
<parameter key="oid" value=".1.3.6.1.4.1.232.6.2.6.3.0"/>
<parameter key="operator" value="="/>
<parameter key="operand" value="2"/>
<parameter key="reason-template" value="Temperature status is not ok. The state should be
ok(${operand}); the observed value is ${observedValue}. Please check your HP Insight Manager.
Syntax: other(1), ok(2), degraded(3), failed(4)"/>
</service>
<service name="HP-Insight-Fan-System" interval="300000"
user-defined="false" status="on">
<parameter key="retry" value="6"/>
<parameter key="timeout" value="4950"/>
<parameter key="port" value="161"/>
<parameter key="oid" value=".1.3.6.1.4.1.232.6.2.6.4.0"/>
<parameter key="operator" value="="/>
<parameter key="operand" value="2"/>
<parameter key="reason-template" value="System fan status is not ok. The state should be
ok(${operand}); the observed value is ${observedValue}. Please check your HP Insight Manager.
Syntax: other(1), ok(2), degraded(3), failed(4)"/>
</service>
<service name="HP-Insight-Fan-CPU" interval="300000"
user-defined="false" status="on">
<parameter key="retry" value="6"/>
<parameter key="timeout" value="4950"/>
<parameter key="port" value="161"/>
<parameter key="oid" value=".1.3.6.1.4.1.232.6.2.6.5.0"/>
<parameter key="operator" value="="/>
<parameter key="operand" value="2"/>
<parameter key="reason-template" value="CPU fan status is not ok. The state should be
ok(${operand}); the observed value is ${observedValue}. Please check your HP Insight Manager.
Syntax: other(1), ok(2), degraded(3), failed(4)"/>
</service>
<service name="HP-Insight-Power-Supply" interval="300000"
user-defined="false" status="on">
<parameter key="retry" value="6"/>
<parameter key="timeout" value="4950"/>
<parameter key="port" value="161"/>
<parameter key="oid" value=".1.3.6.1.4.1.232.6.2.9.3.1.5"/>
<parameter key="walk" value="true"/>
<parameter key="operator" value="="/>
<parameter key="operand" value="1"/>
<parameter key="match-all" value="true"/>
<parameter key="reason-template" value="One or more power supplies are not ok. The state should be
noError(${operand}); the observed value is ${observedValue}. Please check your HP Insight Manager.
Syntax: noError(1), generalFailure(2), bistFailure(3), fanFailure(4), tempFailure(5), interlockOpen(6),
epromFailed(7), vrefFailed(8), dacFailed(9), ramTestFailed(10), voltageChannelFailed(11),
orringdiodeFailed(12), brownOut(13), giveupOnStartup(14), nvramInvalid(15), calibrationTableInvalid(16)"/>
</service>
<service name="HP-Insight-Memory-Module" interval="300000"
user-defined="false" status="on">
<parameter key="retry" value="6"/>
<parameter key="timeout" value="4950"/>
<parameter key="port" value="161"/>
<parameter key="oid" value=".1.3.6.1.4.1.232.6.2.14.11.1.5"/>
<parameter key="walk" value="true"/>
<parameter key="operator" value="&lt;"/>
<parameter key="operand" value="3"/>
<parameter key="match-all" value="true"/>
<parameter key="reason-template" value="One or more memory modules are degraded. The state should be
lower than degraded(${operand}); the observed value is ${observedValue}. Please check your HP Insight Manager.
Syntax: other(1), ok(2), degraded(3)"/>
</service>
Do not forget to activate the monitors at the end of the file.
<monitor service="HP-Insight-Drive-Spare" class-name="org.opennms.netmgt.poller.monitors.SnmpMonitor"/>
<monitor service="HP-Insight-Drive-Logical" class-name="org.opennms.netmgt.poller.monitors.SnmpMonitor"/>
<monitor service="HP-Insight-Drive-Physical" class-name="org.opennms.netmgt.poller.monitors.SnmpMonitor"/>
<monitor service="HP-Insight-Temperature" class-name="org.opennms.netmgt.poller.monitors.SnmpMonitor"/>
<monitor service="HP-Insight-Fan-System" class-name="org.opennms.netmgt.poller.monitors.SnmpMonitor"/>
<monitor service="HP-Insight-Fan-CPU" class-name="org.opennms.netmgt.poller.monitors.SnmpMonitor"/>
<monitor service="HP-Insight-Power-Supply" class-name="org.opennms.netmgt.poller.monitors.SnmpMonitor"/>
<monitor service="HP-Insight-Memory-Module" class-name="org.opennms.netmgt.poller.monitors.SnmpMonitor"/>
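To show where the pieces above belong, here is a rough placement sketch of poller-configuration.xml. It assumes the usual layout in which service definitions sit inside a polling package and the monitor lines follow the last package. The package name example1 and the commented placeholders are assumptions; keep whatever your own file already contains there.

```xml
<!-- Sketch only: root attributes and existing content are omitted and must stay as in your file -->
<poller-configuration>
  <!-- existing node-outage settings -->

  <package name="example1">
    <!-- existing filter, address ranges and rrd settings -->
    <!-- existing services such as ICMP -->

    <!-- add the HP Insight service definitions here, for example: -->
    <service name="HP-Insight-Temperature" interval="300000"
             user-defined="false" status="on">
      <parameter key="retry" value="6"/>
      <parameter key="timeout" value="4950"/>
      <parameter key="port" value="161"/>
      <parameter key="oid" value=".1.3.6.1.4.1.232.6.2.6.3.0"/>
      <parameter key="operator" value="="/>
      <parameter key="operand" value="2"/>
    </service>

    <!-- existing downtime entries follow the services -->
  </package>

  <!-- existing monitor lines, then one monitor line per new service -->
  <monitor service="HP-Insight-Temperature"
           class-name="org.opennms.netmgt.poller.monitors.SnmpMonitor"/>
</poller-configuration>
```

Each service name must appear once in a package and once in a monitor line; a service without a matching monitor line will not be polled.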






