How To: Dell OpenManage, DRAC5, and OpenNMS on Redhat 5

This is how I got my Dell systems hooked into OpenNMS.

Prerequisites

First, install OpenNMS by following the instructions at: http://www.opennms.org/wiki/Installation:Yum

Install OpenManage on your servers.

Install the net-snmp package and, optionally, the net-snmp-utils package.

Configure snmpd to let OpenManage (OM) answer for the Dell MIB module and to send traps to OpenNMS by adding the following lines to /etc/snmp/snmpd.conf:

# Dell MIB OID
smuxpeer .1.3.6.1.4.1.674.10892.1
# Storage Management MIB OID
smuxpeer .1.3.6.1.4.1.674.10893.1
trapsink <your OpenNMS server> <your community string>
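Before restarting snmpd, it can help to confirm that the lines above actually made it into the file. The following is a minimal sketch, not part of my original setup: the function name check_snmpd_conf is hypothetical, and it assumes the smuxpeer lines are written exactly as shown above (single space, no trailing whitespace).

```shell
#!/bin/sh
# Hypothetical helper: report which of the required lines are missing from
# an snmpd.conf-style file. Returns 0 only if all of them are present.
check_snmpd_conf() {
    conf=${1:-/etc/snmp/snmpd.conf}
    status=0
    for wanted in \
        'smuxpeer .1.3.6.1.4.1.674.10892.1' \
        'smuxpeer .1.3.6.1.4.1.674.10893.1'
    do
        # -F: fixed string, -x: the whole line must match, -q: no output
        if ! grep -Fxq "$wanted" "$conf"; then
            echo "missing: $wanted"
            status=1
        fi
    done
    # Any trapsink line with at least a host counts as present.
    if ! grep -Eq '^trapsink[[:space:]]+[^[:space:]]+' "$conf"; then
        echo "missing: trapsink <host> <community>"
        status=1
    fi
    return "$status"
}
```

One way to use it would be: check_snmpd_conf /etc/snmp/snmpd.conf && service snmpd restart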

Tell the DRAC cards to send SNMP traps for monitored events. The DRAC 5 manual is at http://support.dell.com/support/edocs/software/smdrac3/drac5/om55/en/ug/index.htm

My personal preference is to use the command-line interface from clusterssh so I can configure several boxes at once.

Tell the DRAC to enable alerts for everything. If you want reboots, power offs, whatever, you need to RTFM and figure it out, but this will get your alerts on:

WARNING: This will override any Platform Event Filters you have already set up and replace them with alerts only!

%racadm config -g cfgIpmiPef -o cfgIpmiPefEnable -i 1 1
%racadm config -g cfgIpmiPef -o cfgIpmiPefEnable -i 2 1
%racadm config -g cfgIpmiPef -o cfgIpmiPefEnable -i 3 1
%racadm config -g cfgIpmiPef -o cfgIpmiPefEnable -i 4 1
%racadm config -g cfgIpmiPef -o cfgIpmiPefEnable -i 5 1
%racadm config -g cfgIpmiPef -o cfgIpmiPefEnable -i 6 1
%racadm config -g cfgIpmiPef -o cfgIpmiPefEnable -i 7 1
%racadm config -g cfgIpmiPef -o cfgIpmiPefEnable -i 8 1
%racadm config -g cfgIpmiPef -o cfgIpmiPefEnable -i 9 1
%racadm config -g cfgIpmiPef -o cfgIpmiPefEnable -i 10 1
%racadm config -g cfgIpmiPef -o cfgIpmiPefEnable -i 11 1
%racadm config -g cfgIpmiPef -o cfgIpmiPefEnable -i 12 1
%racadm config -g cfgIpmiPef -o cfgIpmiPefEnable -i 13 1
%racadm config -g cfgIpmiPef -o cfgIpmiPefEnable -i 14 1
%racadm config -g cfgIpmiPef -o cfgIpmiPefEnable -i 15 1
%racadm config -g cfgIpmiPef -o cfgIpmiPefEnable -i 16 1
%racadm config -g cfgIpmiPef -o cfgIpmiPefEnable -i 17 1

%racadm config -g cfgIpmiPef -o cfgIpmiPefAction -i 1 0 
(repeat for -i 2 through -i 17 as above for cfgIpmiPefEnable)
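Typing 34 nearly identical commands is error-prone, so a loop can generate them instead. This is only a sketch: the function name pef_commands is hypothetical, it assumes cfgIpmiPefEnable takes a trailing 1 to enable each filter, and it prints the commands rather than running them so you can review them first.

```shell
#!/bin/sh
# Print (do not run) the racadm commands for PEF filters 1 through 17.
# ASSUMPTION: a trailing 1 enables the filter; action 0 means alert only.
pef_commands() {
    i=1
    while [ "$i" -le 17 ]; do
        echo "racadm config -g cfgIpmiPef -o cfgIpmiPefEnable -i $i 1"
        echo "racadm config -g cfgIpmiPef -o cfgIpmiPefAction -i $i 0"
        i=$((i + 1))
    done
}
```

Run pef_commands on its own to inspect the output, then pipe it to sh (pef_commands | sh) on a machine where racadm is installed. The same warning applies: this replaces any Platform Event Filters you have already set up.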

If you read the Dell docs on this, they'll tell you that PefAction is set bitwise: bit 0 enables an alert in response to an event, bit 1 powers the system down, bit 2 reboots it, and bit 3 power cycles it. The way I read that, they're saying that a PefAction value of 0 is no action, 1 is an alert, 2 is power down, 4 is a reboot, and 8 is a power cycle. This is NOT TRUE. PefAction is NOT set bitwise! The "bit" numbers are the VALUES to set PefAction to for the desired result: 0 for an alert only, 1 for power down, 2 for a reboot, and 3 for a power cycle.

If you read their docs and you've ever had even a rudimentary introduction to binary numbers, you could easily be misled into setting the value to 1 to turn on an alert. What you'll get is a dead server for any event that has PefAction set to 1, because it will power your box off!

You have been warned.
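To keep the numbers out of your head entirely, the mapping described above can be written down once as a lookup. This is a sketch of my reading of the behavior, not anything from Dell: the function name pef_action_value and the action names are made up for illustration, and you should confirm the values on a box you can afford to lose.

```shell
#!/bin/sh
# Hypothetical lookup: action name -> cfgIpmiPefAction value, as observed
# above (plain values, NOT bit masks).
pef_action_value() {
    case "$1" in
        alert)      echo 0 ;;   # alert only, no power action
        powerdown)  echo 1 ;;
        reboot)     echo 2 ;;
        powercycle) echo 3 ;;
        *) echo "unknown action: $1" >&2; return 1 ;;
    esac
}
```

For example, echo "racadm config -g cfgIpmiPef -o cfgIpmiPefAction -i 1 $(pef_action_value alert)" prints the alert-only command for filter 1.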

Next, enable global alerts:

%racadm config -g cfgIpmiLan -o cfgIpmiLanAlertEnable 1

Set your SNMP community string:

%racadm config -g cfgIpmiLan -o cfgIpmiPetCommunityName <community string goes here>


Now, set up trap destination 1 to be your OpenNMS box:

%racadm config -g cfgIpmiPet -o cfgIpmiPetAlertEnable -i 1 1 
%racadm config -g cfgIpmiPet -o cfgIpmiPetAlertDestIPAddr -i 1 <your OpenNMS server's IP goes here>

Set up any other trap destinations you may wish by changing the index (-i) value to 2, 3, or 4, for example:

%racadm config -g cfgIpmiPet -o cfgIpmiPetAlertEnable -i <2,3, or 4> 1 
%racadm config -g cfgIpmiPet -o cfgIpmiPetAlertDestIPAddr -i <2,3, or 4> <other SNMP trap receiver's IP goes here>
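If you have several trap receivers, the index bookkeeping can be left to a loop. A minimal sketch, assuming the four-destination limit described here; the function name trap_dest_commands is hypothetical, and the commands are printed rather than executed.

```shell
#!/bin/sh
# Print the racadm commands that assign each IP given as an argument to
# trap destination 1, 2, 3, 4 in order. Extra addresses are skipped.
trap_dest_commands() {
    index=1
    for ip in "$@"; do
        if [ "$index" -gt 4 ]; then
            echo "only four trap destinations available; skipping $ip" >&2
            continue
        fi
        echo "racadm config -g cfgIpmiPet -o cfgIpmiPetAlertEnable -i $index 1"
        echo "racadm config -g cfgIpmiPet -o cfgIpmiPetAlertDestIPAddr -i $index $ip"
        index=$((index + 1))
    done
}
```

For example, trap_dest_commands 192.0.2.10 192.0.2.11 prints the four commands for destinations 1 and 2 (the addresses are placeholders).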

Optionally, configure the DRAC cards to send email alerts. I'd rather get too many notifications that I've lost a power supply than miss one, so I configure the card to send me an email in addition to the alert I'll be getting from OpenNMS:

%racadm config -g cfgEmailAlert -o cfgEmailAlertEnable -i 1 1
%racadm config -g cfgEmailAlert -o cfgEmailAlertAddress -i 1 <e-mail_address>
%racadm config -g cfgEmailAlert -o cfgEmailAlertCustomMsg -i 1 <custom_message>

The custom message is something that will be added to every email the DRAC sends to this recipient. Just like the SNMP traps, you can have up to four recipients set up for email alerts. Just change the -i values to 2, 3, or 4 as appropriate.

Set your SMTP server (be sure you've allowed relaying from the DRAC's IP):

%racadm config -g cfgRemoteHosts -o cfgRhostsSmtpServerIpAddr <your SMTP server address>

I like to set my DRAC card DNS names so that something meaningful to me, rather than the DRAC card's service tag, shows up in emails:

%racadm config -g cfgLanNetworking -o cfgDNSRacName <some meaningful name>

Alright, now your Dell servers are all set up and should be sending traps to your OpenNMS box.

I wrote a Perl script to configure my DRACs. It makes all the above changes and sends a test email and a test trap.

Download and tweak to suit:

http://www.allpoints.net/opennms/configdrac.pl

Events and Alarms

I wanted every OpenManage event to be turned into an alarm. In order to do that, I needed to modify my $OPENNMS_DIR/events/DellOpenManage.events.xml and create reduction keys for each event. I wanted to take advantage of the ability of OpenNMS to automatically clear alarms, so I made my reduction keys a combination of a short form of the UEI and the node ID.

For example, take these two chassis intrusion events:

<uei>uei.opennms.org/vendor/Dell/traps/alertChassisIntrusionNormal</uei>
<alarm-data reduction-key="uei.opennms.org/vendor/Dell/traps/alertChassisIntrusion:%nodeid%" alarm-type="2" auto-clean="false" />

<uei>uei.opennms.org/vendor/Dell/traps/alertChassisIntrusionDetected</uei>
<alarm-data reduction-key="uei.opennms.org/vendor/Dell/traps/alertChassisIntrusion:%nodeid%" alarm-type="1" auto-clean="false" />

The reduction key for them is the same. I just set the alarm-type to "1" on the "intrusion detected" event to create the alarm, and to "2" on the "intrusion normal" event to automatically clear the alarm.
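The "short form of the UEI" is just the UEI with its state suffix removed, so a raising event and its clearing event end up with the same key. Here is a rough sketch of that derivation. It is not the mkalarms.pl script; the function name reduction_key and the suffix list are assumptions, and real event files will contain names that need the manual tweaking mentioned below.

```shell
#!/bin/sh
# Derive a shared reduction key from a UEI by dropping one trailing state
# word and appending the node ID placeholder.
# ASSUMPTION: the suffix list is illustrative, not exhaustive.
reduction_key() {
    uei=$1
    for suffix in NonRecoverable Normal Detected Warning Failure; do
        case "$uei" in
            *"$suffix")
                uei=${uei%"$suffix"}   # strip the matching suffix once
                break
                ;;
        esac
    done
    # %% in the printf format prints a literal percent sign
    printf '%s:%%nodeid%%\n' "$uei"
}
```

Both chassis intrusion UEIs above reduce to uei.opennms.org/vendor/Dell/traps/alertChassisIntrusion:%nodeid% with this function.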

I used this script to kind of automate that process, but I still had to do some manual tweaking:

http://www.allpoints.net/opennms/mkalarms.pl

The resulting file (manual tweaks included) is here, if you'd like to just download and tweak it for your own use:

http://www.allpoints.net/opennms/DellOpenManage.events.xml

Notifications

Each OpenManage event that you want a notification for must be set up in your notifications.xml file.

I wrote a quick Perl script to grab all the UEIs from the $OPENNMS_DIR/events/DellOpenManage.events.xml file and create notifications for them. With some tweaking, it would probably work with other events files too, and save you some typing:

http://www.allpoints.net/opennms/mkdelltrapnots.pl

You use it something like this:

grep uei events/DellOpenManage.events.xml | ./mkdelltrapnots.pl >> notifications.xml ; sed -i "s/<\/notifications>//" notifications.xml ; echo "</notifications>" >> notifications.xml
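The one-liner edits notifications.xml in place, so a mistake leaves you with a broken file. A slightly more careful variant keeps a backup and writes the closing tag exactly once. This is a sketch only: the function name merge_notifications is hypothetical, and it assumes the file ends with a single </notifications> tag.

```shell
#!/bin/sh
# Insert the XML read from stdin just before the closing </notifications>
# tag of the given file. The original is kept as <file>.bak.
merge_notifications() {
    file=$1
    new=$(mktemp) || return 1
    cat > "$new"                          # new <notification> elements
    cp "$file" "$file.bak" || return 1
    {
        sed 's/<\/notifications>//' "$file.bak"   # drop the old closing tag
        cat "$new"
        echo '</notifications>'                   # add it back at the end
    } > "$file"
    rm -f "$new"
}
```

Used with the script above, that would look like: grep uei events/DellOpenManage.events.xml | ./mkdelltrapnots.pl | merge_notifications notifications.xml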

Ugly but effective.

Here's a notifications file that's all set up to send to the Email-Admin destination path:

http://www.allpoints.net/opennms/notifications.xml

Testing

Once you've got your config files all set up, restart OpenNMS, and it's time to test.

Make sure that the node you're testing from has been discovered, or the notifications will never go out.

Here's how to test from a DRAC card:

%racadm testtrap -i 1

From the Web UI, click on Events and you should see something like this in the event history:

uei.opennms.org/vendor/Dell/traps/alertTestTrap

If everything's working right, you should also have an alarm with the same information, and you should receive the alerts via your configured means.

To test from your servers, make sure net-snmp-utils is installed, then use this script to generate Dell traps:

http://www.allpoints.net/opennms/faketrap

Use it like this:

./faketrap <specific>

Where <specific> is the specific field from the trap you want to generate. For example:

./faketrap 1103

will generate a cooling device warning trap (generating an alarm, if everything is set up right), and:

./faketrap 1102

will generate a cooling device normal trap (clearing the alarm).
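If you can't fetch the script, something similar can be put together with snmptrap from net-snmp-utils. The sketch below is not the faketrap script itself. It assumes the traps are SNMPv1 enterprise-specific traps under the Dell MIB OID used in snmpd.conf above; NMS_HOST, COMMUNITY, and the function name are placeholders; and it sends no varbinds, so the resulting event will be missing the details a real OpenManage trap carries.

```shell
#!/bin/sh
# Print an snmptrap command for the given specific trap number.
# ASSUMPTIONS: SNMPv1, enterprise OID .1.3.6.1.4.1.674.10892.1, no varbinds.
fake_trap_command() {
    specific=$1
    nms_host=${NMS_HOST:-opennms.example.org}   # placeholder host
    community=${COMMUNITY:-public}              # placeholder community
    enterprise=.1.3.6.1.4.1.674.10892.1
    # v1 argument order: enterprise, agent, generic, specific, uptime.
    # Generic 6 means enterpriseSpecific; the empty agent and uptime
    # arguments let snmptrap fill in its defaults.
    echo "snmptrap -v 1 -c $community $nms_host $enterprise '' 6 $specific ''"
}
```

Review the output of fake_trap_command 1103, then run it with fake_trap_command 1103 | sh once NMS_HOST and COMMUNITY match your setup.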


Good luck, and have fun!

Parks 18:22, 29 August 2009 (UTC)