Monitoring Nginx with the HTTP collector

Thanks to everyone in #opennms on irc.freenode.net (especially _indigo). This document was copied and adapted from the Apache HTTP collector document (thanks to its authors too).

Nginx configuration

Inside your nginx config file, you need to enable the status page:

        location /nginx_status {
            stub_status on;
            access_log  off;
            # Replace IP-OpenNMS with the IP address of your OpenNMS server
            allow IP-OpenNMS;
            deny all;
        }

You will need to restart the nginx web server for the changes to take effect.

Nginx Status

This is typical output from http://server/nginx_status:

Active connections: 20
server accepts handled requests
 79933 79933 80210 
Reading: 10 Writing: 1 Waiting: 3

We can use the HTTP collector's regular expression capabilities to collect these metrics and tuck them away in RRDs for graphing and thresholding.
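
To see what that extraction looks like, here is a minimal Python sketch that applies the expression from the http-collection definition further down this page to the sample output above. It is only a way to try the pattern by hand: Python's re module stands in for the Java regular expressions OpenNMS uses, matching against the whole response is an assumption about how the collector applies the pattern, and the names SAMPLE, ALIASES and parse_status are made up for this example.

```python
import re

# Same expression as the matches= attribute in the http-collection definition.
STATUS_RE = re.compile(
    r"(?s).*Active connections.\s([0-9]+).*\n.([0-9]+)\s([0-9]+)\s([0-9]+)"
    r".\nReading:\s([0-9]+)\sWriting:\s([0-9]+)\sWaiting.\s([0-9]+).*"
)

# Attribute aliases in match-group order (groups 1 to 7).
ALIASES = (
    "nginxActive", "nginxAccepts", "nginxHandled", "nginxRequests",
    "nginxReading", "nginxWriting", "nginxWaiting",
)

# The sample status page shown above, including the trailing space
# after the third counter.
SAMPLE = (
    "Active connections: 20\n"
    "server accepts handled requests\n"
    " 79933 79933 80210 \n"
    "Reading: 10 Writing: 1 Waiting: 3\n"
)


def parse_status(body):
    """Return a dict of alias -> int, or None if the page does not match."""
    match = STATUS_RE.fullmatch(body)
    if match is None:
        return None
    return dict(zip(ALIASES, (int(value) for value in match.groups())))


print(parse_status(SAMPLE))
```

Note that the pattern expects exactly one character between the third counter and the end of its line (the trailing space in the sample), so a copy of the page with trailing whitespace stripped will not match.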

Configuring service discovery

I assumed that any machine that responds with an HTTP 200 response code when asked for /nginx_status on port 80 is running nginx. Here is the relevant part of my capsd-configuration.xml:

  <protocol-plugin protocol="nginx" class-name="org.opennms.netmgt.capsd.plugins.HttpPlugin"  scan="on" user-defined="false">
      <property key="port" value="80" />
      <property key="timeout" value="3000" />
      <property key="retry" value="2" />
      <property key="url" value="/nginx_status" />
  </protocol-plugin>

This defines a new service, "nginx". The name is not critical, but it does need to be consistent. I did not define a poller monitor for the service; all I needed was to discover the service so that I could use its name later in my data collection configuration.

At this point, I restarted OpenNMS and rescanned a node that I knew offered the service to ensure that the service could be discovered. Sure enough, the service showed up as "Not Monitored" in the appropriate node view.
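
If a node does not show the service after a rescan, it can help to repeat the discovery assumption by hand from the OpenNMS server. The following is a rough Python sketch of that check, not the plugin's actual logic: it only tests for an HTTP 200 answer on the status URL, and the function name responds_with_200 and its defaults are invented for this example.

```python
import http.client


def responds_with_200(host, port=80, path="/nginx_status", timeout=3.0):
    """Return True if a GET for path on host:port answers with HTTP 200."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        return conn.getresponse().status == 200
    except (OSError, http.client.HTTPException):
        # Refused connection, timeout, DNS failure or a malformed reply:
        # treat all of these as "service not present".
        return False
    finally:
        conn.close()


# Example call (the host name is a placeholder):
#   responds_with_200("server")
```

A False result from the OpenNMS server usually points at the allow/deny rules in the nginx location block rather than at OpenNMS itself.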

collectd-configuration.xml

Now I need to add the service to a package in collectd-configuration.xml:

  <service name="nginx" interval="300000" user-defined="false" status="on">
      <parameter key="collection" value="nginx_stats" />
      <parameter key="retry" value="1" />
      <parameter key="timeout" value="2000" />
  </service>


Note that the service name must match the service name in capsd-configuration.xml. The collection parameter value is used later on in http-datacollection-config.xml. Further down the file, outside of the package definitions, I added a service to class mapping for the service:

<collector service="nginx" class-name="org.opennms.netmgt.collectd.HttpCollector" />

http-datacollection-config.xml

Add this collection to http-datacollection-config.xml:

  <http-collection name="nginx_stats">
    <rrd step="300">
      <rra>RRA:AVERAGE:0.5:1:8928</rra>
      <rra>RRA:AVERAGE:0.5:12:8784</rra>
      <rra>RRA:MIN:0.5:12:8784</rra>
      <rra>RRA:MAX:0.5:12:8784</rra>
    </rrd>
    <uris>
      <uri name="nginx_stats">
        <url path="/nginx_status/"
             user-agent="OpenNMS HTTP Datacollection" 
             matches="(?s).*Active connections.\s([0-9]+).*\n.([0-9]+)\s([0-9]+)\s([0-9]+).\nReading:\s([0-9]+)\sWriting:\s([0-9]+)\sWaiting.\s([0-9]+).*" response-range="100-399">
        </url>
        <attributes>
          <attrib alias="nginxActive" match-group="1" type="gauge32"/>
          <attrib alias="nginxAccepts" match-group="2" type="counter32"/>
          <attrib alias="nginxHandled" match-group="3" type="counter32"/>
          <attrib alias="nginxRequests" match-group="4" type="counter32"/>
          <attrib alias="nginxReading" match-group="5" type="gauge32"/>
          <attrib alias="nginxWriting" match-group="6" type="gauge32"/>
          <attrib alias="nginxWaiting" match-group="7" type="gauge32"/>
        </attributes>
      </uri>
    </uris>
  </http-collection>
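
The rrd step of 300 seconds matches the 300000 ms collection interval set in collectd-configuration.xml, and each RRA line follows the RRDtool form RRA:CF:xff:steps:rows. As a quick sanity check of how much history these archives hold, here is a small Python sketch of the arithmetic; the helper name describe is hypothetical and exists only for this example.

```python
# Step and archives copied from the <rrd> element above.
STEP_SECONDS = 300
RRAS = [
    "RRA:AVERAGE:0.5:1:8928",
    "RRA:AVERAGE:0.5:12:8784",
    "RRA:MIN:0.5:12:8784",
    "RRA:MAX:0.5:12:8784",
]


def describe(rra, step=STEP_SECONDS):
    """Split an RRA definition and compute its resolution and retention."""
    _tag, cf, _xff, steps, rows = rra.split(":")
    resolution = step * int(steps)            # seconds covered by one row
    retention_days = resolution * int(rows) / 86400
    return {"cf": cf, "resolution_s": resolution, "days": retention_days}


for rra in RRAS:
    info = describe(rra)
    print(f"{info['cf']:8} one point per {info['resolution_s']:5d} s, "
          f"kept for {info['days']:.0f} days")
```

With these values the first archive keeps 5-minute averages for 31 days, and the other three keep hourly average, minimum and maximum values for 366 days.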

Drawing the Graphs

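Edit the file snmp-graph.properties. Add the three report names to the existing reports list (the last entry in the list must not be followed by a comma and a continuation backslash, or the next line will be read as part of the list), then add the report definitions:

reports=nginx.server.connections, nginx.server.stats, nginx.server.active
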
report.nginx.server.connections.name=NGINX Connections
report.nginx.server.connections.columns=nginxReading, nginxWriting, nginxWaiting
report.nginx.server.connections.type=nodeSnmp
report.nginx.server.connections.command=--title="NGINX Connections" \
 --vertical-label="count" \
 --lower-limit 0 \
 DEF:avgReading={rrd1}:nginxReading:AVERAGE \
 DEF:minReading={rrd1}:nginxReading:MIN \
 DEF:maxReading={rrd1}:nginxReading:MAX \
 DEF:avgWriting={rrd2}:nginxWriting:AVERAGE \
 DEF:minWriting={rrd2}:nginxWriting:MIN \
 DEF:maxWriting={rrd2}:nginxWriting:MAX \
 DEF:avgWaiting={rrd3}:nginxWaiting:AVERAGE \
 DEF:minWaiting={rrd3}:nginxWaiting:MIN \
 DEF:maxWaiting={rrd3}:nginxWaiting:MAX \
 AREA:avgReading#0000ff:"Reading" \
 GPRINT:avgReading:AVERAGE:"Avg \\: %10.2lf %s" \
 GPRINT:minReading:MIN:"Min \\: %10.2lf %s" \
 GPRINT:maxReading:MAX:"Max \\: %10.2lf %s\\n" \
 STACK:avgWriting#ff0000:"Writing" \
 GPRINT:avgWriting:AVERAGE:"Avg \\: %10.2lf %s" \
 GPRINT:minWriting:MIN:"Min \\: %10.2lf %s" \
 GPRINT:maxWriting:MAX:"Max \\: %10.2lf %s\\n" \
 STACK:avgWaiting#ffba00:"Waiting" \
 GPRINT:avgWaiting:AVERAGE:"Avg \\: %10.2lf %s" \
 GPRINT:minWaiting:MIN:"Min \\: %10.2lf %s" \
 GPRINT:maxWaiting:MAX:"Max \\: %10.2lf %s\\n"
 
report.nginx.server.stats.name=NGINX Statistics
report.nginx.server.stats.columns=nginxAccepts, nginxHandled, nginxRequests
report.nginx.server.stats.type=nodeSnmp
report.nginx.server.stats.command=--title="NGINX Statistics" \
 --vertical-label="count" \
 --lower-limit 0 \
 DEF:avgAccept={rrd1}:nginxAccepts:AVERAGE \
 DEF:minAccept={rrd1}:nginxAccepts:MIN \
 DEF:maxAccept={rrd1}:nginxAccepts:MAX \
 DEF:avgHandled={rrd2}:nginxHandled:AVERAGE \
 DEF:minHandled={rrd2}:nginxHandled:MIN \
 DEF:maxHandled={rrd2}:nginxHandled:MAX \
 DEF:avgRequests={rrd3}:nginxRequests:AVERAGE \
 DEF:minRequests={rrd3}:nginxRequests:MIN \
 DEF:maxRequests={rrd3}:nginxRequests:MAX \
 AREA:avgAccept#0000ff:"Accept" \
 GPRINT:avgAccept:AVERAGE:"  Avg \\: %10.2lf %s" \
 GPRINT:minAccept:MIN:"Min \\: %10.2lf %s" \
 GPRINT:maxAccept:MAX:"Max \\: %10.2lf %s\\n" \
 STACK:avgHandled#ff0000:"Handled" \
 GPRINT:avgHandled:AVERAGE:" Avg \\: %10.2lf %s" \
 GPRINT:minHandled:MIN:"Min \\: %10.2lf %s" \
 GPRINT:maxHandled:MAX:"Max \\: %10.2lf %s\\n" \
 STACK:avgRequests#ffba00:"Requests" \
 GPRINT:avgRequests:AVERAGE:"Avg \\: %10.2lf %s" \
 GPRINT:minRequests:MIN:"Min \\: %10.2lf %s" \
 GPRINT:maxRequests:MAX:"Max \\: %10.2lf %s\\n"
 
report.nginx.server.active.name=NGINX Active Connections
report.nginx.server.active.columns=nginxActive
report.nginx.server.active.type=nodeSnmp
report.nginx.server.active.command=--title="NGINX active connections" \
 --vertical-label="count" \
 --lower-limit 0 \
 DEF:avgActive={rrd1}:nginxActive:AVERAGE \
 DEF:minActive={rrd1}:nginxActive:MIN \
 DEF:maxActive={rrd1}:nginxActive:MAX \
 AREA:avgActive#0000ff:"Active" \
 GPRINT:avgActive:AVERAGE:"Avg \\: %10.2lf %s" \
 GPRINT:minActive:MIN:"Min \\: %10.2lf %s" \
 GPRINT:maxActive:MAX:"Max \\: %10.2lf %s\\n"
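
The graph definitions refer to the collected attributes by alias, so a typo in a columns line is an easy mistake to make when adapting them. Below is a small Python sketch that cross-checks the columns entries against the aliases from the http-collection definition. It is an illustration only: it reads just the report.*.columns lines, and the names GRAPH_PROPERTIES, report_columns and unknown_columns are invented for this example.

```python
# Aliases defined in the <attributes> element of the http-collection.
ALIASES = {
    "nginxActive", "nginxAccepts", "nginxHandled", "nginxRequests",
    "nginxReading", "nginxWriting", "nginxWaiting",
}

# The three columns lines from the report definitions above.
GRAPH_PROPERTIES = """\
report.nginx.server.connections.columns=nginxReading, nginxWriting, nginxWaiting
report.nginx.server.stats.columns=nginxAccepts, nginxHandled, nginxRequests
report.nginx.server.active.columns=nginxActive
"""


def report_columns(text):
    """Map each report name to the list of columns it declares."""
    columns = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        key = key.strip()
        if sep and key.startswith("report.") and key.endswith(".columns"):
            name = key[len("report."):-len(".columns")]
            columns[name] = [c.strip() for c in value.split(",") if c.strip()]
    return columns


def unknown_columns(text, aliases):
    """Return {report name: [columns that are not known aliases]}."""
    problems = {}
    for name, cols in report_columns(text).items():
        missing = [c for c in cols if c not in aliases]
        if missing:
            problems[name] = missing
    return problems


print(unknown_columns(GRAPH_PROPERTIES, ALIASES) or "all columns match an alias")
```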