The post SNMP Network Monitoring with OpenNMS Meridian appeared first on The OpenNMS Group, Inc.
In this blog post, I'll cover how OpenNMS Meridian can leverage SNMP network monitoring. SNMP, or Simple Network Management Protocol, follows a client-server model in which clients, or agents, report information back to a central management server: Meridian, in our case. The management server can query specific information and events from the agents.
By default, Meridian collects information from SNMP devices on port 161 and uses a read community string of “public.” As a best practice, though, community strings should be changed to something unique on your network devices. Meridian provides support for SNMP versions 1, 2c, and 3.
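As a rough sketch of how this looks in practice, Meridian's SNMP settings live in an XML configuration file. The fragment below is illustrative only: the community string and addresses are made up, and attribute names and defaults may differ by version, so check your release's documentation.

```xml
<!-- Illustrative SNMP configuration fragment. Replace "public" with a
     unique read community string, per the best practice above. -->
<snmp-config port="161" retry="3" timeout="1800"
             read-community="your-unique-string" version="v2c">
    <!-- Per-device override: one legacy device still speaking SNMPv1 -->
    <definition version="v1" read-community="legacy-string">
        <specific>192.168.1.10</specific>
    </definition>
</snmp-config>
```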
SNMP is a protocol for collecting, managing, and organizing information for network devices. It's a common standard for network monitoring. SNMP can help network administrators identify overloaded network connections and get alerts on hardware status, temperature, and potential failures.
Imagine you're a network administrator for a regional internet service provider, and you've got a group of customers who are complaining about intermittent service. It's possible to use SNMP data and Meridian to find out which connections are problematic. SNMP can also be used with servers to get information on resource usage, such as disk consumption and memory utilization. It is also used in data centers to get information from power supplies, AC units, and many other systems.
SNMP has five major components:
It might be useful to think about SNMP by imagining you're the coach or manager of a sports team. The SNMP manager, that's you. You're responsible for managing and monitoring the team's performance and making changes when necessary. SNMP agents are the individual players on your team. Each player or agent provides specific information about their performance to you, the manager, when requested.
You can think of MIB, or the management information base, as a playbook or record that helps maintain information about the players. It contains information like positions, stats, and potential actions the players might take in the game.
SNMP commands are like the instructions or requests you give to the players during the game. For instance, if you want to know how many goals a specific player has scored, you ask for that information. SNMP traps are like unsolicited updates from players during the game. For example, if a player gets injured, they notify you immediately so that you can make substitutions or adjust your strategy accordingly. For SNMP v3, the terms manager and agent change to command generator and command responder.
By default, OpenNMS Meridian will discover and begin collecting SNMP network data from managed nodes using the default community string "public" and Port 161. Meridian collects data every five minutes and stores it in RRD files. The collection interval can be changed to enable faster or slower collection. The data can also be stored in a time series database, which we'll explore later.
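To make the five-minute default concrete, the collection interval is expressed in milliseconds in the collector configuration. This is an illustrative fragment, not a complete configuration; package names and attributes are assumptions, so verify against your version's files.

```xml
<!-- Illustrative collector package fragment: interval is in milliseconds,
     so 300000 = the default five-minute collection cycle. -->
<package name="example1">
    <service name="SNMP" interval="300000" user-defined="false" status="on"/>
</package>
```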
The collectd or collection daemon within OpenNMS is responsible for collecting SNMP network data. Though we often refer to SNMP polling, collectd collects information and stores it within the RRDs. Pollerd or the poller daemon is used for service assurance and verifies the SNMP service is running on monitored devices.
Meridian can collect things like bits in and out, discards or errors, link quality (particularly useful for fiber optic links), system temperature, memory usage, storage space, and more, depending on what the device provides.
In addition to collecting data from SNMP devices, Meridian can also collect SNMP traps. Traps are events sent from devices with SNMP to a collector or a server, and Meridian can process them as events. The event definitions that Meridian uses come from the MIBs provided to it.
Events are structured historical records of things that happen in Meridian, along with the nodes, interfaces, and services it monitors. They are central to the operation of Meridian, and it's safe to assume that whenever something in Meridian appears to be working by magic, it's probably events working in the background. You can use events to send notifications, trigger automation, and translate them into other events.
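As a rough illustration of what an event definition looks like, here is a hedged sketch in the style of Meridian's XML event configuration. The UEI, label, and field values are invented for the example; real definitions are typically generated from vendor MIBs.

```xml
<!-- Illustrative event definition: the uei, label, and message here are
     made up for the example; real entries come from compiled MIBs. -->
<event>
    <uei>uei.opennms.org/vendor/example/fanFailure</uei>
    <event-label>Example: fan failure trap</event-label>
    <descr>A cooling fan on the device has failed.</descr>
    <logmsg dest="logndisplay">Fan failure reported by the device.</logmsg>
    <severity>Major</severity>
</event>
```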
Performance data in Meridian can be stored using time series databases. By default, Meridian uses RRD files to store collected data. For small to medium-sized customers and users just looking to kick the tires, RRDs are a great option. They're incredibly performant, take up little disk space, and work out of the box.
If users want to set up their own time series database, we provide a component for Apache Cassandra called Newts that can be leveraged. This requires installing and maintaining an Apache Cassandra cluster. We also provide community support for Cortex and compatible derivatives, such as Prometheus and Thanos.
Meridian overcomes the challenges of collecting large amounts of SNMP data by distributing collection across minions. Minions are lightweight, stateless services that add monitoring capacity and reduce the load on the Meridian core. Minions enable horizontal scaling of collection and can be deployed as virtual machines, containers, or even physical servers. Minions can also enable collection from overlapping subnets.
We've covered the insight and intelligence that can be gained by using OpenNMS Meridian with SNMP network monitoring. However, OpenNMS Meridian goes beyond just flows and SNMP. It can ingest telemetry, syslog data, WMI, JDBC, JMX, and much more. OpenNMS Meridian is the ideal platform to provide holistic insight into your network.
If you just want to try it yourself, look at OpenNMS Horizon, our open-source, community-supported edition, where we constantly add product enhancements before they reach stability in our enterprise-grade product, Meridian.
The post March 2024 Releases – Horizon 33.0.2, 2023.1.14, 2022.1.26, 2021.1.37 appeared first on The OpenNMS Group, Inc.
In March, we released updates to all OpenNMS Meridian versions under active support, as well as Horizon 33.0.2.
Meridian releases 2023.1.14, 2022.1.26, and 2021.1.37 contain a bunch of bug fixes and enhancements.
For a list of changes, see the release notes:
Release 33.0.2 contains a bunch of bug fixes and enhancements.
For a high-level overview of what has changed in Horizon 33, see What’s New in OpenNMS Horizon 33.
For a complete list of changes, see the changelog.
The codename for Horizon 33.0.2 is Old Man's Beard.
The post Enterprise Network Monitoring with Flow Data appeared first on The OpenNMS Group, Inc.
OpenNMS Meridian is a powerful enterprise network monitoring solution, thanks in part to its flexibility and ability to connect to and monitor a wide range of network devices through network flow data. By providing comprehensive insights into device status, traffic patterns, and performance metrics, Meridian empowers your organization to make data-driven decisions.
The platform’s ability to conduct detailed health checks on network devices and generate customized alerts ensures prompt issue resolution and minimal downtime. And its advanced analytics capabilities enable anomaly detection, safeguarding networks against security threats like DDoS attacks.
This blog digs into the concepts of network flow data and how OpenNMS Meridian ingests and integrates it to deliver a robust network monitoring solution.
OpenNMS Meridian is not just a monitoring tool but a strategic asset for the enterprise, ensuring networks are not only healthy and secure but also well-prepared for future growth and challenges.
Network flow data is a way to collect the meta information of a packet entering or leaving a network device. OpenNMS can leverage that metadata with the help of analytics to provide insight on sources, destinations, types of traffic, and the quality of service.
You can think of flow data like airport departure and arrival logs. Consider a network as an airport where data packets are like departing and arriving flights. Flows can be likened to the departure and arrival logs that track various details of each flight, such as the airline, flight number, departure city, arrival city, passenger count, and flight duration.
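The flight-log analogy above maps naturally onto a data structure. This is a simplified sketch for illustration only; real flow records (NetFlow, IPFIX, sFlow) carry many more fields, and the names here are invented.

```python
from dataclasses import dataclass

@dataclass
class FlowRecord:
    """Metadata about one network conversation: a line in the 'flight log'."""
    src_addr: str      # "departure city"
    dst_addr: str      # "arrival city"
    src_port: int
    dst_port: int
    protocol: str      # e.g. "TCP" or "UDP"
    bytes_sent: int    # "passenger count"
    duration_ms: int   # "flight duration"

# One flow: a host talking to a web server over HTTPS.
flow = FlowRecord("10.0.0.5", "192.0.2.80", 51234, 443, "TCP", 18_432, 950)
print(flow.dst_port, flow.bytes_sent)
```

Note that a flow record describes traffic about a conversation, not its payload, which is what makes flow data so compact to collect and store.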
Network engineers face many challenges. It can be difficult to diagnose performance issues, especially in real-time, as users encounter them. A network engineer must look at many aspects of the potential issue. Is this an application issue, or is it the user's device, or could it actually be something on the network?
Even when it has been determined to be an issue with the network, an admin will typically go through a list of questions before they can even begin the troubleshooting process:
Network elements for data flows consist of exporters, devices like routers, switches, and firewalls; collectors, in our case OpenNMS Meridian; and management and analytics applications, also OpenNMS Meridian. It's useful to understand that while flows are often exported from networking equipment like routers and switches, flow data can also come directly from servers, which can export telemetry data through the sFlow protocol.
OpenNMS is a collector for data flows. It leverages something called telemetryd or the telemetry daemon to ingest flow information from exporters. The telemetry daemon provides a framework to handle sensor data pushed to Meridian. The framework supports applications that use different protocols to transfer metrics. By default, we use a single port listener, which works for many flow protocols like jFlow, sFlow, and several versions of NetFlow, including IPFIX. With telemetryd, operators can define listeners supporting different protocols to receive the telemetry data and adapters transferring the received data into generic formats like flows or performance data.
OpenNMS stores flow records into Elasticsearch using an OpenNMS-created plugin that installs into your Elasticsearch cluster. You can set up persistence policy for Elasticsearch to only keep flow records for a specific period of time. OpenNMS enriches flow records by using information it already has about systems and its inventory. It tags data flows and groups them based on rules. OpenNMS can leverage node data and the metadata associated with nodes like categories to enrich flow records. This enrichment adds context to speed time to resolution when troubleshooting and can enhance forensic network analysis.
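One common way such a retention policy is implemented, shown here as a generic Elasticsearch sketch rather than OpenNMS-specific guidance, is an index lifecycle management (ILM) policy that deletes flow indices after a set age, e.g. via `PUT _ilm/policy/flow-retention`:

```json
{
  "policy": {
    "phases": {
      "delete": {
        "min_age": "30d",
        "actions": { "delete": {} }
      }
    }
  }
}
```

The policy name and 30-day window are illustrative; attach the policy to your flow index template and size the window to your storage and forensic-analysis needs.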
OpenNMS uses a classification engine that applies rules to filter and classify flows. The flow classification engine bundled with Meridian is adapted from IANA standards and includes a predefined set of rules that define more than 6,200 applications for basic communication protocols. You can classify flows by a combination of parameters, including source and destination ports, source and destination addresses, IP protocol, and exporter.
Conversations can be identified from classified flow traffic between a set of hosts, together with quality-of-service information provided by DSCP (Differentiated Services Code Point), and OpenNMS can enrich this with information about the specific hosts involved in a flow. Classifications help you determine how flows are associated with a particular application, service, or other component and how they affect your network.
For example, you might classify Bitcoin traffic on port 8333, or mark all flows to port 80 as HTTP. Meridian allows for customized rules to classify flows. A rule includes a name for the classification or application and additional parameters, such as source and destination ports and addresses, that must match.
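The idea behind such rules can be sketched in a few lines. This is a simplified illustration of rule-based classification, not Meridian's actual engine, and the rule set here is invented for the example.

```python
# Simplified flow-classification sketch: first matching rule wins.
# Rule fields and the rule set itself are illustrative, not Meridian's.
RULES = [
    {"name": "http",    "dst_port": 80,   "protocol": "TCP"},
    {"name": "https",   "dst_port": 443,  "protocol": "TCP"},
    {"name": "bitcoin", "dst_port": 8333, "protocol": "TCP"},
]

def classify(flow: dict) -> str:
    """Return the application name for a flow, or 'unclassified'."""
    for rule in RULES:
        # Every rule field except the name must match the flow's fields.
        if all(flow.get(key) == value
               for key, value in rule.items() if key != "name"):
            return rule["name"]
    return "unclassified"

print(classify({"dst_port": 443, "protocol": "TCP"}))  # https
```

A real engine adds address ranges, protocol numbers, and exporter filters, but the match-and-name structure is the same.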
OpenNMS overcomes the challenge of collecting large amounts of flow data by distributing collection across Minions. Minions are lightweight, stateless services used to add capacity and reduce the load on the Meridian core. Minions enable horizontal scaling of collection and can be deployed as virtual machines, containers, hardware appliances, or physical servers.
Meridian overcomes the challenges of enriching large amounts of flow data by distributing processing workload through sentinels. Sentinels are purpose-built modules that allow Meridian to scale flow enrichment horizontally by transparently offloading work and placing enriched flow records directly into Elasticsearch. Sentinels can be installed on containers, virtual machines, or physical hardware and have requirements similar to Meridian.
OpenNMS can scale to your network size needs. Some of our customers ingest and enrich over 350,000 flows per second.
In summary, Meridian can ingest many types of flow and telemetry data. With the analytics Meridian provides for sources, destinations, types of traffic, and quality of service, it's possible to gain holistic insight into your network. Using Meridian's powerful flow enrichment capabilities, you can quickly pinpoint the sources of networking issues and reduce time to resolution. OpenNMS Meridian can be deployed on a single host and can scale with your needs by using Minions and Sentinels to collect and enrich large amounts of data.
If you have a complex or demanding use case for flows, then get in touch. We're here to help.
The post Network Monitoring Solutions: Your Complete Guide appeared first on The OpenNMS Group, Inc.
In today's hyper-connected world, where billions of users interact with websites, applications, and services every second, network monitoring solutions are critical to ensuring your network operations stay consistent and reliable. When deployed effectively, they provide continuous surveillance of your network traffic, performance, and security, allowing your team to proactively identify potential trouble spots and address issues, from slow loading times to security threats, before they disrupt services.
With the right network monitoring solution in place, you can not only enhance the performance and security of your networks but also deliver a reliable and seamless internet experience for your users.
Network monitoring solutions give administrators real-time visibility into network activity, enabling them to detect and respond to anomalies promptly. By monitoring for unusual patterns or suspicious activity, network monitoring solutions can help prevent cyber-attacks such as malware infections, DDoS attacks, and data breaches, which can cripple an organization's online presence and reputation.
Network monitoring solutions can also help you optimize network performance by targeting and mitigating bottlenecks and congestion points. Network administrators can ensure that resources are allocated efficiently, preventing slowdowns and traffic blockages that can impede user experience.
Network monitoring solutions should include various features to empower network administrators to monitor and maintain their networks more efficiently.
Network Traffic Monitoring: Track the flow of data within a network, capturing information about the volume, sources, and destinations of traffic. This helps identify normal behavior patterns and anomalies that may indicate network issues or security threats. Administrators can detect and resolve issues such as bandwidth congestion, network bottlenecks, and unauthorized access attempts.
Performance Monitoring: Performance monitoring involves tracking the performance of network devices, servers, and applications to confirm they are operating efficiently. This includes monitoring metrics such as response times, packet loss, and resource utilization. This helps identify and address issues that can impact user experience, such as slow application performance, network latency, and server downtime. Administrators can optimize network resources, improve reliability, and enhance overall user satisfaction.
Security Monitoring: Security monitoring involves detecting and responding to security threats within the network. This includes monitoring for malware infections, unauthorized access attempts, and data breaches. Techniques include intrusion detection systems (IDS), intrusion prevention systems (IPS), and log analysis to identify and mitigate security threats. Administrators are empowered to protect sensitive data, maintain regulatory compliance, and prevent costly security breaches.
Alerting and Reporting: Alerting and reporting features notify administrators of critical events or issues within the network and provide detailed reports on network performance and security. Alerts can notify administrators via email, SMS, or other means when predefined thresholds are exceeded, or anomalies are detected. Reports provide valuable insights into network performance trends, security incidents, and service level agreements (SLAs) compliance. Timely alerts and reports enable administrators to respond quickly to issues, track network performance, and make informed network management and optimization decisions.
Configuration Management: Configuration management features help ensure network devices are optimized for performance and security. This includes managing device configurations, applying configuration changes, and auditing configurations for compliance with best practices and security policies. Configuration management helps reduce the risk of misconfigurations that can lead to network downtime, security vulnerabilities, and performance issues.
Scalability and Flexibility: Network monitoring solutions should accommodate the size and complexity of your network. This includes the ability to monitor a large number of devices and network segments, as well as the ability to integrate with other systems and tools. Administrators can adapt the monitoring environment to meet changing network requirements and integrate with existing network infrastructure and management tools. Monitoring capabilities should grow with the network and meet its evolving needs.
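The threshold-based alerting described above (notify administrators when a predefined threshold is exceeded) can be sketched in a few lines. This is a simplified illustration, not any particular product's implementation, and the metric names are invented.

```python
def check_thresholds(metrics: dict, thresholds: dict) -> list:
    """Return alert messages for every metric exceeding its threshold."""
    alerts = []
    for name, value in metrics.items():
        limit = thresholds.get(name)
        if limit is not None and value > limit:
            alerts.append(f"ALERT: {name}={value} exceeds threshold {limit}")
    return alerts

# Example: CPU is fine, but bandwidth utilization is over the line.
alerts = check_thresholds(
    {"cpu_pct": 40, "bandwidth_pct": 97},
    {"cpu_pct": 90, "bandwidth_pct": 85},
)
print(alerts)
```

Real solutions layer delivery (email, SMS), deduplication, and escalation on top, but the core check is this comparison against configured limits.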
Network monitoring solutions must collect and process tens of thousands of data points per second from a variety of network devices. And networks are not static: the volume of data to be processed increases as your network expands. This changes with fluctuations in network traffic, peak hours, and other factors. Network monitoring solutions that can scale dynamically to collect and process large volumes of data help administrators respond to the most current issues promptly.
These are just some example data sources:
Traffic Data: Network traffic, including the volume of traffic, the types of traffic (e.g., web traffic, email traffic), and the sources and destinations of traffic, lets administrators monitor network performance, detect anomalies, and identify potential security threats.
Performance Metrics: Performance metrics such as CPU usage, memory usage, disk I/O, and network bandwidth from network devices, including routers, switches, and servers, let administrators monitor the health and performance of network devices and identify and resolve performance issues.
Configuration Data: Configuration data from network devices such as routers and switches help administrators ensure that network devices are properly configured for optimal performance and security.
Event Logs: Event logs from network devices, such as syslog messages, provide information about events and activities on network devices, such as configuration changes, security incidents, and system errors. Event logs help administrators troubleshoot issues, monitor network activity, and detect security threats.
Security Data: Firewall logs and intrusion detection/prevention system (IDS/IPS) alerts help administrators monitor network security, detect and respond to threats, and ensure that security policies are enforced.
User Activity Data: User login/logout events and application usage help administrators monitor user behavior, track user activity, and detect unauthorized access attempts.
Application Performance Data: Some network monitoring solutions can collect data on application performance, such as response times, transaction times, and error rates. This data helps administrators monitor the performance of critical applications and identify and resolve issues that may impact application performance.
Virtualized Environment Data: Data from virtualized environments, such as hypervisors and virtual machines, helps administrators monitor the performance and health of virtualized resources and ensure they’re operating efficiently.
Cloud Services Data: Data from cloud service providers, such as performance metrics, usage data, and security logs, helps administrators monitor the performance, efficiency, and security of cloud services.
Quality of Service (QoS) Data: Network monitoring solutions may collect data on Quality of Service (QoS) metrics, such as latency, jitter, and packet loss, to ensure that QoS requirements are being met.
Network Topology Data: Network monitoring solutions may collect data on network topology, including the physical layout of network devices and the connections between devices. This data helps administrators visualize the network layout, identify future bottlenecks or single points of failure, and optimize network performance.
Network monitoring solutions typically consist of several key components that work together to enable network admins to monitor and manage their networks. The exact components may be different depending on the solution and its capabilities as well as the specific needs of the organization, but some common components include:
Data Collection Agents: These agents are installed on network devices to collect data such as performance metrics, traffic data, and event logs. Agents may be software-based or hardware-based, depending on the device and the monitoring solution.
Data Collection and Storage: Network monitoring solutions collect and store data from data collection agents. This data includes performance metrics, traffic data, event logs, and other relevant information. The data is typically stored in a database or data repository for analysis and reporting.
Data Analysis and Processing: Network monitoring solutions analyze and process the collected data to generate reports, alerts, and visualizations. This may involve applying algorithms and statistical analysis to the data to identify patterns, anomalies, and trends.
Alerting and Notification: Network monitoring solutions provide alerting and notification capabilities to alert administrators of critical events or issues. Alerts can notify administrators via email, SMS, or other means when predefined thresholds are exceeded, or anomalies are detected.
Reporting and Visualization: Network monitoring solutions provide reporting and visualization tools to help administrators understand network performance and status. Reports may include summaries of network performance, trends over time, and details of specific events or issues.
Configuration Management: Some network monitoring solutions include configuration management capabilities to manage the configuration of network devices. This may include backing up and restoring configurations, applying configuration changes, and auditing configurations for compliance with best practices and security policies.
Integration and APIs: Network monitoring solutions may offer integration with other systems and tools through APIs (Application Programming Interfaces). This allows administrators to integrate network monitoring data with other systems, such as ticketing systems, logging systems, and automation tools.
Regardless of the solution you choose, implementing a network monitoring solution successfully requires strategy, planning, and measuring results. These best practices are a good place to start.
Define Clear Monitoring Objectives: Determine what aspects of the network you need to monitor (e.g., performance, availability, security) and what specific metrics are most relevant to your organization's goals.
Regularly Review and Update Monitoring Configurations: Network monitoring is not a set-it-and-forget-it task. Regularly review and update your monitoring configurations to ensure they align with your organization's changing needs and priorities. This includes adding new devices to be monitored, updating monitoring thresholds, and adjusting alerting settings.
Monitor Key Performance Indicators (KPIs): Focus on monitoring key performance indicators (KPIs) that are most relevant to your organization's goals. This might include network uptime, response times, and bandwidth utilization. Monitoring KPIs can help you identify trends and make informed decisions about network optimization and resource allocation.
Implement Comprehensive Security Monitoring: Network monitoring solutions should include robust security monitoring capabilities to detect and respond to security threats. This includes monitoring for suspicious activity, such as unauthorized access attempts and malware infections, and implementing measures to protect against these threats.
Train Staff on How to Use the Monitoring Solution Effectively: Provide training to staff on how to use the network monitoring solution effectively. This includes understanding how to interpret monitoring data, how to respond to alerts, and how to use the solution's features to troubleshoot network issues.
Integrate Monitoring Data with Other Systems: Network monitoring data can provide valuable insights when integrated with other systems, such as ticketing systems, logging systems, and automation tools. This integration can streamline network management processes and help ensure a coordinated response to network issues.
Review and Improve Monitoring Processes: Regularly review your monitoring processes and look for ways to improve them. This might include adding new monitoring checks, refining alerting thresholds, or implementing new monitoring tools or techniques.
As your enterprise network edge expands with more devices, processes, services, and physical locations, so does the challenge of monitoring that distributed environment.
Security, privacy, reachability, and latency issues are more prevalent in highly distributed networks. This makes monitoring, collecting, and processing large volumes of data increasingly difficult.
Reaching and monitoring infrastructure, services, and applications located in remote sites within large, distributed networks can be challenging from a central location, such as a data center or the cloud. Specific roadblocks include firewalls, network address translation (NAT) traversal, overlapping IP address ranges, and locked-down environments.
A distributed environment requires an even more sophisticated and advanced network monitoring solution. It must be deployable in a dispersed configuration to reach into systems and networks that would otherwise be inaccessible while keeping the monitoring logic centralized for easier operation and administration.
To adapt to these changing environmental demands and ensure maximum uptime and optimal performance of your network, distributed network monitoring solutions should deliver:
OpenNMS Meridian provides a comprehensive enterprise network monitoring solution that empowers you to ensure the availability and performance of critical network services. It’s a highly scalable open-source network monitoring solution to address all your distributed network challenges:
The post SNMP: Why it’s Still the Best Monitoring Protocol pt. 2 appeared first on The OpenNMS Group, Inc.
SNMP transports messages using the User Datagram Protocol (UDP). Unlike the alternative, Transmission Control Protocol (TCP), UDP doesn't require establishing a connection with the receiver to deliver data. Network packets are simply sent to a destination without any connection being set up with the receiving network. The UDP connectionless protocol works more like the United States Post Office: a sender puts information in an envelope, addresses it, and the USPS drops the packet in the receiver's mailbox. The receiver doesn't need to be home, open the door, or let the deliverer in.
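The mailbox analogy maps directly onto how UDP looks in code: the sender never connects, it just addresses a datagram and drops it off. A minimal sketch using only the standard library (addresses and message contents are invented for the example):

```python
import socket

# "Mailbox": a receiver bound to a local address; no connections accepted.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))        # let the OS pick a free port
addr = receiver.getsockname()

# "Letter": the sender never connects; it just addresses the datagram.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello from the sender", addr)

data, _ = receiver.recvfrom(1024)      # pick the letter up from the mailbox
print(data)

sender.close()
receiver.close()
```

Compare this with TCP, where the same exchange would require a listen, a handshake, and a connected session held open on both sides for the duration.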
On the other hand, TCP connection-oriented protocol works as if the sender put the content in an envelope, drove to a house, knocked on the door, came inside, showed their credentials, then handed the receiver the letter and confirmed they understood it.
The security risks are easy to see here. How can you be certain the sender is who you think it is? How do you know it’s not Tom Cruise in a Mission Impossible mask or a Terminator impersonating someone you know? Ensuring safety in this scenario requires a level of authentication that you can simply avoid by never opening the door. In other words, allowing outsiders access to your network opens the door to risk.
In addition, SNMPv3 can include encryption on the contents of the envelope.
Non-UDP protocols rely on the authentication of the sender and receiver to ensure the information is delivered to the right place, but the data itself is not protected. SNMPv3 can provide connectionless delivery with data encryption on the contents, making both the delivery and the data more secure.
Two SNMP features we discussed earlier also make the protocol more scalable: its consistent, standard data structure and its connectionless transfer format.
Getting started with a non-standard protocol may be easy and fast, but scaling and maintaining it becomes harder and harder. Other protocols available today don't follow a universal standard, which makes them difficult to scale. Prometheus, for example, uses its own encoding format for its own purposes, but that structure isn't formally standardized across tools, so a sender could transmit almost anything, and the receiver has to do all the work on its side to unravel and understand the information.
The connectionless UDP communication model also supports scalability by enabling asynchronous delivery. In a connection-oriented protocol like TCP, the sender and receiver exchange communication parameters, and then a session is opened in each network and firewall and maintained until the conversation is concluded. This means the sender asks for something and waits for a response, asks for something else and waits again, and so on. Keeping multiple connections open ties up network and receiver resources and time. A single company could be maintaining 300,000 to 400,000 open connections at any one time. That's a massive overhead on the network.
In an asynchronous format like UDP, the firewalls and routers just send the packet and forget about it. When the other side receives it and needs to respond, it simply sends the response. The firewalls and networks aren't maintaining sessions; no one is waiting. That overhead disappears because SNMP and UDP (and OpenNMS) are not connection-oriented and are asynchronous in nature.
We hope you’ve found this review of the SNMP standard enlightening. This sometimes taken-for-granted protocol still has much to offer for today's network monitoring challenges. While bandwidth may be bountiful now, it never hurts to leverage lightweight solutions like SNMP to avoid ever worrying about consumption. Its highly structured format lets network monitoring tools know exactly what to expect, and its connectionless communication format supports security and scalability.
That's a ton of SNMP network monitoring value in one very small packet.
The post February 2024 Releases – Horizon 33.0.1, 2023.1.13, 2022.1.25, 2021.1.36 appeared first on The OpenNMS Group, Inc.
In February, we released updates to all OpenNMS Meridian versions under active support, as well as Horizon 33.0.1.
Meridians 2023.1.13, 2022.1.25, and 2021.1.36 contain a bunch of small bug fixes along with other improvements.
For a list of changes, see the release notes:
Release 33.0.1 is a re-release of 33.0.0 with a rollback of a change related to asynchronous polling. Horizon 33 contains a bunch of changes, including metadata support in many more configs, a revamped node list, and more...
For a high-level overview of what has changed in Horizon 33, see What’s New in OpenNMS Horizon 33.
For a complete list of changes, see the changelog.
The codename for Horizon 33.0.1 is Coast Redwood.
The post February 2024 Releases – Horizon 33.0.1, 2023.1.13, 2022.1.25, 2021.1.36 appeared first on The OpenNMS Group, Inc..
]]>The post SNMP: Why it’s Still the Best Monitoring Protocol pt. 1 appeared first on The OpenNMS Group, Inc..
]]>When networking was in its infancy, as with many new technologies, there were competing approaches to building the infrastructure that made the physical connections, as well as the protocols that ran over those connections. Novell and Microsoft, for example, were big proponents of their own protocols. But in the end, the common Internet Protocol, or IP, standard won out, and with it came a big boom in the networking industry.
Once there was a connectivity standard, the need for a complementary standard for monitoring the data and programs moving over networks became clear. A group of network engineers created a task force to develop, publish, and ratify the standard that became SNMP or “Simple Network Management Protocol.” By 1990, the Internet Architecture Board (IAB) approved SNMP v1 as the Internet standard for network management but continued to optimize the standard through more iterations:
Some of us already know that the "S" stands for "simple," but why was simple a critical component? When SNMP was designed 40 years ago, bandwidth was at a premium. In 1984, total global Internet traffic was 15 Gigabytes per month. Monitoring network traffic could be a little like quantum mechanics: because monitoring tools were so heavy, you couldn’t observe the traffic without affecting the results.
Back in the late 90s, monitoring could make up 20% of network traffic. That’s why the protocol had to be so lightweight: you couldn’t get people to use the protocol or even monitor the network if monitoring took up too much bandwidth.
Today, SNMP traffic doesn't even appear in monitoring reports because its footprint is so small compared to the available bandwidth and network usage. That’s the advantage of using a protocol designed when bandwidth was at a premium in a time when bandwidth is a commodity.
“Simple” means the protocol is lightweight in terms of bandwidth, but it turns out it was really only considered simple by network engineers. Monitoring is a complex subject, and no matter how you approach the task, that complexity has to go somewhere. If you simplify the network communications, the complexity gets pushed out to the applications doing the monitoring.
SNMP’s standard specifies five core protocol data units (PDUs) – in SNMPv1: GetRequest, GetNextRequest, SetRequest, GetResponse, and Trap – that make data easier to parse on the receiver or application side. Network engineers know every field of the packet and expect it to be structured in a specific way: if there's a certain byte here and another byte there, they know exactly what each byte means and can decode it on the receiving end. Users must understand how to structure and encode the request, get it into that packet, send it over, and then, on the receiving side, do the same thing in reverse. Setting up this structured packet requires some complex work.
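To make the "every byte in a known position" point concrete, here is a minimal sketch of how an SNMPv1 GetRequest for sysDescr.0 is assembled as nested type-length-value (TLV) fields, per RFC 1157 and BER. The helper names (`tlv`, `encode_oid`, `snmp_get_request`) are illustrative, and the sketch assumes short-form lengths and small OID arcs only:

```python
def tlv(tag, content):
    # Short-form BER length: one length byte, fine for packets < 128 bytes
    assert len(content) < 128
    return bytes([tag, len(content)]) + content

def encode_oid(arcs):
    # BER packs the first two arcs into one byte; remaining arcs are
    # assumed < 128 here so each fits in a single byte
    body = bytes([arcs[0] * 40 + arcs[1]]) + bytes(arcs[2:])
    return tlv(0x06, body)                                  # OBJECT IDENTIFIER

def snmp_get_request(community, oid_arcs, request_id=1):
    varbind = tlv(0x30, encode_oid(oid_arcs) + tlv(0x05, b""))  # OID + NULL value
    varbind_list = tlv(0x30, varbind)
    pdu = tlv(0xA0,                                         # GetRequest PDU tag
              tlv(0x02, bytes([request_id])) +              # request-id
              tlv(0x02, b"\x00") +                          # error-status = 0
              tlv(0x02, b"\x00") +                          # error-index = 0
              varbind_list)
    return tlv(0x30,                                        # outer SNMP message
               tlv(0x02, b"\x00") +                         # version = 0 (SNMPv1)
               tlv(0x04, community.encode()) +              # community string
               pdu)

pkt = snmp_get_request("public", [1, 3, 6, 1, 2, 1, 1, 1, 0])  # sysDescr.0
print(pkt.hex())
print(len(pkt))   # the whole request fits in 40 bytes
```

A production implementation (Net-SNMP, pysnmp) also handles multi-byte lengths, large OID arcs, and SNMPv2c/v3 framing; the point of the sketch is that every field sits at a predictable, tagged offset, so the receiver can decode the packet mechanically.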
One approach to avoiding this complexity is to use an unstructured or non-standard protocol. The problem is that this just pushes the complexity to the edges. The application or user has to take all the non-standard data and parse it into a structure that means something. The data then gets stored in an unstructured data store and indexed by that technology so that users can actually find and use it. And since the format isn't a recognized standard, someone could change it on a whim, putting you back at square one.
Nobody took the time to abstract SNMP’s operational complexity away because it's a complicated application to write. That was one of the reasons other protocols found a foothold; some people thought, “I don't want to keep training on this SNMP thing – it's too hard.”
However, SNMP is a very simple, lightweight message format that delivers a huge impact: just a few bytes on the network can carry a surprising amount of data.
As you can see, there’s more to this “simple” protocol than meets the eye. It delivers a light load on bandwidth in a structured, predictable format – with benefits you can only get from a recognized industry standard. Part 2 will dig into how SNMP’s stable standard, connectionless approach, and encryption capabilities deliver security with scalability.
The post SNMP: Why it’s Still the Best Monitoring Protocol pt. 1 appeared first on The OpenNMS Group, Inc..
]]>The post January 2024 Releases – Horizon 32.0.6, 2023.1.12, 2022.1.24, 2021.1.35 appeared first on The OpenNMS Group, Inc..
]]>In January, we released updates to all OpenNMS Meridian versions under active support, as well as Horizon 32.0.6.
Meridians 2023.1.12, 2022.1.24, and 2021.1.35 contain a bunch of small enhancements to aid in debugging, along with other minor improvements.
For a list of changes, see the release notes:
Release 32.0.6 contains a bunch of changes including UI cleanups, a number of access/role fixes, and other bug fixes, plus a ton of documentation updates and a new feature to show news and tips from an RSS feed in the UI.
For a high-level overview of what has changed in Horizon 32, see What’s New in OpenNMS Horizon 32.
For a complete list of changes, see the changelog.
The codename for Horizon 32.0.6 is Breakcore.
The post January 2024 Releases – Horizon 32.0.6, 2023.1.12, 2022.1.24, 2021.1.35 appeared first on The OpenNMS Group, Inc..
]]>The post From Pirate to Proven: 20 Years in Open-Source Network Management pt. 3 appeared first on The OpenNMS Group, Inc..
]]>Thinking back and looking forwards, what have we learned from the last twenty years of involvement in TM Forum programs and the push for open-source network management solutions?
As in the early 2000s, when the telecoms bust opened the door to the possibility of open-source solutions, today we are seeing renewed interest in open source as pressure on margins and competition from hyperscale cloud providers force service providers to look for more cost-effective solutions.
New technologies wax and wane in popularity. You can put your heart and soul into a project only to have it marginalized into irrelevance. Every interface initiative has eventually been eclipsed by a newer technology, so it is important to invest just enough to track changes but not bet the farm on a technology that might be replaced in a couple of years. Several companies learnt that lesson the hard way through the TIP program.
The trick, particularly for a small company, is to put in enough investment to signal interest and show thought leadership, but be prepared to move on quickly if the work isn't driving business. If customers do like the work, however, the early experiments can be productized quickly.
It appears that the latest interface program has learnt this lesson and has employed some developers to sustain the tools. A clear advantage is that the core open-source OpenAPI tools are maintained and used by a community to which the TM Forum contributes, rather than one it owns. The small TM Forum sustaining team allows occasional contributors (i.e., the majority of TM Forum members) to make their contributions, which are then merged and sustained in the wider code base. One can only hope that this will be a long-term commitment on the TM Forum's part.
A parallel observation is that the interface program may be trying to do too much for the sustaining efforts to be effective. The first interfaces from this program were largely handcrafted and not necessarily consistent in data types or interaction patterns. Later, a more consistent core entity model and newer HATEOAS-based design patterns were introduced, and plans are afoot to provide AsyncAPI-based implementations of some of these same interfaces.
Consequently, there are now over 50 interfaces, each of which needs ongoing sustaining effort to bring it to the same level of compliance and to fix bugs or add necessary enhancements. Furthermore, these interfaces were originally developed by different expert teams that have since disbanded, and the documentation on the use cases for each interface can be patchy.
All of this potential churn adds uncertainty to adoption. Do I use the currently published version or wait for the latest tooling to be released? Do I treat these interfaces as 'good starting points' for my architecture or as a mandatory set of compliance requirements?
Other advanced initiatives such as the TM Forum canvas are complex and also need reference implementations which are widely visible to drive adoption. It would be very beneficial if the canvas program could contribute upstream to make the canvas a component which is sustained in the Kubernetes ecosystem.
This leads to my final point. Technology creators need to promote the network effects of adoption: the more people who use a technology, the more valuable it becomes.
In particular, a potential adopter must be able to evaluate a new technology quickly and determine how much effort will be required to learn to use it properly. Insufficient documentation sends the message that the technology is either too difficult or that its creator does not support it on an ongoing basis.
While much of the tooling and many of the interfaces are now published on GitHub, the documentation and team communications are still restricted to TM Forum members. I believe this significantly hampers the reach and uptake of the interface work and also prevents contributions from experts outside of the TM Forum community.
The interfaces could have real value for the wider IT community. A more serious commitment to promoting the interfaces through open source would help.
I’ve found that TM Forum has struck the difficult balance between being both a member trade association and a standards creator. It has managed to remain relevant to its core membership while delivering some very useful technology innovations to the wider industry.
I think the direction is positive, and I hope that OpenNMS will be able to continue making tangible contributions towards API development and adoption in the coming years as open source continues to prove itself a viable foundation for network management standards.
The post From Pirate to Proven: 20 Years in Open-Source Network Management pt. 3 appeared first on The OpenNMS Group, Inc..
]]>The post From Pirate to Proven: 20 Years in Open-Source Network Management pt. 2 appeared first on The OpenNMS Group, Inc..
]]>Around 2008, the TM Forum set out to create a more implementation-neutral interface technology based on WSDL (Web Services Description Language). The new TM Forum Interface Program was led by Steve Fratini from Telcordia and (unimaginatively) referred to as TIP. The objective of the program was to use model-driven engineering to generate WSDL interfaces and PDF documentation directly from the TM Forum SID.
By now, open-source network management was somewhat on the TM Forum's agenda. The TIP tooling, called the Joint Open-Source Interface Framework (JOSIF), was to be completely open source and was based on Eclipse Tigerstripe, a tool originally developed to help create OSS/J interfaces. However, although the tooling was open source, the resulting interface artifacts were mostly still restricted to TM Forum members.
After I finished my doctorate, OpenNMS agreed to sponsor my work with the TIP program. The hope was that OpenNMS could contribute open-source expertise and ultimately implement these interfaces on top of OpenNMS thus exhibiting thought leadership in the TM Forum.
I participated heavily in this program on behalf of OpenNMS for about three years, and OpenNMS also took part in a number of catalysts demonstrating integration with other vendors using the TIP interfaces.
The TIP interfaces are still available, but like OSS/J, TIP was superseded by later developments. The TIP program ultimately closed because of three factors.
Firstly, it targeted WSDL just as the industry began to favour REST interfaces using JSON.
Secondly, with hindsight it was a mistake to try to use the TM Forum Information Framework (SID) UML model as the starting point for interface definition. Using the SID made the tooling complicated to learn and use. We came to realize that the SID, as an information model, was not optimized for interface definition. OSS/J had avoided this trap by creating an intermediate data model, but TIP mistakenly tried to tie interface definitions directly back to the SID UML.
Thirdly, and perhaps most importantly, the cost of maintaining the open-source TIP tooling was difficult to sustain within the small TM Forum community. Although it was actively promoted to other standards organizations such as the ITU-T, the tooling never reached critical-mass adoption either within or outside the TM Forum.
After their initial enthusiasm, various contributors began to rein back their involvement, and sustaining development of the capability fell on fewer and fewer members until eventually the program could not continue.
As TIP became a legacy project, another interface program began to emerge under the leadership of Pierre Gautier. The wider IT industry was becoming aware of a new technology-neutral interface definition language called Swagger (now OpenAPI) that had a set of code generators which could produce reference code and documentation. OpenAPI was seeing wide adoption, and the TM Forum decided to start a program to create a set of REST/JSON interfaces using OpenAPI.
The big difference was that the OpenAPI tooling was commercially supported (by SmartBear) and already had a wide following outside the telecoms industry. The TM Forum additionally agreed to fund several developers to contribute to the tooling and adapt it to TM Forum needs.
This reflected the lessons learnt in supporting the TIP tooling, where all of the resources came from individual TM Forum members. Someone needs to centrally own the sustaining of a project; it can't fall on individual contributors. Having spent a lot of effort on the previous TIP incarnation, which saw limited commercial adoption, OpenNMS has been more cautiously involved in the newer interface program.
The post From Pirate to Proven: 20 Years in Open-Source Network Management pt. 2 appeared first on The OpenNMS Group, Inc..
]]>