The Quandary of Event Notification: SNMP traps

Newcomers to IT network operations are often confounded by the complexity of managing, controlling, monitoring, diagnosing and repairing their networks. We are all inclined to reach for the simplest tool, and to a networking newcomer the simplest tool for staying on top of network operations would be an event report.

Devices on the network implement SNMP, so it would seem to make sense for event reports to originate through the SNMP mechanism for generating them: historically, the SNMP trap. (See the Note at the end of this post.)

The Newcomer Expectation

A device sends an SNMP trap.  The management application receives the trap.  The human manager is alerted to the problem and fixes the problem to keep the network running smoothly. 

The Reality of SNMP Traps

The key issue is "event reaction latency": the time between an event occurring and the manager learning about it. The longer it takes to find out about an event, the later the response, and the less useful the event reporting mechanism.

Unfortunately, the management app may never learn about the event at all. If the network itself experiences a communication failure, the report has no path to the manager. For example, suppose a server is failing because its physical connection to the network is broken. The server may generate an SNMP trap, but with the connection broken the trap cannot be delivered to the management app. This is the worst-case situation.

The alternative is polling. In polling, the management app periodically checks the status of the managed devices on the network. If a device does not respond, the management app concludes that an event has occurred and indicates this status to the human manager. There are several schools of thought on how often to poll, how to limit network resource consumption during polling, and so on.
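The polling idea above can be sketched in a few lines of Python. This is only an illustration, not a real SNMP client: probe is a hypothetical callable standing in for whatever status check the management app performs (for example, an SNMP GET with a timeout), and max_misses is an assumed tuning knob for tolerating an occasional lost reply.

```python
from typing import Callable, Dict, Iterable, List


def poll_once(
    devices: Iterable[str],
    probe: Callable[[str], bool],
    misses: Dict[str, int],
    max_misses: int = 1,
) -> List[str]:
    """Probe every device once and return those considered down.

    probe(device) is a hypothetical status check that returns True when
    the device answered in time. misses tracks consecutive unanswered
    polls per device and is updated in place between polling rounds.
    """
    down = []
    for device in devices:
        if probe(device):
            misses[device] = 0  # device answered, reset its counter
        else:
            misses[device] = misses.get(device, 0) + 1
            if misses[device] >= max_misses:
                down.append(device)
    return down
```

With max_misses left at 1, a single missed reply is reported as an event, which matches the description above. Raising it trades a longer reaction time for fewer false alarms, and the poll interval sets an upper bound on how stale the manager's view can be.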

Recommendation

Because SNMP typically runs over the User Datagram Protocol (UDP), an unreliable transport protocol, and because traps are not acknowledged by the receiver, one should not rely solely on SNMP traps to know the status of network operations, especially in mission-critical environments.

In order to be fully aware of the network conditions, the manager should also periodically poll all the managed nodes in the network to check their status.  In other words, the recommended approach is to combine notification monitoring with periodic polling.
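As a rough sketch of the combined approach, the class below merges both sources of information into one view. Everything here is hypothetical and for illustration only: StatusBoard, its method names and the device names are made up, and a real management app would feed on_trap from its notification receiver and on_poll from its polling loop.

```python
from dataclasses import dataclass, field
from typing import Dict, List

NO_RESPONSE = "poll: no response"


@dataclass
class StatusBoard:
    """Hypothetical merged view of trap-reported and poll-detected problems."""

    problems: Dict[str, str] = field(default_factory=dict)

    def on_trap(self, device: str, event: str) -> None:
        # A trap arrived: record the event right away (low latency).
        self.problems[device] = f"trap: {event}"

    def on_poll(self, device: str, responded: bool) -> None:
        if not responded:
            # Catches failures whose trap could never be delivered.
            self.problems[device] = NO_RESPONSE
        elif self.problems.get(device) == NO_RESPONSE:
            # The device answers again, so clear the poll-detected outage.
            del self.problems[device]

    def alerts(self) -> List[str]:
        return sorted(f"{d} ({why})" for d, why in self.problems.items())


board = StatusBoard()
board.on_trap("switch-1", "linkDown")  # trap got through
board.on_poll("server-7", False)       # cable unplugged, no trap arrived
print(board.alerts())
```

The trap path gives fast notice when the network is healthy enough to carry it, while the poll path covers the broken-connection case described earlier. In this sketch a problem reported by a trap stays on the board until someone clears it, which is a design assumption, not a rule.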

Additional Resources

Book: Understanding SNMP MIBs by David Perkins and Evan McGinnis

SilverCreek SNMP Test Suite provides comprehensive, automated testing for all SNMPv1, v2c, and v3 Notifications, including SNMP traps and informs.

Note:

In 2018, most devices should be equipped with SNMPv3 and its event reporting mechanisms, which include authentication of the Notification sender. The key SNMP Notification types are:

• SNMP Traps (unacknowledged event reports)
• SNMP Informs (acknowledged event reports)
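
The difference between the two Notification types can be illustrated with a toy model. This is a simplified sketch, not SNMP code: transmit is a hypothetical callable that sends one message and returns True only if an acknowledgement came back, and max_attempts is an assumed retry limit.

```python
from typing import Callable


def send_trap(transmit: Callable[[], bool]) -> None:
    """Unacknowledged: send once and never learn whether it arrived."""
    transmit()  # any result is ignored, the sender cannot tell


def send_inform(transmit: Callable[[], bool], max_attempts: int = 3) -> bool:
    """Acknowledged: resend until an acknowledgement arrives or attempts run out."""
    for _ in range(max_attempts):
        if transmit():
            return True  # the receiver confirmed delivery
    return False  # the sender at least knows delivery failed
```

An inform still cannot cross a broken connection, but the sender knows the report was not delivered, so polling remains a useful complement.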
