Determining Performance Data Retention Intervals

The snmpCollect daemon will happily save SNMP historical data into its database until the disk has no free space left. Unpredictable things happen to mission-critical systems when their disks fill up, so you must decide how long you want online SNMP data to be available.

If you want to support users performing ad-hoc SNMP data collections, then you need to provide the necessary disk space for them. Users are entrusted to behave responsibly. The golden rules of ad-hoc data collection are:

  • massive data collections are generally inappropriate

  • very long-term data collections should be modest

  • delete the data collection when the study is over

  • limit rapid polling studies to the session interval

Many NNM systems are configured to collect basic SNMP data on all devices in the management domain as a general service to the user community. By default, an out-of-the-box (OOTB) NNM system has all defined data collections suspended, so no SNMP data is available without intervention by the NNM system administrator. NNM offers ovbackup and the data warehouse feature to back up or trim the historical SNMP data.

Trimming the SNMP historical data is necessary because it will otherwise eventually fill the disk volume. A bulging database also slows the xnmgraph data display tool to a crawl as it searches for data to plot. Troubleshooters require only recent SNMP data for their tasks, and the long-term performance data used by the network engineering staff can be culled from the NNM backup data just as easily.

To perform SNMP data trimming, HP provides a sample UNIX script in the manpage for the snmpColDump application (see Figure 9-4). Most NNM administrators will modify this script to suit their local needs and create a UNIX cron job to run it periodically, say, hourly.

Figure 9-4. A sample SNMP data trimmer script.

This small shell script from the snmpColDump manpage trims the data in the 1MinLoadAvg file to 2000 entries. 1MinLoadAvg is a UNIX SNMP variable that reports the average number of processes in the run queue over a one-minute interval. Customize the script so that it runs against every SNMP data file on the system (a wrapper sketch follows the script). Assuming that the NNM system administrator understands the Nyquist Sampling Theorem, the samples in 1MinLoadAvg are probably taken every 30 seconds, so retaining 2000 data samples corresponds to 1000 minutes (about 16.7 hours) of data.

# To keep only the last 2000 entries in file 1MinLoadAvg
lineno=`snmpColDump 1MinLoadAvg | wc -l`
if [ $lineno -gt 2000 ]; then
    # Start line that leaves exactly the last 2000 entries in place.
    lineno=`expr $lineno - 1999`
else
    lineno=1
fi
snmpColDump -tTI 1MinLoadAvg |
    sed -n "$lineno"',$p' |
    awk -F'\t' '{printf("%d\t%d\t%s\t%lg\n", $4, $5, $6, $3)}' |
    snmpColDump -r - 1MinLoadAvg
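
The script in Figure 9-4 handles a single data file. A minimal wrapper, sketched here under two assumptions that are not part of NNM itself (the collection files live in the default /var/opt/OV/share/databases/snmpCollect directory, and the Figure 9-4 script has been saved as /opt/OV/local/bin/trimfile with its hard-coded file name replaced by the $1 argument), loops over every data file while skipping the companion descriptor files whose names end in an exclamation point:

#!/bin/sh
# Trim every SNMP collection file in turn (paths are illustrative).
DB=/var/opt/OV/share/databases/snmpCollect

for f in $DB/*
do
    case $f in
    *!) continue ;;                  # skip the "!" descriptor files
    esac
    [ -f "$f" ] && /opt/OV/local/bin/trimfile "$f"
done

Saved as, say, /opt/OV/local/bin/trimall, the wrapper can then be driven hourly from cron:

0 * * * * /opt/OV/local/bin/trimall >/dev/null 2>&1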

To speed up the data trimming process on a multiprocessor system, you can launch parallel trimming scripts, with each one assigned to an independent portion of the SNMP database. You’ll notice a dramatic speed improvement. An alternative to using cron to launch the data trimmer is to configure ITO to monitor the size of the database. ITO can automatically execute the trimmer script when necessary.
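
A parallel version is easy to sketch along the same lines. The alphabetical split below is an illustration, not an NNM feature; each subshell trims one slice of the database in the background, and wait holds the parent until all of them finish:

#!/bin/sh
# Parallel trimming sketch: one background worker per slice of the
# database directory (paths and the split are illustrative).
DB=/var/opt/OV/share/databases/snmpCollect

for range in '[a-h]' '[i-p]' '[q-z]' '[A-Z0-9]'
do
    (
        for f in $DB/$range*
        do
            case $f in *!) continue ;; esac
            [ -f "$f" ] && /opt/OV/local/bin/trimfile "$f"
        done
    ) &
done
wait        # return only after every worker completes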

Let’s review some reasons for retaining SNMP historical data online, assuming the issues above can be mitigated. Perhaps you add more RAID disk stripes, a second SCSI controller, and another CPU to boost performance. Perhaps you modify the script in Figure 9-4 to resample the data, averaging older five-minute samples into one-hour samples and thereby reducing the data volume by a factor of 12 (a sketch of this step follows the list below). You then benefit from having enough online SNMP historical data to cover the following important periods of any business:

  • busiest hour of the day

  • busiest day of the week

  • busiest day of the month

  • busiest day of the quarter

  • busiest day of the year

  • busiest day of a special event
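
The resampling step mentioned above can be sketched with awk. The column layout is an assumption inferred from the Figure 9-4 script, where column 3 carries the sample value and column 4 appears to be an epoch start time; verify both against your snmpColDump output before relying on the result:

# Average samples into one-hour buckets (a sketch; column layout
# is assumed: $4 = epoch start time, $3 = sample value).
snmpColDump -tTI 1MinLoadAvg |
    awk -F'\t' '{
        hour = int($4 / 3600) * 3600   # start of the enclosing hour
        sum[hour] += $3; n[hour]++
    }
    END {
        for (h in sum)
            printf("%d\t%g\n", h, sum[h] / n[h])
    }' | sort -n > 1MinLoadAvg.hourly

The hourly averages land in a flat file; loading them back into the snmpCollect database with snmpColDump -r would require reconstructing the full field layout, which this sketch leaves out.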

Troubleshooters can check back to see whether the utilization they observe now is comparable with that seen at a similar time in the past. For example, historical performance data can show that high network utilization in a sales office at the end of the month is actually normal, just as it is at the end of a fiscal quarter.

A final note about long-term SNMP data retention deals with the cost of disk drives. At this writing, an 18-gigabyte internal SCSI disk drive costs around $600, so an 18-gigabyte dual-mirrored, triple-striped disk array (six drives) can be built for $3600 plus change. Obviously, you need to choose a computer platform that can accommodate these disks internally or externally, and this increases the price accordingly. But these figures are not outlandish; in fact, for a mission-critical NNM system they are more than acceptable.
