Preface

The server guys just came in the room and said it's your problem: the network is slow. People have been calling the help desk since early this morning (it's now lunch) complaining that the sales order application has been extremely slow or unreachable. You looked at your Network Management Station earlier on and noticed nothing particularly wrong. Yet, the server guys claim it's the network, not their servers.

Also seated in the room are your boss, her boss, and the boss's boss. Orders aren't being placed and everyone now seems to think the fault lies in the network; it's your problem now.

This is a common situation for people who design and manage networks. When something goes wrong, no matter what it is, the first reaction is to blame the network. Maybe the network is to blame for the problem, maybe it is not. To know for sure, you need to have implemented some level of performance and fault management that helps you isolate the cause of a particular performance-related problem.

The Meaning of Performance and Fault Management

Performance and fault management encompasses a wide array of tools and topics.

Traditionally, network management tools provided logs of network systems messages (traps, syslog) and colored statuses of each device and interface (green means up, red means down). While these systems were valuable for network troubleshooting, they did not report on or inform an engineer of the health of a network system until it actually went down.

Understanding the activities of a router or switch requires more than knowing whether the device is up or down. You probably want to know some of the following types of information:

  • How much traffic is passing through the interfaces? Is it too much? How much is too much?

  • Is the device CPU busy? How busy? How much is too much?

  • Is the device running out of memory?

Aside from general device health, you probably also want to understand how characteristics of the network devices affect the reliability of the network. For instance, are too many people trying to dial into your ISDN router? Are too many collisions occurring on the Ethernet segments?

How much traffic is too much? How much is normal for the network? These are essential questions that network engineers and managers ask. Questions about how much is too much regularly come into the Cisco TAC, and the answer, much to customers' dismay, is “it depends.”

Performance and fault management encompasses issues such as the following:

  • You need to implement performance and fault management, but don't know where to begin.

  • You find people are either ignoring the current network management tools or that the tools are providing useless information.

  • Problem resolution times are considerably longer than they should be.

  • Cisco publishes bunches of MIBs. Where do you begin?

  • You need to select network management tools but aren't quite sure how to do so.

  • You've been told to manage the Frame Relay connections for your 17,000-site network, but you aren't sure what characteristics to look at to determine whether the connections are working correctly.

  • You want to make sure the network devices are healthy, but don't know what to measure or what is considered acceptable in making a determination.

  • You don't know what traps are available or how to configure your NMS trap receiver to print the trap information in a human-readable format.

This book addresses these issues. In addition, it teaches you how to navigate Cisco's documentation, MIBs, and management in order to keep up with the constantly changing pace of managing Cisco devices.

Approach and Objectives

This is not your typical network management book. Most network management books on the market provide an exhaustive explanation of SNMP and its protocols and reprint publicly available MIB documents. While the information is helpful for MIB developers and to some extent practitioners, these types of books tend not to be useful for someone who needs to get network management up and running quickly and in a useful form.

Our experience, both from cases we received in the Cisco TAC and from consulting with customers, has revealed that these books are useful in academic and engineering settings. However, they provide little practical value for those who have to implement the concepts in their networks. The concepts are too general, and the authors typically do not offer experience-based recommendations.

Performance and Fault Management departs from the pattern of generic SNMP reprint books. It is a detailed primer on setting up and reporting fault and performance management and a reference that helps Cisco customers cut through the complex management of Cisco devices. Each chapter provides explanations and data only for those things that we feel are important based on our experiences in implementing network management and our knowledge of Cisco devices.

The objectives of this book are to

  • Provide an overview of router and LAN switch operations in order to help the reader better understand how to manage the devices

  • Provide guidance on the essential MIBs, traps, syslog messages, and show commands critical for managing Cisco routers and LAN switches, including undocumented IOS commands

  • Describe techniques for implementing fault and performance management based on the authors' experiences

  • Help Cisco customers understand and navigate the many MIBs and management interfaces of Cisco devices

After completing this book, you will be able to

  • Design and implement fault and performance monitoring that will measure and report the effectiveness of your Cisco network investment.

  • Generate reports and alerts that report network information and status to management and operations staff.

  • Navigate Cisco's documentation and MIBs in order to determine what elements to manage for a given technology.

Audience

Whether you are new to Cisco equipment or are a seasoned Cisco router jockey, this book will help you develop an effective network management strategy. It will also help you identify key MIB values, SNMP traps, syslog messages, and show commands that will assist you with analyzing faults and performance.

This book is intended primarily for network engineers and network management engineers who are responsible for the operation and timely resolution of problems in their corporate network. While this book applies generally to the management of Cisco networks, it was specifically written with medium to large enterprise networks (more than a hundred Cisco devices) in mind.

Specifically, the following people will benefit from this book:

  • Network operations managers can develop an understanding of the process of crafting an effective network management strategy.

  • Network engineers can obtain a detailed understanding of device and technology operations from a management perspective.

  • Network management teams can use this book as a reference when crafting a network management strategy.

  • New Cisco customers will learn about the management capabilities of Cisco devices and have a reference for quickly establishing management of the devices.

  • Customers of other network device vendors can learn how to implement effective performance and fault management. Although some of the MIB references are Cisco specific, the concepts apply to all types of network routers and switches.

  • Network equipment sales engineers, consultants, and developers of commercial network management software will find this book useful for identifying characteristics that must be monitored by or for their products.

The book assumes a working knowledge of network management and thus does not discuss basic network management methodologies. The references at the end of each chapter refer to useful publications that will assist readers who want to learn more about prerequisite topics.

Organization

The book is divided into four parts.

Part I: Foundations, Approaches, and Tools

Part I describes how to learn, document, and implement network management on your network. It is mainly for people who are getting started with their network management strategy and would like guidelines on how to assess the current state of the network and its management. It then explains how network managers can effectively implement monitoring and reporting to assist their teams.

The approach we take with Part I is to assume that you have inherited the network, its processes, and its policies in a certain state. You must first learn and document the network in its current state, and then you can investigate the policies and procedures.

This part of the book also helps existing network management teams determine how they can work more efficiently with the processes and tools they already have. Many engineers simply throw tools at the problem of network management, only to find that the tools alone do not improve their organization's ability to understand network performance and react to network faults.

Please note that some of the techniques and approaches recommended in Part I are ideals that network managers should aim for but that may be difficult to attain. The difficulties arise because some of the concepts require customization with different tools and because some are resource-intensive to implement. A discussion of customization is beyond the scope of this book. However, we've provided sufficient detail for you to learn the necessary concepts and to work with your network management tool vendors or an experienced network management implementor.

Part II: Managing Devices and Technologies

Part II is for all audiences and provides concise, detailed information on managing routers and switches. It includes coverage of device and system management, LAN management, and WAN management.

Each chapter provides information on how each of the technologies works, detailed explanations of what the authors consider the most important aspects to manage, and helpful references to other data that may be useful to network management.

The authors feel that the objectives can be accomplished through conversational instruction using the following method:

  1. Provide brief explanations of each of the manageable technologies. How can you determine what to monitor and set thresholds against if you do not understand the technologies you are trying to manage?

  2. Recommend the top data variables to manage for a technology, providing all of the information that a network engineer will find useful when setting up fault and performance management. No ad nauseam reprints here; we provide you with the top items to watch for and point you to the MIBs posted on the web for more information.

Although some background and context is provided about devices and technologies, it is not the goal of these chapters to provide exhaustive architectural details. Rather, each chapter provides enough example/context information to support the discussion of SNMP MIB objects, SNMP traps, show commands, and syslog messages.

Throughout Part II there are many output Examples from show commands. The relevant lines of output have been annotated and cross-referenced to explanations that follow the Example. This includes the association of relevant SNMP MIB objects to show command output.

Part III: Optimal Management

With this part of the book, you will learn how to optimally configure your Cisco devices for management. Part III also provides a concise listing of frequently asked questions (and answers) that are addressed by this book.

  • Chapter 18, “Best Practices for Device Configuration,” explains how to enable network management capabilities in Cisco devices through the configuration of telnet access, loopback interfaces, NTP, SNMP, RMON, and syslog logging. For each of these technologies, real-life configuration examples are given, along with explanations of how the techniques will help you manage your Cisco devices more effectively.

  • Chapter 19, “Frequently Asked Questions,” provides a list of questions and answers to the most commonly asked performance and fault management questions that are addressed by this book. All question/answer pairs refer back to the chapters and sections that provide more details on the topic. The inside front cover of the book provides a list of the questions addressed in this chapter.

Part IV: Appendixes

The appendixes provide additional information not found in the rest of the book, including

  • Detailed information on how to find, understand, and select from among the Cisco MIBs and traps published on Cisco's website.

  • Details on how to decode an ATM accounting file.

How to Use This Book

This book has been written to provide both instructional and reference information for the performance and fault management of Cisco devices.

Depending on your level of expertise with the management of Cisco devices, we recommend the following reading order:

  • Look at the inside front cover for a summary of issues addressed by the book, and Chapter 19 for quick answers to frequently asked questions.

  • If you are relatively new to network management or inexperienced with Cisco devices, you should begin with Part I, which explains general techniques for auditing your existing network and implementing performance and fault management. Then, read Part III, which explains how to configure management on Cisco devices. Finally, look through Part II at the chapters that discuss the technologies found in your network.

  • If you have expert-level knowledge of Cisco devices, read Chapters 4–7 to ensure that you are familiar with general fault and performance management techniques. Then read Chapter 18 to ensure your understanding of management configuration on Cisco devices. Finally, use Part II as a reference when determining which MIBs, traps, syslog messages, and show commands to use for your network.

  • Sales engineers, consultants, and developers should read Chapter 18 to understand how to configure Cisco devices and then use Part II as a reference for which MIBs, traps, and show commands to use when managing fault and performance.
