Chapter 4. Planning Your Operations Manager Deployment

In This Chapter

  • Assessment

  • Design

  • Planning

  • Proof of Concept

  • Pilot

  • Implementation

  • Maintenance

  • Sample Designs

As you have read in earlier chapters of this book, Operations Manager (OpsMgr) 2007 is definitely a new product compared to Microsoft Operations Manager (MOM) 2005. However, one thing that has not changed is the requirement for effectively planning your deployment before rushing the product into production.

Proper planning of your OpsMgr deployment is key to its success. As with any other application or technical product, the time spent preparing for your installation is often more important than the time spent actually deploying the product! Most technical product deployments that fail do so because of ineffective planning.

Too often for Information Technology (IT) organizations, the thought process is to deploy new products as quickly as possible, but this approach frequently results in not gaining the full benefit from those tools. We refer to this as the RSA approach: Ready, Shoot, Aim. Using RSA results in technology deployments that are not architected correctly and then require changes, or potentially a complete redeployment, to resolve issues identified after the fact. Our recommended technique is RAS:

  • Ready—Are you ready to deploy Operations Manager? Assess your environment to better understand it and where OpsMgr is required.

  • Aim—What is the target you are trying to hit? Based on your assessment, create a design and execute both a proof of concept and a pilot project.

  • Shoot—Implement the solution you designed!

Creating a single high-level planning document is essential because Operations Manager affects all IT operations throughout the enterprise. Many projects fail due to missed expectations, finger pointing, or the “not invented here” syndrome, which occurs when some IT staff members have stakes in preexisting or competitive solutions. You can avoid these types of problems ahead of time by developing a comprehensive plan and getting the backing from the appropriate sponsors within your organization.

A properly planned environment helps answer questions such as how many OpsMgr management servers you will need, how many management groups to use, which management packs to deploy, and so forth.

Proper technical deployments require a disciplined approach to implementing the solution. Microsoft’s recommended approach for IT deployments is the Microsoft Solutions Framework (MSF). MSF consists of four stages:

  • Envisioning—Development of the vision and scope information.

  • Planning—Development of the design specification and master project plan.

  • Developing—Development and optimization of the solution.

  • Deployment & Stabilization—Product rollout and completion.

Additional information on MSF is available at http://www.microsoft.com/technet/solutionaccelerators/msf. For purposes of deploying Operations Manager 2007, we have adapted MSF to a slightly different format. The stages of deployment we recommend and discuss in this chapter include

  • Assessment—Similar to the Envisioning stage.

  • Design & Planning—Roughly correlates to the Planning & Developing stages of MSF.

  • Proof of Concept, Pilot, Implementation, Maintenance—These stages follow steps similar to the Deployment & Stabilization stages.

The specific stages we will discuss in this chapter include Assessment, Design, Planning, Proof of Concept, Pilot, Implementation, and Maintenance.

Note: Using Both MSF and MOF

Chapter 1, “Operations Management Basics,” discussed Operations Manager 2007’s alignment with the Microsoft Operations Framework (MOF) as an operational framework. MOF is an approach to operations and monitoring; it is independent of the deployment methodology, such as MSF, that you use when planning an OpsMgr deployment.

Assessment

The first step in designing and deploying an Operations Manager 2007 solution is to understand the current environment. A detailed assessment gathers information from a variety of sources, resulting in a document that you can easily review and update. This process ensures there is a complete understanding of the existing environment. Although the concept of an assessment document is very common within consulting organizations as part of the process of implementing new technologies, we recommend also using this approach for projects internal to an organization.

The principle underlying the importance of assessments is summed up well by Stephen R. Covey’s Habit #5, “Seek first to understand, then to be understood,” in The 7 Habits of Highly Effective People (Simon & Schuster, 1989). From the perspective of an Operations Manager assessment, this means you should fully understand the environment before designing a solution to monitor it.

A variety of sources is used to gather information for an assessment document, including the following:

  • Current monitoring solutions—These products may include server, network, or hardware monitoring products (including earlier versions of Operations Manager such as MOM 2005 and MOM 2000). You will want to gather information regarding the following areas:

    • What products are monitoring the production environment

    • What servers the product is running on

    • What devices and applications are being monitored

    • Who the users of the product are

    • What the product is doing well

    • What the product is currently not doing well

    Options for existing monitoring solutions include integration with Operations Manager, replacement by Operations Manager, or remaining in place unaffected by Operations Manager.

    Understanding any current monitoring solutions and what functionality they provide is critical to developing a solid understanding of the environment itself.

  • MOM 2005 upgrade/replacement—This is applicable in environments currently using MOM 2005. An in-depth analysis of the functionality MOM 2005 provides is critical to producing an Operations Manager 2007 design that can effectively replace and enhance that functionality.

  • Current Service Level Agreements—A Service Level Agreement (SLA) is a formal written agreement that defines the level of support expected (in this case, the SLA is from the IT organization to the business itself). For example, the SLA for web servers may be 99.9% uptime during business hours. Some organizations have official SLAs, whereas others have unofficial SLAs.

    Unofficial SLAs are actually the most common type in the industry. An example of an unofficial SLA might be that email cannot go offline during business hours at all. You should document both the existence and nonexistence of SLAs as part of your assessment to take full advantage of Operations Manager’s capability to increase system uptime.

  • Administrative model—Organizations are centralized, decentralized, or a combination of the two. The current administrative model, and any plans to change it, help determine where OpsMgr server components may best be located within the organization.

  • Integration—You can integrate Operations Manager 2007 with a variety of solutions, including help desk/problem management solutions and existing monitoring solutions. Some of the available connectors provide integration from OpsMgr to products such as BMC Remedy, HP OpenView, and Tivoli TEC. Chapter 22, “Interoperability,” discusses the various connectors that are available for Operations Manager 2007.

    If OpsMgr needs to integrate with other existing solutions, you should gather details including the name, version, and type of integration required.

  • Service dependencies—New functionality provided within Operations Manager 2007 makes it even more important to be aware of any services on which OpsMgr may have dependencies. These include but are not limited to local area network (LAN)/wide area network (WAN) connections and speeds, routers, switches, firewalls, Domain Name System (DNS), Active Directory, instant messaging, and Exchange. A solid understanding of these services and the ability to document them will improve the design and planning for your Operations Manager 2007 deployment.

  • Functionality requirements—You will also use the assessment to gather information specific to the functionality required by an OpsMgr environment. You will want to determine what servers Operations Manager will monitor (what domain or workgroup they are in), what applications on these servers need to be monitored, and how long to retain alerts.

    While gathering information for your functionality requirements, concentrate on the applications OpsMgr will be managing. As we discussed in Chapter 2, “What’s New,” OpsMgr’s focus centers on modeling applications and identifying their dependencies. Identifying these applications and mapping out their dependencies will provide information important to your Operations Manager 2007 design.

  • Business and technical requirements—What technical and financial benefits does OpsMgr need to bring to the organization? The business and technical requirements you gather are critical because they will determine the design you create. For example, if high server availability is a central requirement, this will significantly impact your Operations Manager design (which we discuss in Chapter 5, “Planning Complex Configurations”). It is also important to determine what optional components are required for your Operations Manager monitoring, audit collection, and reporting environments. Identify, prioritize, and document your requirements; you can then discuss and revise them until they are final.

Gather the information you collect into a single document, called an assessment document. Have this document reviewed and discussed by the personnel within your organization best able to validate that the information it contains is correct and comprehensive. These reviews often result in, and generally should result in, revisions to the document; do not expect that a centrally written document will get everything right from the get-go. Examine the content of the document, particularly the business and technical requirements, to validate that they are correct and properly prioritized. After reaching agreement on the document content, move the project to the next step: designing your OpsMgr solution.

Design

The assessment document you created in the previous stage now provides the required information to design your new Operations Manager 2007 environment. As with other technology projects, a best practice approach is to keep the design simple and as straightforward as possible.

Do not add complexity just because the solution is cool; add complexity to the design only to meet an important business requirement! For example, do not create a SQL cluster for OpsMgr reporting functionality unless there is a business requirement for high availability for OpsMgr Reporting. Business requirements are critical because they drive your Operations Manager design. Whenever there is a question about how to design your environment, rely on your business requirements to determine the correct answer.

The starting point for designing an Operations Manager environment is the management group.

Management Groups

As introduced earlier in this book, an Operations Manager management group consists of the Operations database, Root Management Server (RMS), Operations Manager consoles (Operations, Web, Authoring), optional components (additional management servers, reporting servers, data warehouse servers, Audit Collection Services, database servers, gateway servers), and up to 5000 managed computers. Start with one management group and add more only when necessary. In most cases, a single management group is the simplest configuration to implement, support, and maintain.

Note: The OpsMgr Authoring Console

The Authoring console will be available with Operations Manager 2007 Service Pack (SP) 1. We will discuss it in more detail in Chapter 23, “Developing Management Packs and Reports.”

Exceeding Management Group Support Limits

One reason to add a management group is the need to monitor more than the 5000 managed computers supported in a single management group. The type of servers monitored directly affects the 5000-managed-computer limit; there is nothing magical or hard-coded about this number. For example, monitoring many Exchange back-end servers has a far more dramatic impact on a management server’s performance than monitoring the same number of Windows XP or Vista workstations.

If the load on the management servers is excessive (servers are reporting excessive OpsMgr queue errors or high CPU, memory, disk, or network utilization), consider adding another management group to split the load.

Separating Administrative Control

Another common reason for establishing multiple management groups is separating control of computer systems between multiple support teams. In MOM 2005, this was often the rationale used to split the security monitoring functionality from the application/operating system monitoring functionality. For most organizations, the new Audit Collection Services (ACS) functionality introduced in Chapter 2 should remove the need to split out security events into a separate management group.

However, let’s look at an example where the Application support team is responsible for all application servers and the Web Technologies team is responsible for all web servers, with each group configuring the management packs that apply to the servers it supports.

With a single Operations Manager 2007 management group, each group may be configuring the same management packs. In our scenario, the Application support team and Web Technologies team are both responsible for supporting Internet Information Services (IIS) web servers. If these servers are within the same management group, the rules in the management packs are applicable to each of the two support groups. If either team changes the rules within the IIS management pack, it may impact the functionality required by the other team.

Although there are ways to minimize this impact using techniques such as overrides, in some situations you will want to implement multiple management groups. This typically occurs when multiple support groups are supporting the same management packs. In a multiple management group solution, each set of servers has its own management group and can have the rules customized as required for the particular support organization.

Security Model

Historically, multiple management groups were required due to limitations with the MOM security model. MOM 2005 provided very limited granularity in what level of security was available to users who had permission to work within the Administrator console. If a user had access to update rules in the Administrator console, she could not be restricted to specific rule groups or specific functions. OpsMgr 2007’s role-based user security removes the need to create multiple management groups to provide this level of security.

A role in Operations Manager 2007 consists of a profile and a scope. A profile, such as Operations Manager Administrator or Operations Manager Operator, defines the actions a user can perform; the scope defines the objects against which the user can perform those actions. In short, roles limit the access users have.

Although previous versions of Operations Manager often required multiple management groups to prevent different groups from performing actions outside their areas of responsibility, using roles limits who is able to monitor and respond to alerts generated by different management packs—and can eliminate the need to partition management groups for security purposes. (We discuss security in more detail in Chapter 11, “Securing Operations Manager 2007.”)

As an example, a MOM 2005 organization with one support group for operating systems and a second for applications would require multiple management groups so that each group could maintain and customize its own rules without affecting the other. This is not necessary in OpsMgr 2007 because you can define different user roles for the personnel within each organization.

Dedicated ACS Management Group

Although ACS does not require a separate management group to function, splitting this functionality into its own management group may be necessary to meet your company’s security requirements; for example, a company mandate may state that the ACS functionality be administered by a separate group of individuals. For more information on ACS, see Chapter 15, “Monitoring Audit Collection Services.”

Production and Test Environments

We recommend creating a separate test environment for Operations Manager so you can test and tune management packs before deploying them into production. This approach minimizes changes to production systems and allows extensive testing, because it will not affect the functionality of your production systems.

Geographic Locations

Physical location is also a factor when considering multiple management groups. If many servers exist at a single location with localized management personnel, it may be practical to create a management group at that location. Let’s look at a situation where a company—Odyssey—based in Plano has 500 servers that will be monitored by Operations Manager; an additional location in Carrollton has 250 servers that also will be monitored. Each location has local IT personnel responsible for managing its own systems. In this situation, maintaining separate management groups at each location is a good approach.

Note: Management Group Naming Conventions

When you are naming your management groups, the following characters cannot be used:

( ) ^ ~ : ; . ! ? " , ' ` @ # % / * + = $ | & [ ] < > { }

Management groups also cannot have a leading or trailing space in their name. To avoid confusion, we recommend creating management group names that are unique within your organization. Remember that the management group name is case sensitive.
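If you script your deployment, you can check candidate names up front. The following is a minimal Python sketch; the helper function is hypothetical, and the invalid-character set is copied from this note rather than from any official validation API.

```python
# A minimal sketch of the naming rules above. The helper is hypothetical,
# and the invalid-character set is copied from this note rather than from
# any official validation API.
INVALID_CHARS = set("()^~:;.!?\",'`@#%/*+=$|&[]<>{}")

def is_valid_management_group_name(name: str) -> bool:
    """Check a proposed management group name against the documented rules."""
    if not name or name != name.strip():   # no empty names, no leading/trailing spaces
        return False
    return not any(ch in INVALID_CHARS for ch in name)

# Names are case sensitive, so "GROUP_PLANO" and "Group_Plano" would
# identify two different management groups.
print(is_valid_management_group_name("GROUP_PLANO"))   # True
print(is_valid_management_group_name("Group:Plano"))   # False (contains ':')
```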

Network Environment

If you have a server location with either minimal bandwidth or an unstable network connection, consider adding a management group locally to reduce WAN traffic. The data transmitted between the agent and a management server (or gateway server) is both encrypted and compressed, with a compression ratio of approximately 4:1 to 6:1. We have found that an average client will use 3Kbps of network traffic between itself and the management server. We typically recommend installing a management server at network sites that have between 30 and 100 local systems. Operations Manager can support approximately 30 agent-managed computers on a 128Kbps network connection (agent-managed systems use less bandwidth than agentless-managed systems).
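As a quick illustration of the math behind these recommendations, the following Python sketch estimates how many agents a WAN link can comfortably carry. The ~3Kbps-per-agent figure comes from the text; the 75% utilization ceiling is an assumption added for planning headroom.

```python
# Back-of-the-envelope WAN sizing using the ~3Kbps-per-agent average
# quoted above. The 75% utilization ceiling is a planning assumption
# (headroom for bursts), not a documented limit.

KBPS_PER_AGENT = 3

def agents_supported(link_kbps: float, max_utilization: float = 0.75) -> int:
    """Rough count of OpsMgr agents a WAN link can comfortably carry."""
    return int(link_kbps * max_utilization / KBPS_PER_AGENT)

# A 128Kbps link supports roughly 32 agents at 75% utilization,
# in line with the ~30-agent guidance in the text.
print(agents_supported(128))  # 32
```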

Table 4.1 lists the minimum network connectivity speeds between the various OpsMgr components.

Table 4.1. Network Connectivity Requirements

| OpsMgr Component 1       | OpsMgr Component 2  | Network Connectivity |
|--------------------------|---------------------|----------------------|
| RMS/Management Server    | Agent               | 64Kbps               |
| RMS/Management Server    | Agentless           | 1024Kbps             |
| RMS/Management Server    | Operations database | 256Kbps              |
| RMS                      | Operations console  | 768Kbps              |
| RMS                      | Management Server   | 64Kbps               |
| RMS/Management Server    | Data Warehouse      | 768Kbps              |
| RMS                      | Reporting Server    | 256Kbps              |
| Management Server        | Gateway Server      | 64Kbps               |
| Web Console Server       | Web console         | 128Kbps              |
| Reporting Data Warehouse | Reporting Server    | 1024Kbps             |
| Console                  | Reporting Server    | 768Kbps              |
| Audit Collector          | Audit Database      | 768Kbps              |

Installed Languages

You must install all the Operations Manager components in your management group using the same language. As an example, you cannot install the RMS using the English version of Operations Manager 2007 and deploy the Operations console in a different language, such as Spanish or German. If these components must be available in different languages, you must deploy additional management groups to provide this functionality.

Multiple Management Group Architectures

Implementing multiple management groups introduces two architectures for management groups:

  • Connected management groups—If you are familiar with MOM 2005, you may recall multitiered architectures, in which one or more management groups report information to another management group. OpsMgr 2007 replaces this type of functionality with the connected management group. A connected management group provides the capability for the user interface in one management group to query data from another management group. The major difference between a multitiered environment and a connected management group is that data only exists in one location; connected management groups do not forward alert information between them.

    For example, let’s take our Odyssey company, which has locations in Plano and Carrollton. For Odyssey, the administrators in the Carrollton location need the autonomy to manage their own systems and Operations Manager environment. The Plano location needs to manage its own servers and also needs to be aware of the alerts occurring within the Carrollton location. In this situation (illustrated in Figure 4.1), we can configure the Carrollton management group as a connected management group to the Plano management group.


    Figure 4.1. Connected management group.

  • Multihomed architectures—A multihomed architecture exists when a system belongs to multiple management groups, reporting information to each management group.

For example, Eclipse (Odyssey’s sister company) has a single location based in Frisco. Eclipse also has multiple support teams organized by function. The Operating Systems management team is responsible for monitoring the health of all server operating systems within Eclipse, and the Application management team oversees the business critical applications. Each team has its own OpsMgr management group for monitoring servers. In this scenario, a single server running Windows 2003 and IIS is configured as “multihomed” and reports information to both the Operating Systems management group and the Application management group (see Figure 4.2 for details).


Figure 4.2. Multihomed architecture.

Another example of where multihoming architectures are useful is testing or prototyping Operations Manager management groups. Using multihomed architectures, you can continue to report to your production MOM management group while also reporting to a new or preproduction management group. This allows testing of the new management group without affecting your existing production monitoring.

Operations Manager Server Components

The number of management groups you have directly impacts the number and location of your OpsMgr servers. The server components in a management group include at a minimum a Root Management Server (RMS), an Operations Database Server, and Operations consoles. Various additional server components are also available within Operations Manager 2007 depending on your business requirements.

Optional server components within a management group include additional Management Servers, Reporting Servers, Data Warehouse Servers, ACS Database Servers, and Gateway Servers. In this section, we will discuss the following components and the hardware and software required for those components:

  • Root Management Server

  • Management Servers

  • Gateway Servers

  • Operations Database Servers

  • Reporting Servers

  • Data Warehouse Servers

  • ACS Database Servers

  • ACS Collector

  • ACS Forwarder

  • Operations console

  • Web Console Servers

  • Operations Manager agents

These server components are all part of a management group as discussed in Chapter 3, “Looking Inside OpsMgr.”

In a small environment, you can install the required server components on a single server with sufficient hardware resources. As with management groups, it is best to keep things simple by keeping the number of Operations Manager servers as low as possible while still meeting your business requirements.

With the new components in Operations Manager 2007, the hardware requirements vary depending on the role each server component provides. We will break down each of the different server components and discuss both their minimum and large-environment hardware requirements.

Root Management Server

The Root Management Server (RMS) Component is the first management server installed. This server provides specific functionality (such as communication with the gateway servers, connected management groups, and password key storage) that we discussed in Chapter 3. It also serves as a management server; as a reminder, management servers provide communication between the OpsMgr agents and the Operations database.

The RMS can run on a single server, or it can run on a cluster. Table 4.2 describes minimum hardware requirements for the RMS; Table 4.3 shows the recommended requirements for larger environments. We recommend clustering if your business requirements for OpsMgr include high-availability and redundancy capabilities. See Chapter 5 for details on supported redundancy options for each OpsMgr component.

Table 4.2. Root Management Server Minimum Hardware Requirements

| Hardware         | Requirement                              |
|------------------|------------------------------------------|
| Processor        | 1.8GHz Pentium                           |
| RAM              | 1GB                                      |
| Disk             | 10GB space                               |
| CD-ROM           | Needed                                   |
| Display adapter  | Super VGA                                |
| Monitor          | 800×600 resolution                       |
| Mouse            | Needed                                   |
| Network          | 10Mbps                                   |
| Operating system | Windows Server 2003 SP1 32-bit or 64-bit |

Table 4.3. Suggested Root Management Server Hardware Requirements for Large Environments

| Hardware         | Requirement                  |
|------------------|------------------------------|
| Processor        | Two Dual-Core processors     |
| RAM              | 8GB                          |
| Disk             | 10GB space                   |
| CD-ROM           | Needed                       |
| Display adapter  | Super VGA                    |
| Monitor          | 800×600 resolution           |
| Mouse            | Needed                       |
| Network          | 100Mbps                      |
| Operating system | Windows Server 2003 SP1 64-bit |

Management Servers

Additional management servers (MS) within the management group can provide increased scalability and redundancy for the Operations Manager solution. The minimum hardware requirements and large-scale hardware recommendations match those of the RMS.

The number of management servers required depends on the business requirements identified during the Assessment stage:

  • If redundancy is a requirement, you will need at least two management servers per management group. Each management server can handle up to 2000 agent-managed computers. With MOM 2005, there was a limit of 10 agentless managed systems per management server. With Operations Manager 2007, there is no documented limit of agentless monitored systems per management server, but each additional agentless monitored system increases the processing overhead on the management server. For details on agentless monitoring, see Chapter 9, “Installing and Configuring Agents.”

    If a management group in your organization needs to monitor more than 2000 computers, install multiple management servers in the management group. If you need to monitor agentless systems, start out with a maximum of 10 agentless managed systems and then track the performance of the system as you increase the number of agentless systems monitored (as an example, from 10 to 20). In addition, a good practice is to split the load of the agentless monitoring between management servers in the environment.

  • Each management group must have at least one management server (the RMS) in that management group.

Using a sample design of 1000 monitored computers for the purposes of our discussion, we would plan for multiple management servers to provide redundancy. Each management server needs to support one-half the load during normal operation and the full load during failover situations. Although the management server does not store any major amounts of data, it does rely on the processor, memory, and network throughput of the system to perform effectively.

When you design for redundancy, plan for each server to have the capacity to handle not only the agents it is responsible for but also the agents it will be responsible for in a failover situation. For example, with a two–management server configuration, if either management server fails, the second needs to have sufficient resources to handle the entire agent load.
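The following minimal Python sketch, based on the 2000-agents-per-management-server figure given earlier, shows one way to size a management group for this failover rule; the function is illustrative only.

```python
import math

# Minimal sizing sketch for redundant management servers, using the
# 2000-agents-per-management-server ceiling cited above. The rule: the
# group must still carry the full agent load if any one server fails.

MAX_AGENTS_PER_MS = 2000

def management_servers_needed(total_agents: int) -> int:
    """Smallest server count that survives the loss of any single server."""
    n = max(2, math.ceil(total_agents / MAX_AGENTS_PER_MS))
    while (n - 1) * MAX_AGENTS_PER_MS < total_agents:
        n += 1
    return n

# The 1000-agent sample design: two servers each carry ~500 agents
# normally, and either one can absorb all 1000 during a failover.
print(management_servers_needed(1000))  # 2
print(management_servers_needed(3000))  # 3
```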

The location of your management servers also depends on the business requirements identified during the Assessment stage:

  • Management servers should be separate from the Operations database in anything but a single-server OpsMgr management group configuration. In a single-server configuration, one server encompasses all the applicable Operations Manager components, including the RMS and the Operations database.

  • Management servers should be within the same network segment where there are large numbers of systems to monitor with Operations Manager.

  • If the RMS and Operations database are on the same server, we recommend monitoring no more than 200 agents, because exceeding that number can degrade the performance of your OpsMgr solution. If you will be monitoring more than 200 agents, split these components onto multiple servers, or use an existing SQL Server (even a SQL cluster!) for the Operations database, thus lowering your hardware cost and increasing scalability. We discuss clustering in Chapter 5.

Tip: Using Existing SQL Systems for the OpsMgr Databases

The question arises about leveraging an existing SQL Server database server by using it for the OpsMgr databases. Using existing hardware can lower hardware costs and increase scalability.

As a rule, we do not recommend this, because there are high processing requirements on both the Operations and Data Warehouse databases, and they perform better on their own server. However, in small environments, or if the existing system has (or can be given) ample processing and memory resources, you may consider installing the OpsMgr databases on an existing SQL Server system.

Gateway Servers

Install gateway servers to provide a point of communication for servers in untrusted domains or workgroups; they enable systems that are not in a trusted domain to communicate with a management server. The large-environment hardware recommendations (shown in Table 4.4) vary from those of management servers in the number of processors, the required amount of memory, and the operating system recommendations. As with management servers, redundancy is available by deploying multiple gateway servers.

Table 4.4. Suggested Gateway Server Hardware Requirements for Large Environments

| Hardware         | Requirement                              |
|------------------|------------------------------------------|
| Processor        | One Dual-Core processor                  |
| RAM              | 2GB to 4GB                               |
| Disk             | 10GB space                               |
| CD-ROM           | Needed                                   |
| Display adapter  | Super VGA                                |
| Monitor          | 800×600 resolution                       |
| Mouse            | Needed                                   |
| Network          | 100Mbps                                  |
| Operating system | Windows Server 2003 SP1 32-bit or 64-bit |

Operations Database

The Operations Database Component stores all the configuration data for the management group and all data generated by the systems in the management group. We previously discussed this server component and all other components in Chapter 3.

Place the Operations database on either a single server or a cluster; we recommend clustering if your business requirements for OpsMgr include high-availability and redundancy capabilities. Each management group must have an Operations database. The Operations database (by default named OperationsManager) is critical to the operation of your management group—if it is not running, the entire management group has very limited functionality. When the Operations database is down, agents are still monitoring issues, but the database cannot receive that information. In addition, consoles do not function and eventually the queues on the agent and the management server will completely fill.

As a rule of thumb, the more memory you give SQL Server, the better it performs. Using an Active/Passive clustering configuration provides redundancy for this server component; the Operations database does not currently support Active/Active clustering. See Chapter 10, “Complex Configurations,” for details on supported cluster configurations.

We recommend that you not install the Operations database on the RMS in anything but a single-server Operations Manager configuration. Although this server component can coexist with other database server components, that is not a recommended configuration, because it may cause contention for resources, negatively impacting your OpsMgr environment.

Disk configuration and file placement strongly affect database server performance. Configuring all the OpsMgr database server components with the fastest disks available will significantly improve the performance of your management group. We provide additional information on this topic, including recommended Redundant Arrays of Inexpensive Disks (RAID) configurations, in Chapter 10.

Tables 4.5 and 4.6 list the database hardware requirements.

Table 4.5. Operations Database Minimum Hardware Requirements

| Hardware         | Requirement                              |
|------------------|------------------------------------------|
| Processor        | 1.8GHz Pentium                           |
| RAM              | 1GB                                      |
| Disk             | 20GB space                               |
| CD-ROM           | Needed                                   |
| Display adapter  | Super VGA                                |
| Monitor          | 800×600 resolution                       |
| Mouse            | Needed                                   |
| Network          | 10Mbps                                   |
| Operating system | Windows Server 2003 SP1 32-bit or 64-bit |

Table 4.6. Suggested Operations Database Hardware Requirements for Large Environments

| Hardware         | Requirement                                                                                |
|------------------|--------------------------------------------------------------------------------------------|
| Processor        | Two Dual-Core processors                                                                     |
| RAM              | 8GB                                                                                          |
| Disk             | 100GB space, spread across dedicated drives for the operating system, data, and transaction logs |
| CD-ROM           | Needed                                                                                       |
| Display adapter  | Super VGA                                                                                    |
| Monitor          | 800×600 resolution                                                                           |
| Mouse            | Needed                                                                                       |
| Network          | 100Mbps                                                                                      |
| Operating system | Windows Server 2003 SP1 64-bit                                                               |

Note: Measuring Database Server Performance

The time it takes Operations Manager to detect a problem and notify an administrator is a key performance metric; this makes alert latency the best measure of performance of the OpsMgr system. If there is a delay in receiving this information, the problem could go undetected. Because of the criticality of this measure, OpsMgr SLAs tend to focus on alert latency.

Reporting Servers

The Reporting Server Component uses SQL Server 2005 Reporting Services to provide web-based reports for Operations Manager. The Reporting Server Component typically runs on a single server. Because this server runs SQL Server 2005 with Reporting Services, its hardware requirements match those discussed previously for the Operations Database Server Component. Redundancy is available for Reporting servers by using web farms or Network Load Balancing (NLB), but there are issues associated with keeping the reports in sync. Chapter 5 includes information on redundant configurations.

Data Warehouse Servers

Agent data flows through the management server and is written to the data warehouse, which provides long-term data storage for Operations Manager. This data supports daily performance reporting and longer-term trending reports.

The Data Warehouse server hosts the Data Warehouse database (by default named OperationsManagerDW) that OpsMgr uses for reporting. The Data Warehouse Component typically runs on a single server but can run on a cluster if high availability and redundancy are required for reports.

For reasons similar to those discussed for the Operations database, this component should not reside on the same system hosting the RMS.

The hardware requirements for Data Warehouse database servers match those of Operations database servers, with the addition of extra storage for the database and log files on the server.

ACS Database Servers

Audit Collection Services provides a method to collect records generated by an audit policy and to store them in a database. This centralizes the information, which you can filter and analyze using the reporting tools in Microsoft SQL Server 2005. The ACS Database Server Component provides the repository where the audit information is stored. The hardware requirements for ACS database servers match those of the Operations database servers, with a requirement of additional storage for the database and log files on the server.

ACS Forwarder

The ACS Forwarder Component uses the Operations Manager Audit Forwarding service, which is not enabled by default. After the service is enabled, the forwarder sends all security events to the ACS Collector Component (the events are also saved to the local Security event log). Because this component is included within the OpsMgr agent, its hardware requirements mirror those of the agent. Chapter 6, “Installing Operations Manager 2007,” discusses hardware requirements for OpsMgr.

ACS Collector

The forwarders send information to the ACS Collector Component. The collector receives the events, processes them, and sends them to the ACS database. The hardware requirements for the collector mirror those of the RMS. Currently there are no options available to provide a high-availability solution for the ACS Collector Component.

Operations Console

As we discussed in Chapter 2, the Operations console provides a single user interface for Operations Manager. All management servers, including the RMS, should have this console installed. You can also install the console on desktop systems running Windows XP or Vista. Installing consoles on other systems removes some of the load from the management server, and desktop access to the consoles simplifies day-to-day administration.

The number of consoles a management group supports is an important design specification. As the number of consoles active in a management group grows, the database load also grows. This load accelerates as the number of managed computers increases because consoles, either operator or web-based, increase the number of database queries on both the Operations Database Server and the Data Warehouse Server Components. From a performance perspective, it is best to run the Operations consoles from administrator workstations rather than from servers running the OpsMgr components. Running the Operations console on the RMS increases memory and processor utilization, which in turn will slow down OpsMgr.

The hardware requirements for this component match those of the Gateway Server Component.

Web Console Server

The Web Console Server Component runs IIS and provides a web-based version of the Operations console. The hardware requirements for this server match those of the Gateway Server Component. You can configure redundancy for the Web Console Server Component by installing multiple Web Console Servers and leveraging Network Load Balancing (NLB) or other load-balancing solutions.

Operations Manager Agents

There are no Operations Manager–specific hardware requirements for systems running the OpsMgr agent. The hardware requirements are those of the minimum hardware requirements for the operating system itself.

Tip: Processor Performance with OpsMgr Server Components

Multi-core 64-bit-capable processors provide your best performance option for the various Operations Manager server components.

Operations Manager Server Software Requirements

Operations Manager 2007 supports a variety of operating systems and requires additional software as installation prerequisites. We discuss these in the following sections.

Operating Systems

Operating system requirements vary among the OpsMgr components. Tables 4.7 through 4.10 list the supported operating systems for the Operations Manager components: Table 4.7 covers the management server components, Table 4.8 the database server components, Table 4.9 the ACS-related server components, and Table 4.10 the console and agent components.

Table 4.7. Management Server Component Operating System Support

| Operating System | Root Management Server | Management Server | Gateway Server | MS with Agentless Exception Monitoring File Share |
|---|---|---|---|---|
| Microsoft Windows Server 2003, Standard Edition SP1 X86 and X64 | X | X | X | X |
| Microsoft Windows Server 2003, Enterprise Edition SP1 X86 and X64 | X | X | X | X |
| Microsoft Windows Server 2003, Datacenter Edition SP1 X86 and X64 | X | X | | |
| Microsoft Windows XP Professional X86 and X64 | | | | |
| Vista Ultimate X86 and X64 | | | | |
| Vista Business X86 and X64 | | | | |
| Vista Enterprise X86 and X64 | | | | |

Table 4.8. Database Server Component Operating System Support

| Operating System | Operations Database | Reporting Server | Data Warehouse |
|---|---|---|---|
| Microsoft Windows Server 2003, Standard Edition SP1 X86, X64, and IA64 | X | X | X |
| Microsoft Windows Server 2003, Enterprise Edition SP1 X86, X64, and IA64 | X | X | X |
| Microsoft Windows Server 2003, Datacenter Edition SP1 X86, X64, and IA64 | X | X | X |
| Microsoft Windows XP Professional X86 and X64 | | | |
| Vista Ultimate X86 and X64 | | | |
| Vista Business X86 and X64 | | | |
| Vista Enterprise X86 and X64 | | | |

Table 4.9. Audit Collection Server Component Operating System Support

| Operating System | ACS Database Server | ACS Collector |
|---|---|---|
| Microsoft Windows Server 2003, Standard Edition SP1 X86 and X64 | X | X |
| Microsoft Windows Server 2003, Enterprise Edition SP1 X86 and X64 | X | X |
| Microsoft Windows Server 2003, Datacenter Edition SP1 X86 and X64 | X | |
| Microsoft Windows XP Professional X86 and X64 | | |
| Vista Ultimate X86 and X64 | | |
| Vista Business X86 and X64 | | |
| Vista Enterprise X86 and X64 | | |

Table 4.10. Console and Agent Operating System Support

| Operating System | Operations Console | Web Console | Agent |
|---|---|---|---|
| Microsoft Windows Server 2003, Standard Edition SP1 X86 and X64 | X | X | X |
| Microsoft Windows Server 2003, Enterprise Edition SP1 X86 and X64 | X | X | X |
| Microsoft Windows Server 2003, Datacenter Edition SP1 X86 and X64 | X | | X |
| Microsoft Windows Server 2003, Professional Edition SP2 X86 and X64 | | | X |
| Microsoft Windows Server 2003, Standard Edition SP1 IA64 | | | X |
| Microsoft Windows Server 2003, Enterprise Edition SP1 IA64 | | | X |
| Microsoft Windows Server 2003, Datacenter Edition SP1 IA64 | | | X |
| Microsoft Windows Server 2003, Professional Edition SP2 IA64 | | | X |
| Microsoft Windows XP Professional X86 and X64 | X | | X |
| Vista Ultimate X86 and X64 | X | | X |
| Vista Business X86 and X64 | X | | X |
| Vista Enterprise X86 and X64 | X | | X |
| Microsoft Windows 2000 Server SP4 | | | X |
| Microsoft Windows 2000 Professional SP4 | | | X |

The edition of the operating system chosen for each of the server components should correspond with the hardware that will be available for that component. As an example, the Windows Server 2003 Standard Edition supports up to four processors and 4GB of memory. If you purchase hardware to scale beyond that, use Windows Server 2003 Enterprise Edition instead (or use the X64 version of Windows Server 2003 Standard Edition).

Planning for Licensing

Part of your decision regarding server placement should include evaluating licensing options for Operations Manager 2007. Each device managed by the server software requires an Operations Management License (OML), whether you are monitoring it directly or indirectly. (An example of indirect monitoring would be a network device such as a switch or router.) There are two types of OMLs:

  • Client OML—Devices running client operating systems require client OMLs.

  • Server OML—Devices running server operating systems require server OMLs.

To determine licensing costs, it is important to understand that there are also two different levels of OMLs available:

  • Standard OML—Provides monitoring for basic operating system workloads

  • Enterprise OML—Provides monitoring for all operating system utilities, service workloads, and any other applications on that particular device

If you plan to monitor servers that run on virtual server technology, an OML is only required for the physical server. This must be an enterprise OML.
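To make the selection rules concrete, the following Python sketch encodes the decision just described. The device categories and the virtualization rule come from this section; the function itself is purely illustrative.

```python
# Illustrative decision helper for the OML rules described above; the
# categories and the virtualization rule come from the text, while the
# function itself is hypothetical.

def required_oml(os_type: str, enterprise_workloads: bool, is_virtual_guest: bool) -> str:
    """Return the OML needed for one monitored device."""
    if is_virtual_guest:
        # Guests are covered by the enterprise OML on the physical host.
        return "none (covered by the host's enterprise OML)"
    if os_type == "client":
        return "client OML"
    return "enterprise OML" if enterprise_workloads else "standard OML"

print(required_oml("server", enterprise_workloads=True, is_virtual_guest=False))   # enterprise OML
print(required_oml("server", enterprise_workloads=False, is_virtual_guest=False))  # standard OML
print(required_oml("client", enterprise_workloads=False, is_virtual_guest=False))  # client OML
```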

Table 4.11 lists the products associated with the standard and enterprise OML levels. (This list is complete as of mid–2007.)

Table 4.11. Levels of Management Licenses and Products

| License Type   | Monitoring Capability |
|----------------|-----------------------|
| Standard OML   | System Resource Manager |
|                | Password Change Notification |
|                | Baseline Security Analyzer |
|                | Reliability and Availability Services |
|                | Print Server |
|                | Distributed File System (DFS) |
|                | File Replication Service (FRS) |
|                | Network File System (NFS) |
|                | File Transfer Protocol (FTP) |
|                | Windows SharePoint Services |
|                | Domain Name System (DNS) |
|                | Dynamic Host Configuration Protocol (DHCP) |
|                | Windows Internet Naming Service (WINS) |
|                | Windows 2000/2003 Operating System Management Pack |
| Enterprise OML | BizTalk Server 2006 |
|                | Internet Security and Acceleration (ISA) Server 2006 |
|                | Microsoft Exchange Server 2003/2007 Management Pack |
|                | Microsoft SharePoint Portal Server (SPS) 2003 Management Pack |
|                | Microsoft SQL Server 2000/2005 Management Pack |
|                | Microsoft Windows Server 2000/2003 Active Directory Management Pack |
|                | Microsoft Windows Server Internet Information Services (IIS) 2000/2003 Management Pack |
|                | Operations Manager 2007 Servers (other than the server OML required for each management server) |
|                | Office SharePoint Server 2007 |
|                | Virtual Server 2005 |
|                | Windows Group Policy 2003 Management Pack |

You need to acquire an OpsMgr 2007 license for each Operations Manager server (RMS, management servers, and gateway servers; OpsMgr 2007 licenses for the Operations database, Reporting database, or other components are not required) as well as OMLs for managed devices as discussed earlier in this section.

When placing the Database Server Component separate from the Management Server Component, consider the following licensing aspects:

  • No Client Access Licenses (CALs) are required when you are licensing SQL Server on a per-processor basis.

  • CALs are required for each managed device when you are licensing SQL Server per user.

  • If the SQL Server is licensed using the System Center Operations Manager 2007 with the SQL 2005 Technologies license, no CALs are required. The SQL license in this case is restricted to supporting only the OpsMgr application and databases. No other applications or databases can use that instance of SQL Server.

Tip: Licensing for SNMP Devices

An OML is only required to manage network infrastructure devices if the device provides firewall, load balancing, or other workload services. OMLs are not required to manage network infrastructure devices such as switches, hubs, routers, bridges, or modems.

There has been a lot of confusion concerning the licensing required to use the Agentless Exception Monitoring (AEM) functionality of Operations Manager. Using AEM requires a client license for each system from which AEM will collect crash information; if you want to use AEM on 2000 workstations, 2000 client OMLs are required. The exceptions to this license model include the following:

  • When the client device is also licensed for the Microsoft Desktop Optimization Pack for Software Assurance (MDOP for short), an OML is not required to use AEM on the device. The MDOP offering we are concerned with here is System Center Desktop Error Monitoring, or DEM for short. Either an OML or MDOP allows the device to use AEM.

  • Corporate Error Reporting (CER) was previously a Software Assurance (SA) benefit that was part of DEM. Customers with Software Assurance on Windows Client, Office, or Server products were entitled to use CER. Customers with Software Assurance on preexisting agreements can use DEM for the remainder of their agreement. After the current agreement expires, they will either need to purchase an OML or MDOP (DEM) if they want to continue to use AEM.

  • You do not need to purchase a client OML in addition to a standard or enterprise OML. As an example, if you have a Windows server (which has a standard OML), it is not necessary to purchase a client OML for this system to monitor it with AEM.

As an example of how this works, let’s look at how we would license Operations Manager 2007 for the Eclipse company. Eclipse is a 1000-user corporation with one major office. The company has decided to deploy one management group with a single management server. Eclipse is interested in monitoring 100 servers, including 30 servers providing domain controller, Exchange, SharePoint, or SQL functionality.

From a licensing perspective, Eclipse will be purchasing the following:

  • One Operations Manager 2007 server license (one for each OpsMgr management server)

  • Thirty enterprise OMLs (one for each Active Directory, Exchange, SharePoint, or SQL Server system)

  • Seventy standard OMLs (one for each of the remaining servers, which are not covered by enterprise OMLs)

To help estimate costs, Microsoft has provided a pricing guide for Operations Manager 2007, available at http://www.microsoft.com/systemcenter/opsmgr/howtobuy/default.mspx. This guide lists the following prices for the components we just discussed:

  • Operations Manager Server 2007: $573 U.S.

  • Operations Manager Server 2007 with SQL Server Technology: $1,307 U.S.

  • Enterprise OML: $426 U.S.

  • Standard OML: $155 U.S.

  • Client OML: $32 U.S.

For our licensing example, our estimate would be

  • Operations Manager Server 2007: 1 × $573 = $573

  • Enterprise OML: 30 × $426 = $12,780

  • Standard OML: 70 × $155 = $10,850

The total estimate for Operations Manager licensing in this configuration is $24,203. Note, however, that this figure is a ballpark estimate only. Your specific licensing costs may be higher or lower based on the license agreement your organization has as well as your specific server configuration.
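A small Python sketch reproduces this estimate from the list prices quoted above; the prices are mid-2007 U.S. list prices, and your actual costs will vary with your agreement.

```python
# Reproducing the Eclipse estimate from the list prices quoted above.
# These are mid-2007 U.S. list prices; actual costs depend on your
# volume licensing agreement and server configuration.

prices = {
    "Operations Manager Server 2007": 573,
    "Enterprise OML": 426,
    "Standard OML": 155,
}

quantities = {
    "Operations Manager Server 2007": 1,   # one management server
    "Enterprise OML": 30,                  # AD, Exchange, SharePoint, SQL servers
    "Standard OML": 70,                    # remaining monitored servers
}

for item, qty in quantities.items():
    print(f"{item}: {qty} x ${prices[item]} = ${qty * prices[item]:,}")

total = sum(prices[i] * q for i, q in quantities.items())
print(f"Total estimate: ${total:,}")  # Total estimate: $24,203
```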

For more details on Operations Manager 2007 licensing, see the Microsoft Volume Licensing Brief, which is available for download at http://go.microsoft.com/fwlink/?LinkId=87480. Updates to the list of basic workloads are available at http://www.microsoft.com/systemcenter/opsmgr/howtobuy/opsmgrstdoml.mspx.

Additional Software Requirements

Table 4.12 shows the additional software requirements for each of the different Operations Manager database–related components.

Table 4.12. Additional Software Requirements

| Software Requirement | RMS, MS, Gateway Server | Operations Database, Reporting Data Warehouse | Audit Collection Database | Reporting Server | Ops Console | Web Console Server |
|---|---|---|---|---|---|---|
| Microsoft SQL Server 2005 SP1 Standard | | X | | | | |
| Microsoft SQL Server 2005 Enterprise | | X | X | | | |
| Microsoft SQL Server Reporting Services with SP1 | | | | X | | |
| .NET Framework 2.0 | X | | | | X | X |
| .NET Framework 3.0 | X | | | X | X | X |
| Microsoft Core XML Services (MSXML) 6.0 | X | | | | | |
| Windows PowerShell | | | | | X | |
| Office Word 2003 (for .NET programmability support) | | | | | X | |
| Internet Information Services 6.0 | | | | | | X |
| ASP.NET 2.0 | | | | | | X |

Additional Design Factors

You have a variety of other factors to consider when designing your operations management environment. These include the servers monitored by OpsMgr, distributed applications, management packs, security, user notifications, and agent deployment considerations. We discuss these in the following sections.

Monitored Servers

As part of the assessment phase, you should have collected a list of servers to monitor with Operations Manager. As part of the design, you have identified each server and, based on its location, determined the management server it will use. You can now use your management group design to match the servers and their applications with an appropriate management group and management server.

User Applications

As we discussed in Chapter 2, distributed applications have taken on a much more important role within OpsMgr. Categorize the applications identified in the Assessment phase to determine whether you can model them as distributed applications. You should also compare this list with the available management packs on the System Center Pack Catalog website (http://go.microsoft.com/fwlink/?linkid=71124).

Management Packs

Your design should also include the management packs you plan to deploy, based on the applications and servers identified during the assessment phase. As you are identifying management packs for deployment, it is important to remember that certain management packs have logical dependencies on others. As an example, the Exchange management pack is logically dependent on the Active Directory management pack, which in turn is logically dependent on the DNS management pack. To monitor Exchange, you should plan to deploy each of those management packs.

You can import MOM 2005 management packs into Operations Manager 2007 after converting them to the new OpsMgr 2007 format. Converted management packs do not have the full functionality of management packs designed for Operations Manager 2007; they do not become model-based as part of the conversion process. Converted management packs also do not include reports because Microsoft rewrote the reporting structure in OpsMgr 2007.

Security

Operations Manager 2007 uses multiple service accounts to help increase security by running with lower-privileged accounts. Chapter 11 discusses how to secure your OpsMgr environment. Identifying these accounts and their required permissions is necessary for an effective Operations Manager design. OpsMgr uses several accounts that are typically Domain User accounts:

  • SDK and Config Service account—Provides a data access layer between the agents, consoles, and database. This account also distributes configuration information to agents.

  • Management Server Action account—Gathers operational data from providers, runs responses, and performs actions such as installing and uninstalling agents on managed computers

  • Agent Action account—Gathers information and runs responses on the managed computer

  • Agent Installation account—Used when prompted for an account with administrator privileges to install agents on the managed computer(s)

  • Notification Action account—Creates and sends notifications

  • Data Reader account—Executes queries against the OpsMgr Data Warehouse database

  • Data Warehouse Write account—Assigned permissions to write to the Data Warehouse database and read from the Operations database

User Notifications

Operations Manager’s functionality includes the ability to notify users or groups as issues arise within your environment. For design purposes, you should document who needs to receive notifications and under what circumstances. OpsMgr can notify via email, instant messaging, Short Message Service (SMS), or command scripts. Notifications go to recipients configured within Operations Manager. As an example, if your organization has a team that supports email, that team will most likely want notifications when issues occur within the Exchange environment. (In this instance, because the team supports email, it is advisable to provide multiple Exchange servers that can send the notification, or to use other available notification methods, in case Exchange itself is down.)

Reporting

Operations Manager 2007 includes a robust reporting solution. With the reporting components installed, OpsMgr writes relevant information directly from the management server to the Data Warehouse database. This is a significant change from MOM 2005, which uses a DTS package to move data from the OnePoint database into the SystemCenterReporting database. Writing data directly into the reporting database enables near real-time reporting. It also decreases the space required within the database, because only specific information is written rather than shipping all information over a certain age into the Data Warehouse database.

Agent Deployment

Monitored systems will have agents deployed to them. Deployment can be manual, automatic from the Operations console, or through automated distribution methods such as Active Directory, Microsoft Systems Management Server, and the recently released System Center Configuration Manager. For small to mid-sized organizations, we recommend deploying agents from the Operations console because it is the quickest and most supportable approach. For larger organizations, we recommend Systems Management Server/Configuration Manager or Active Directory deployment. Manual agent deployment is most often required when the system is behind a firewall, when there is a need to limit bandwidth available on the connection to the management server, or if a highly secure server configuration is required. Chapter 9 provides more detail on this topic. For the purposes of the Design stage, it is important to identify the specific approach you plan for distributing the agent.

System Center Capacity Planner 2007

Microsoft’s System Center Capacity Planner (SCCP), shown in Figure 4.3, can provide you with the answers you need to design your OpsMgr environment. The System Center Capacity Planner, designed to architect your Operations Manager solution, provides the ability to perform “what-if” analysis, such as “What if I add another 100 servers for monitoring by OpsMgr?”

Figure 4.3. System Center Capacity Planner.

You can use the System Center Capacity Planner as a starting point for your OpsMgr 2007 design because it can provide the answers needed for an effective design of your environment. Based on discussions during the SCCP beta, we expect database-sizing estimates to be included within the released version of the tool. Once that capability is available, we recommend treating its database-sizing estimates as the final authority for sizing.

In the next few sections, we will provide sample designs for a single-server configuration, dual-server configuration, and finally a mid-to-enterprise-sized configuration.

Designing a Single-Server Monitoring Configuration with System Center Essentials

You can create single-server configurations either using Microsoft System Center Essentials or Microsoft Operations Manager 2007, depending on the business requirements identified for your organization.

Microsoft System Center Essentials (or just Essentials) is designed to provide core operations management capabilities for small to medium-sized businesses. System Center Essentials combines the monitoring functionality of Operations Manager 2007 with the patch management functionality of Windows Server Update Services (WSUS) 3.0. However, Essentials is a different product that has specific restrictions on its functionality, including the following:

  • Although System Center Essentials can run in a multiserver configuration (with the RMS and database server on separate computers), all monitored systems must be in the same Active Directory forest.

  • There is no fault tolerance support (clustering) for the database server or management server.

  • Essentials does not include a data warehouse; grooming runs on a fixed schedule, and report authoring is not available.

  • No role-based security.

  • No Web console, support for ACS, or PowerShell integration.

  • Essentials can only monitor a total of 30 servers (31 including itself), and up to 500 non-server operating systems. Essentials can monitor an unlimited number of network devices.

Single-Server Requirements with Essentials

Because the Essentials components typically reside on a single server, the design process is greatly simplified. Figure 4.4 shows an example of this configuration. System Center Essentials is always a single–management server implementation; the only supported alternative puts the SQL database component on a second server when managing between 250 and 531 computers (remember, only 30 of those can be running a server operating system!).

Figure 4.4. System Center Essentials single-server configuration.

A single-server configuration is the simplest Operations Manager design. Assuming an average level of activity, your server should have two dual-core processors or better with 4GB of memory to achieve optimal results (shown in Table 4.13). The goal is to avoid taxing your processor with a sustained CPU utilization rate of more than 75%. Consequently, a configuration of fewer than 30 managed nodes using the proper hardware should not exceed this threshold. Ensure that the tests you perform include any management packs you might add to Essentials because management packs can easily add 25% to your server CPU utilization.

Table 4.13. System Center Essentials Hardware Recommendations

    Hardware          Requirement
    ----------------  ------------------------------------------------------
    Processor         Two dual-core processors
    RAM               4GB
    Disk              250GB spread across dedicated drives for the operating
                      system, data, and transaction logs
    CD-ROM            Needed
    Display adapter   Super VGA
    Monitor           800×600 resolution
    Mouse             Needed
    Network           100Mbps
    Operating system  Windows Server 2003 SP1 (64-bit)

To provide sufficient storage, we recommend approximately 250GB of disk space available on the Essentials server. Table 4.14 lists various methods for configuring drive space.

Table 4.14. Single-Server Drive Configuration

    Drive Size  Number of Drives  Configuration  Usable Space
    ----------  ----------------  -------------  ------------
    36GB        8                 RAID5          252GB
    72GB        5                 RAID5          288GB
    146GB       3                 RAID5          292GB
    300GB       2                 RAID1          300GB
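If you want to check drive combinations beyond the rows in Table 4.14, the underlying arithmetic is simple enough to script. The following minimal Python sketch uses the usual rough approximations (RAID5 loses one drive’s worth of capacity to parity; RAID1 mirrors drive pairs) and ignores formatting overhead:

    def usable_space_gb(drive_size_gb, drive_count, raid_level):
        # RAID5 keeps (n - 1) drives' worth of data; RAID1 keeps half.
        if raid_level == "RAID5":
            return (drive_count - 1) * drive_size_gb
        if raid_level == "RAID1":
            return (drive_count // 2) * drive_size_gb
        raise ValueError("unsupported RAID level: " + raid_level)

    # Reproducing the rows of Table 4.14:
    for size, count, level in [(36, 8, "RAID5"), (72, 5, "RAID5"),
                               (146, 3, "RAID5"), (300, 2, "RAID1")]:
        print(f"{count} x {size}GB {level}: "
              f"{usable_space_gb(size, count, level)}GB usable")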

We recommend always placing the database on a disk array separate from the operating system; combining them can cause I/O saturation. Essentials also works best on a 100Mb (or 1Gb) network, connected to a switch (preferably on the backbone with other production servers).

Server Components for Essentials

In the single-server configuration, the server runs all components of the typical Essentials installation, including the management and database components.

Single-Server Configuration Using Operations Manager 2007

You can also design Operations Manager to run in a single-server configuration. This is a good option when you need to monitor more than 30 but fewer than 100 servers, or when there are requirements to monitor more than 500 non-server operating systems.

Single-Server Requirements with OpsMgr

As with many other technical systems, using the simplest configuration meeting your needs works the best and causes the fewest problems. This is also true of OpsMgr; many smaller-sized organizations can design a solution composed of a single Operations Manager server that encompasses all required components. Certain conditions must exist in your infrastructure to make optimal use of a single-server OpsMgr installation. If the following conditions are present, this type of design scenario may be right for your organization:

  • Monitoring more than 30 but fewer than 100 servers (see the “System Center Essentials” section of the chapter for fewer than 30 servers).

  • The number and type of management packs directly affect the hardware requirements for the OpsMgr solution. A general recommendation for a single-server configuration is not to deploy more than a dozen management packs and to make sure that they are fully tuned.

  • Maintaining one year or less of stored data.

Generally, a single Operations Manager server configuration works for smaller to medium-sized organizations and particularly those who prefer to deploy Operations Manager to smaller, phased groups of servers. Another advantage of Operations Manager’s architecture is its flexibility concerning changes in design scope. You can add more component servers to a configuration later without the worry of major reconfiguration.

Figure 4.5 shows the single-server configuration with all the components on one server.

Figure 4.5. OpsMgr single-server configuration.

Server Components for OpsMgr

In the single-server configuration, the server runs all components of a typical Operations Manager installation, including the management and database components.

Designing a Two-Server Monitoring Configuration

Sometimes the scalability Microsoft builds into its products makes it challenging to determine how to deploy a multiple-server solution. We will cut through that complexity and show you how to deploy a relatively simple two-server Operations Manager solution.

Operations Manager can scale from the smallest office to the largest worldwide enterprise. Consequently, you must make decisions as to the size and placement of the OpsMgr servers. As part of the design process, an organization must decide between implementing a single server or a deployment using multiple servers. Understanding the criteria defining each design scenario aids in selecting a suitable match.

Two-Server Requirements

Figure 4.6 shows the two-server configuration with the OpsMgr components split between two servers (one to provide management server functionality, one to provide database functionality).

Figure 4.6. OpsMgr two-server configuration.

Using the SCCP to model an Operations Manager solution monitoring 200 servers, we can identify the recommended hardware for the environment. Figure 4.7 illustrates the resulting two-server configuration splitting the OpsMgr components.

Figure 4.7. Operations Manager SCCP two-server configuration.

Assuming an average level of activity, your servers should have two dual-core processors or better with 4GB of memory to achieve optimal results (as shown in Table 4.15). The goal is to avoid taxing your processor with a sustained CPU utilization rate of more than 75%. Consequently, a configuration of fewer than 100 managed nodes should not exceed this threshold if you use the proper hardware. Remember to test with any management packs that you might add to OpsMgr because management packs can easily add 25% to your server CPU utilization.

Table 4.15. Operations Manager Two-Server Hardware Recommendations

    Hardware          Requirement
    ----------------  ------------------------------------------------------
    Processor         Two dual-core processors
    RAM               4GB
    Disk              Management server: 10GB
                      Database server: 150GB spread across dedicated drives
                      for the operating system, data, and transaction logs
    CD-ROM            Needed
    Display adapter   Super VGA
    Monitor           800×600 resolution
    Mouse             Needed
    Network           100Mbps
    Operating system  Windows Server 2003 SP1 (64-bit)

Now let’s apply some numbers to this. We will use a standard growth rate of 5MB per managed computer per day for the Operations database (discussed in the “Operations Database Sizing” section), and 3MB per managed computer per day for the Data Warehouse database (see the “Data Warehouse Sizing” section of this chapter). The total space for a year of stored data is 100 agents * 3MB/day * 365 days, which equals 109,500MB, or approximately 100GB. The Operations database stores data for 7 days by default, which translates to 100 agents * 5MB/day * 7 days, or 3500MB, approximately 4GB. The Operations database no longer has the 30GB sizing restriction that existed with MOM 2005, but it is still prudent to allocate adequate space for it to grow to that size or greater. Adding space for the operating system, transaction logs, and headroom for growth and maintenance, and approximating once again, we arrive at 250GB total space required. Table 4.16 lists a variety of drive sizes, the RAID configuration, and the usable space. This table assumes that the number of drives includes the parity drive in the RAID, but not an online hot spare.

Table 4.16. OpsMgr Database Server Drive Configuration

    Drive Size  Number of Drives  Configuration  Usable Space
    ----------  ----------------  -------------  ------------
    36GB        8                 RAID5          252GB
    72GB        5                 RAID5          288GB
    146GB       3                 RAID5          292GB
    300GB       2                 RAID1          300GB
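The database-growth arithmetic above is also easy to script. Here is a minimal Python sketch using the chapter’s average growth rates (5MB/day per agent for the Operations database, 3MB/day per agent for the Data Warehouse database); treat the output as a rough planning estimate rather than a guarantee:

    AGENTS = 100

    # Operations database: 7-day default retention at 5MB/day per agent.
    ops_db_mb = AGENTS * 5 * 7           # 3500MB, approximately 4GB
    # Data Warehouse database: one year of data at 3MB/day per agent.
    warehouse_mb = AGENTS * 3 * 365      # 109,500MB, approximately 107GB

    print(f"Operations database: ~{ops_db_mb / 1024:.1f}GB")
    print(f"Data Warehouse:      ~{warehouse_mb / 1024:.1f}GB")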

You should always place the database on a disk array separate from the operating system because combining them can cause I/O saturation. OpsMgr also works best on a 100Mb (or 1Gb) network, connected to a switch (preferably on the backbone with other production servers).

Server Components

In the two-server configuration, the two servers split the components of the typical Operations Manager installation, based on function. The first server provides management server functionality, and the second provides all database functionality. See Chapter 6 for the specific steps for installing a two-server configuration.

Designing Multiple Management Server Monitoring Configurations

Although it is often simpler to install a single Operations Manager management server to manage your server infrastructure, it may become necessary to deploy multiple management servers if you require the following functionality:

  • Monitoring more than 200 servers.

  • Collecting and reporting on more than a month or two of data.

  • Adding redundancy to your monitoring environment.

  • Monitoring computers across a WAN or firewall using the Gateway Server Component.

  • Providing centralized auditing capabilities for your environment using the Audit Collection Services functionality.

  • Segmenting the operational database to another server. (We discuss this in the “Designing a Two-Server Monitoring Configuration” section of this chapter.)

In multiple-server configurations, you will split the OpsMgr components onto different physical servers. These components, introduced earlier in the “Design” section of this chapter, include the following:

  • Root Management Server

  • Management Servers

  • Gateway Servers

  • Operations Database Servers

  • Reporting Servers

  • Data Warehouse Servers

  • ACS Database servers

  • ACS Collector

  • ACS Forwarder

  • Operations console

  • Web Console Servers

  • Operations Manager agents

You can have more than one server configured for each of these server roles, depending on your particular requirements.

Multiple-Server Requirements

It is generally a good practice to include high levels of redundancy with any mission-critical server, such as the RMS, Operations consoles, and database servers. We recommend you include as many redundant components as feasible into your server hardware design and choose enterprise-level servers whenever possible.

Disk space for the Operations database server is always a consideration. As with any database, it can consume vast quantities of drive space if left unchecked. Although Microsoft has removed the historical limit of 30GB on the Operations database, we still recommend that you limit the size of the Operations database to maximize performance and to minimize backup timeframes. The disk drive storing the Operations database should have free space of at least double the total size of the database to support operations such as backups and restores, as well as emergency growth of the Operations database. In addition, regular backups of the entire database are a must. We discuss backups in detail in Chapter 12, “Backup and Recovery.”

You can configure the Data Warehouse server in a manner similar to the Operations database server, but the Data Warehouse database requires a much larger storage capacity. Whereas the Operations database typically stores data for a period of days, the Data Warehouse database typically stores the data for a year. This data can grow rapidly; we discuss this in the “Designing OpsMgr Sizing and Capacity” section later in this chapter.

Placement of the Management Servers

For optimal performance, place management servers close to their database servers or to the agents they manage. However, several key factors can play a role in determining where the servers will reside:

  • Maximum bandwidth between components—OpsMgr servers should have fast communication between all components in the same management group to maximize the performance of the system. The management server needs to be able to upload data quickly to the Operations database. This usually means T1 speed or better, depending on the number of agents the management server is supporting.

  • Redundancy—Adding management servers increases the failover capability of OpsMgr and helps to maintain a specific level of uptime. Depending on your organization’s needs, you can build additional servers into your design as appropriate.

  • Scalability—If a need exists to expand the OpsMgr environment significantly or increase the number of monitored servers with short notice, you can establish additional OpsMgr management servers to take up the slack.

In most cases, you can centralize the management servers in close proximity with the Operations database and allow the agents to communicate with the management servers over any slow links that might exist.

As a rule, for multiple-server OpsMgr configurations, you should have at least two management servers for redundancy.

Using Multiple Management Groups

As defined in Chapter 3, Operations Manager management groups are composed of a single SQL Server operational database, a unique name, the RMS, and optional components such as additional management servers. Each management group uses its own Operations database and maintains its own separate configuration. As an example, two management groups can have the same management pack (MP) installed but different overrides configured within the MP. OpsMgr 2007 allows multiple management groups to write to a single Data Warehouse database (unlike MOM 2005, where the reporting environment could only have one management group report directly to the Reporting database).

It is important to note that you can configure agents to report to multiple management groups and connect management groups to each other. This increases the flexibility of the system and allows for the creation of multiple management groups based on your organization’s needs. There are five major reasons for dividing your organization into separate management groups:

  • Geographic or bandwidth-limited regions—Similar in approach to Windows Server 2003 sites or Exchange 2003 routing groups, you can establish Operations Manager management groups in segregated network subnets to reduce network bandwidth consumption. We do not recommend spanning management groups across slow WAN links because this can lead to link saturation (and really upset your company’s network team). Aligning management groups on geographic or bandwidth criteria is always a good idea.

    The size of the remote office has to justify creating a new management group. As an example, a site with fewer than 10 servers typically does not warrant its own management group. The downside to this is a potential delay in notification of critical events.

  • Functional or application-level control—This is a useful feature of Operations Manager, enabling management of a single agent by multiple management groups. The agent keeps the rules it gets from each of the management groups completely separated and can even communicate over separate ports to the different management groups.

  • Political or business function divisions—Although not as common a solution as bandwidth-based management groups, OpsMgr management group boundaries can be aligned with political boundaries. This would normally only occur if there was a particular need to segregate specific monitored servers to separate zones of influence. For example, the Finance group can monitor its servers in its own management group to lessen the security exposure those servers receive.

  • Very large numbers of agents—In a nutshell, if your management group membership approaches the maximum number of monitored servers, it is wise to segment the agents into multiple management groups to improve network performance. This is appropriate when the number of managed computers approaches 5000 or when the database grows to a size that decreases performance or prevents the database from being effectively backed up and restored. In MOM 2005, the maximum supported database size was 30GB (including 40% free space for indexing). In OpsMgr 2007, there is no supported size limit, but the best-practice approach is to keep it under 40GB, or even 30GB if you are able to do so. We have heard of transient issues when the Operations database is greater than 40GB.

  • Preparing for mergers—Organizations expecting company mergers or with a history of mergers can use multiple management groups. Using multiple management groups enables quick integration of multiple OpsMgr environments.

Let’s illustrate this point using the bandwidth-limited criteria. If your organization is composed of multiple locations separated by slow WAN links, it is best to separate each location not connected by a high-speed link into a separate management group and set up connected management groups.

Another example of using multiple management groups is when IT support is functionally divided into multiple support groups. Consider an example with a platform group (managing the Windows operating system) and a messaging group (managing the Exchange application). In this instance, two management groups might be deployed (a platform management group and a messaging management group). The separate management groups would allow both the platform group and the messaging group to have complete administrative control over their own management infrastructures. The two groups would jointly operate the agent on the monitored computers.

The Operations Database—Placement and Issues

Each management group has its own separate database; unlike management servers, there is only one Operations database per management group. Consequently, keep the following factors in mind when placing and configuring each SQL Server installation:

  • Network bandwidth—As with the other OpsMgr components, it is essential to place the Operations database on a well-connected, resilient network that can communicate freely with the management server. Slow network performance can significantly affect the capability of OpsMgr to respond to network conditions.

  • Hardware redundancy—Most enterprise server hardware contains contingencies for hardware failures. Redundant fans, RAID mirror sets, dual power supplies, and the like will all help to ensure the availability and integrity of the system. SQL Server databases need this level of protection because critical online systems require immediate failover for accessibility.

    If you have high-availability requirements, you should cluster and/or replicate the database to provide redundancy and failover capabilities.

  • Backups—Although some people consider high availability to be a replacement for backups, setting up your OpsMgr environment for hardware redundancy can be an expensive proposition. You should establish a backup schedule for all databases used by Operations Manager as well as the system databases used by SQL Server. We discuss backups in Chapter 12.

  • SQL Server licensing—Because SQL Server is a separate licensing component from Microsoft, each computer that accesses the database must have its own Client Access License (CAL), unless you are using per-processor licensing for the SQL Server. As we discussed in the “Planning for Licensing” section of this chapter, if you use System Center Operations Manager 2007 with the SQL 2005 Technologies license, no CALs are required. However, remember that no other databases or applications can use that instance of SQL Server.

    Licensing can be an important cost factor. Database clustering requires SQL Server Enterprise Edition.

With MOM 2005, the Operations database server transferred data to the reporting database on a daily basis using a DTS package. In OpsMgr 2007, writes occur directly from the management server to the Data Warehouse database rather than through a daily scheduled DTS package; consequently, solid network connectivity should exist between the management servers and the Data Warehouse database server.

Designing OpsMgr Sizing and Capacity

Although Operations Manager 2007 contains multiple mechanisms that allow it to scale to large environments, design limitations may apply depending on your specific environment. The number of agents you deploy and the amount of data you want to collect directly affect capacity limitations and the size of your database. Understanding exactly what OpsMgr’s limitations are helps define which design to utilize.

Data Flows in OpsMgr

Data flow in OpsMgr 2007 is an important design consideration. A typical Operations Manager environment has a large quantity of data flowing from a relatively small percentage of sources within the IT environment. The data flows are latency sensitive because alerts and notifications must arrive within a short timeframe (measured in seconds).

We have found that agent traffic varies depending on the management packs you deploy, ranging from about 1.2Kbps to 3Kbps traveling from the agent to the management server. For our calculations, we use the following formula:

3Kbps * <number of agents> = total traffic (in Kbps)

The data flowing from each agent is not that great when considered individually. However, when looked at in aggregate for a large number of servers, the load can be significant, as illustrated in Table 4.17, which shows the estimated minimum bandwidth for just the Windows operating system base management pack. Using Table 4.17, you can see that the need for multiple 100Mbps or Gigabit network cards becomes important above the 500-agent mark.

Table 4.17. Management Server Aggregate Bandwidth for Agents

    Agents  Total Kbps  Utilization of a 100Mbps NIC
    ------  ----------  ----------------------------
    1             0.50  0%
    5             2.50  0%
    10            5.00  1%
    100          50.00  5%
    500         250.00  25%
    1000        500.00  50%
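As a rough planning aid, the aggregate-bandwidth formula is straightforward to script. This minimal Python sketch uses the 3Kbps per-agent figure from the formula above; remember that Table 4.17 reflects estimates for just the base Windows operating system management pack, so your per-agent rate, and therefore the totals, will vary with the management packs you deploy:

    def aggregate_traffic_kbps(agents, per_agent_kbps=3.0):
        # Total agent-to-management server traffic, in Kbps.
        return agents * per_agent_kbps

    def nic_utilization(total_kbps, nic_mbps=100):
        # Fraction of a NIC's nominal capacity consumed by agent traffic.
        return total_kbps / (nic_mbps * 1000)

    for agents in (10, 100, 500, 1000, 2000):
        total = aggregate_traffic_kbps(agents)
        print(f"{agents:5d} agents: {total:8.1f} Kbps "
              f"({nic_utilization(total):.2%} of a 100Mbps NIC)")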

You can adjust the data flow in two ways:

  • How often the agents upload their data

  • How much data the agents upload

When considering how to adjust the setting for low-bandwidth agents, you can tune the heartbeat and data upload times. However, this will not reduce the overall volume of information. You will be uploading more information at less frequent intervals, but it will be the same quantity of data. To reduce the data volume, you need to adjust the rules to collect less data. As an example, adjusting the sample interval of a performance counter from 15 minutes to 30 minutes will reduce the volume of data by half.

Caution: Don’t Over-tune Those Performance Counters

Performance counters have the most impact on network traffic and the size of the Operations database. Be very careful if you change a performance interval to gather data more often.

Limitations, Provisos, and Restrictions

Management servers, management groups, and collections of management groups have some inherent limitations. In some cases, these are hard limits that cannot technically be exceeded; others are supportability limits that should not be exceeded. Microsoft tests and supports its products at specific scales, and the published limits reflect the extent of that testing and support. Table 4.18 summarizes the capacity limitations for OpsMgr components. Although OpsMgr includes implicit design components that allow it to scale to large groups of managed nodes, there are some maximum levels that could limit the size of management groups.

Table 4.18. Limitations Summary as of Operations Manager 2007

    Area               Limitation Description                             Limit
    -----------------  -------------------------------------------------  -----------------
    Management Group   Maximum agents per management group                5000
                       Maximum agentless computers per management group   No limits defined
                       Maximum management servers per management group    No limits defined
                       Maximum consoles per management group              Unlimited
    Management Server  Maximum agents per management server               2000
                       Maximum agentless computers per management server  10
    Agent              Maximum management servers per agent               4
    Database           Maximum size of Operations database                None
                       Maximum size of Reporting database                 None

We have grouped Table 4.18 by the type of limitation. Even though there is a stated limit, in a given environment the limit might not be practical. As an example, the maximum number of agents supported by a management server is 2000. However, a single management server is unlikely to be capable of supporting 2000 Exchange servers due to the particularly heavy load these agents place on it.

Many of these capacity limitations are actually supportability limitations rather than hard technical limitations. For example, monitoring the 5001st agent will not generate an error or alert. Exceeding the limitations does not immediately cause the system to fail but rather starts to affect the performance, latency, and throughput of the OpsMgr infrastructure. For this reason, Microsoft imposes these limitations from a supportability perspective; the Operations Manager 2007 product may not function properly if you exceed those limits.

Doing the Math

Because each management group can support up to a maximum of 5000 agents and there is no restriction on the number of connected management groups, this effectively means that there are no documented scalability limits on OpsMgr 2007. This is an excellent functional increase from MOM 2005, where you could support a maximum of 44,000 managed computers in a single cohesive MOM 2005 infrastructure.

In the next two sections, we will evaluate database sizing, which has also changed significantly since MOM 2005.

Operations Database Sizing

Sizing the Operations Manager 2007 database can be a complex endeavor. You have to take several factors into account when calculating database size. We suggest purchasing a generous quantity of disk space for your OpsMgr database so that the database can grow over time if required. If space limitations are an issue, you can reduce the size of the database by decreasing its data retention period. By doing some simple math, we can estimate the size of the OpsMgr database.

From reviewing the Operations database, we determined that a fresh installation of OpsMgr 2007 with a 1500MB database used a total of 510MB of database and log space. We reviewed all the tables in the Operations database and tracked which tables increase over time. Through this trending, we determined that the tables related to alerts, alert history, events, state changes, and performance counters grew over time, and that the growth was related to the number of agents and the retention period for the database. We found that data increases in the Operations database at an average rate of 5MB/day per agent. From this exercise, we determined the following formula to estimate the size of the Operations database:

(5MB/day × Number of Agents × Retention Days) + 510MB = Operations database size

As an example, at 5MB/day for 10 agents with a default retention period of 7 days, the database should have approximately 860MB of space used in the database.
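For convenience, the formula is easy to capture as a small Python function. The 510MB constant is simply the baseline we measured for a fresh installation, so adjust it if your baseline differs:

    def operations_db_mb(agents, retention_days, mb_per_agent_day=5, base_mb=510):
        # (5MB/day x number of agents x retention days) + 510MB baseline.
        return mb_per_agent_day * agents * retention_days + base_mb

    print(operations_db_mb(10, 7))      # 860, matching the example above
    print(operations_db_mb(1000, 7))    # 35510 (~35GB), matching Table 4.19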

With a growth rate of 5MB/day per agent, the size of the database is determined by the number of days the data remains in the Operations database, that is, the grooming interval. Table 4.19 marks the default grooming interval of 7 days.

Table 4.19. Operations Database Sizes in MB for Various Grooming Intervals and Agents

    Grooming Interval (days)  3000 Agents  2000 Agents  1000 Agents  500 Agents  100 Agents  50 Agents  10 Agents
    ------------------------  -----------  -----------  -----------  ----------  ----------  ---------  ---------
    1                              15,510       10,510         5510        3010        1010        760        560
    2                              30,510       20,510       10,510        5510        1510       1010        610
    3                              45,510       30,510       15,510        8010        2010       1260        660
    4                              60,510       40,510       20,510      10,510        2510       1510        710
    5                              75,510       50,510       25,510      13,010        3010       1760        760
    6                              90,510       60,510       30,510      15,510        3510       2010        810
    7 (default)                   105,510       70,510       35,510      18,010        4010       2260        860
    8                             120,510       80,510       40,510      20,510        4510       2510        910
    9                             135,510       90,510       45,510      23,010        5010       2760        960
    10                            150,510      100,510       50,510      25,510        5510       3010       1010

As you can see from Table 4.19, using the default grooming interval of 7 days, the Operations database will use approximately 35GB of space for 1000 agents.

Figure 4.8 charts the growth of the Operations database for ease of reference. The series lines for each of the agent counts plot the projected size of the Operations database for grooming intervals, varying from 1 to 10 days. You can see the projected growth of the Operations database as the grooming interval is increased and as the number of agents increases.

Figure 4.8. Operations database size chart.

It is important to note there is a third factor we are unable to build into this type of estimate: management packs. Although a correlation exists between an increased number of management packs and a larger Operations database, the database size depends not only on the actual number of rules in the management packs, but also on what those rules do. A single rule could gather the entire Application Event log and record it in the database, in contrast with another rule that only writes to the database when a specific error condition occurs. To account for this factor in the database sizing, we gathered metrics from varying environments with a variety of management packs to identify a growth range covering the majority of deployments. To trend your environment effectively, it is important to conduct appropriate testing and monitor the growth rates of your own installation over time; your growth rate may vary from the estimates we have developed.

Figure 4.9 gives a clear view of how data is distributed within the Operations database. If you need to limit the growth of your Operations database, remember that performance measurements make up the bulk of the data within this database. Tuning the frequency with which performance measurements are gathered is a good target for reducing database growth.

Figure 4.9. Operations database data distribution chart.

Data Warehouse Sizing

The Data Warehouse database is the long-term data repository for Operations Manager 2007. In MOM 2005, the size of the Reporting database was closely tied to the size of the Operations database—a DTS package tied the two databases together by transferring data from one database to the other. With OpsMgr 2007, this approach has changed.

Management pack objects write data directly to the Data Warehouse database. OpsMgr aggregates rule and performance data for reporting, rather than storing and reporting on raw data. This approach has the potential to provide much longer-term reports for a larger number of agents and to use a smaller Data Warehouse database.

This complicates providing effective sizing estimates for the data warehouse. To develop them, we took the same approach as we did with the Operations database and applied it to the Data Warehouse database.

In reviewing the Data Warehouse database, we found that a fresh installation of Reporting for OpsMgr 2007 with a 1500MB database used a total of 570MB of database and log space. We reviewed all the tables in the Data Warehouse database and tracked which ones increase over time.

As with the Operations database, we found that the same data was growing over time: alerts, alert history, events, state changes, and performance counters. The number of agents and the retention period also affect the amount of the increase. We found that data increases in this database at an average rate of 3MB/day per agent. From our exercise, we have determined the following formula to estimate the size of the Data Warehouse database:

(3MB/day × Number of Agents × Retention Days) + 570 MB = Data Warehouse size

As an example, at 3MB/day for 10 agents with a retention period of 30 days, the database should have approximately 1470MB of space used. The Data Warehouse database is set to autogrow, so it will automatically increase in size to accommodate the additional data.
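The Data Warehouse formula can be sketched the same way; the 570MB constant is the baseline we measured for a fresh Reporting installation, and a month is treated as 30 days:

    def data_warehouse_mb(agents, retention_days, mb_per_agent_day=3, base_mb=570):
        # (3MB/day x number of agents x retention days) + 570MB baseline.
        return mb_per_agent_day * agents * retention_days + base_mb

    print(data_warehouse_mb(10, 30))     # 1470, matching the example above
    print(data_warehouse_mb(100, 365))   # 110070 (~110GB), matching Table 4.20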

Table 4.20 shows the Data Warehouse database size for various retention periods and agent counts.

Table 4.20. Data Warehouse Database Sizes in MB for Various Retention Periods and Agents

    Retention Period  3000 Agents  2000 Agents  1000 Agents  500 Agents  100 Agents  50 Agents  10 Agents
    ----------------  -----------  -----------  -----------  ----------  ----------  ---------  ---------
    1 Month               270,570      180,570       90,570      45,570        9570       5070       1470
    2 Months              540,570      360,570      180,570      90,570      18,570       9570       2370
    1 Qtr                 810,570      540,570      270,570     135,570      27,570     14,070       3270
    2 Qtrs              1,620,570    1,080,570      540,570     270,570      54,570     27,570       5970
    3 Qtrs              2,430,570    1,620,570      810,570     405,570      81,570     41,070       8670
    1 Yr                3,285,570    2,190,570    1,095,570     548,070     110,070     55,320     11,520
    5 Qtrs              4,095,570    2,730,570    1,365,570     683,070     137,070     68,820     14,220
    6 Qtrs              4,905,570    3,270,570    1,635,570     818,070     164,070     82,320     16,920
    7 Qtrs              5,715,570    3,810,570    1,905,570     953,070     191,070     95,820     19,620
    2 Yrs               6,570,570    4,380,570    2,190,570   1,095,570     219,570    110,070     22,470

As you can see from Table 4.20, with large numbers of agents and a long retention period, this database can quickly go beyond the 1TB or 2TB level; therefore, it is good news that no documented size limits exist on the Data Warehouse database! Figure 4.10 displays a graph showing the impact to the size of the Data Warehouse database depending on the number of agents and how long the data is retained.

Figure 4.10. Reporting database size chart.

When estimating the size of the data warehouse, it is important to consider the amount of data collected daily and how long you intend to keep that data in the data warehouse, because this will directly affect your storage requirements. Your proof of concept (POC) should provide good figures for projecting the size of your Data Warehouse database.

If limiting the amount of data collected is not an option, there are several potential solutions for managing this volume of data:

  • Extract the data to reports—One approach to resolve this issue is to generate monthly reports of the data and to archive the reports. This provides a method to summarize the information and retain it for long durations of time. Using this approach limits your flexibility because you cannot generate new reports from the data.

  • Reduce the time interval—Although keeping data for a year might be nice, it may be sufficient to have only a quarter or so of reporting database data online to generate reports. If that horizon is sufficient and keeps the database size to where you need it to be, it is an easy solution.

  • Create archive snapshots—Another approach to provide long-term access to reports is to create a snapshot of the database on a regular basis. You can archive and restore these database backups to a temporary database when needed. This technique is further discussed in Chapter 12.

You can implement these potential workarounds to the problem of database volume individually or together. For example, you might create quarterly archive snapshots to review the historical data up to 1 year old and use archived monthly reports to view historical data up to 5 years old.

Tip: Database Sizing Spreadsheet

The CD accompanying this book includes a database sizing spreadsheet. The spreadsheet allows you to calculate approximate sizes for the Operations database, Data Warehouse database, and ACS database. The calculations use the sizing information presented in this book.

Management Group Scalability

As your OpsMgr environment grows to large numbers of managed nodes, Operations Manager responds with the capability to scale to those numbers. By using multiple management groups, multiple management servers, and event forwarding, you can make almost any monitoring project a real possibility.

If you plan to monitor over 100 nodes, consider adding multiple management servers in your management group (as shown in Figure 4.11). After the first two management servers, we recommend adding an additional management server for every increase of 250 to 500 nodes.

Figure 4.11. Multiple management servers in a management group.
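One way to read that scaling heuristic is sketched below in Python. The exact thresholds are our own assumptions for illustration (one server up to 100 nodes, a two-server redundant baseline covering roughly the first 500 nodes, then one additional server per 500 nodes); adjust them to fit your hardware and environment:

    import math

    def recommended_management_servers(nodes, nodes_per_extra_server=500):
        # Assumption: a single server suffices up to 100 nodes; above that,
        # start with two for redundancy and add one per block of extra nodes.
        if nodes <= 100:
            return 1
        extra = math.ceil(max(0, nodes - 500) / nodes_per_extra_server)
        return 2 + extra

    for n in (50, 200, 500, 1000, 2000):
        print(f"{n:5d} nodes -> {recommended_management_servers(n)} management server(s)")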

Multiple Management Groups

Expanding your implementation to use multiple management groups, as previously defined, can take place at the designer’s discretion, although implementing multiple management groups generally takes place to segregate different geographic areas with slow WAN links from each other. If the link speeds to remote managed nodes are under 256Kbps, we recommend either creating a new management group or greatly throttling the amount of information collected from remote agents. Determining whether to create a new management group or simply throttle the events sent will depend on the number of remote servers. It does not make sense to create a new management group if you have only two small sites with four servers each to monitor.

If, however, you have distributed your remote servers around various offices and they have better connections to each other than to your primary management group, it may be wise to create a new management group. This would typically be the case if you were to create management groups based on continent, for example. Figure 4.12 shows a worldwide OpsMgr 2007 infrastructure with regional management groups (North America, Europe, and Asia-Pacific). The global management group connects to each of the regional management groups to review their alerts.

Figure 4.12. Management groups by geographic region.

In this configuration, each region can operate the management groups semi-autonomously, yet still allow a global view of the environment worldwide using a global console.

Through creating and manipulating management groups and management servers, your OpsMgr environment can scale from a handful of monitored servers to large numbers of nodes worldwide. This scalability helps establish Operations Manager as an enterprise monitoring solution.

Tying It All Together

To complete the design piece of this stage, each of the aspects discussed needs to be documented, discussed, revised, and agreed upon (“DDRA” for short). Following this process identifies potential issues with the design before testing it as a proof of concept. It also facilitates communication with other members of your organization and helps drive acceptance of Operations Manager within your company.

Planning

After determining your Operations Manager design, the next step is to plan for the remaining phases: Proof of Concept, Pilot, and Implementation. These phases share the same goals: limiting potential risk to your production environment and avoiding “alert overload” for your organization.

Note: Defining Risk

Any time a production environment changes, there is a risk that the change can cause a problem within the environment.

It is important to try to avoid alert overload, which refers not to a technical hardware constraint but to an attention constraint on the part of the people involved with monitoring OpsMgr, who can only handle a limited number of alerts. Alert overload occurs when the number of alerts generated causes the human receivers of these alerts to ignore the errors (or actively oppose using the tool providing the notifications). You don’t want to cry wolf unless there really is a good chance a wolf is at the door. Avoiding risk of alert overload can positively impact the likelihood of a successful OpsMgr deployment. Your planning document should include details on your Proof of Concept, Pilot, and Implementation phases.

Proof of Concept Planning

A proof of concept should emulate the production hardware as closely as possible. Use production hardware for the OpsMgr solution if it is available because that will most closely emulate your production environment. Isolate your POC network configuration from the production environment to allow full testing of servers without affecting their production equivalents. The planning phase identifies what needs testing as part of the POC; base this testing on your business requirements.

For example, the following steps may comprise a high-level POC plan for a single-server Operations Manager 2007 configuration:

  1. Creating an isolated network for POC testing.

  2. Installing the domain controllers.

  3. Installing the application servers that will be monitored by OpsMgr.

  4. Installing the OpsMgr server(s). (Install Windows Server 2003 SP1 and SQL Server 2005 SP1 or later.)

  5. Creating any required OpsMgr service accounts and confirming rights to service accounts.

  6. Installing the Data Warehouse Server and Root Management Server Components.

  7. Installing the OpsMgr reporting components.

  8. Discovering and installing OpsMgr agents.

  9. Installing management packs.

  10. Configuring management packs.

  11. Configuring recipients and notifications.

  12. Configuring the Operations Manager Web console and OpsMgr web reports.

  13. Executing tests defined for the OpsMgr environment.

The actual content of your POC plan will vary depending on your environment, but it is important to plan what steps will occur during the POC. Planning helps you avoid a “mad scramble” of installing systems without direction, and you can leverage the plan both in the POC stage and in the Pilot stage.

Pilot Planning

A pilot deployment moves the OpsMgr solution that you have created into a production environment. A pilot is by its nature limited in scope. In planning your pilot, you should consider how to limit either the number of servers you will be monitoring or the number of management packs you will deploy, or potentially both. Pilot planning should detail what servers you are deploying OpsMgr to, what servers it will monitor, and the management packs you will utilize. Because the pilot occurs in the production environment, you should be deploying your production OpsMgr hardware.

Implementation Planning

The Implementation phase takes the pilot configuration you created and rolls it to a full production deployment. The order in which you will add servers and management packs should be included as part of your implementation plan.

Proof of Concept

The purpose of a proof of concept is to take the design that you have created, build it, and “kick the tires.” Production is not a POC environment; when you are in a production environment you are past the Proof of Concept stage, and any errors you encounter have much larger costs than if you had caught them in the POC.

Tip: Setting Up a POC

Virtual server environments make it much easier to create an isolated proof of concept environment.

A POC environment is also a great opportunity to try out a variety of management packs and see how they perform and what information they can provide. If possible, retain your POC environment even after moving on to later phases of your OpsMgr deployment. This provides an infrastructure to test additional management packs, management pack updates, and service packs in a non-business-critical environment.

POC Challenges

It can be difficult to emulate a full production environment within the confines of a POC. A primary reason is hardware: production-level hardware is typically not available for the POC environment. If the type of hardware used within the POC does not reflect the level of hardware in production, you may be unable to assess the speed or full functionality of your solution.

There are also some inherent challenges in any POC environment. How does one effectively scale for production? For example, if you are monitoring logon and logoff events, how do you generate enough events to successfully monitor them? From our perspective, two options are available:

  • Using scripts to generate sample events that can provide a large amount of event data.

  • Using a POC exclusion, which is a document describing items that you could not effectively test within the POC environment. You can use this document during the Pilot phase to determine additional testing that needs to occur as part of the pilot.

Another complexity of POC environments is that they are often isolated from the production network, removing potential interaction between the two environments. Network isolation removes the risk of inadvertently affecting production, but it also adds complications because production network resources (file shares, patching mechanisms, and so on) are not available. If your POC testing is in an isolated environment, you will want to establish a separate Internet connection to be able to patch systems and access non-POC resources.

Establishing an Effective POC

With the challenges inherent within the POC environment, how do you determine what to focus on during your POC? We suggest focusing on two major concepts: basic design validation and complexities within your specific environment.

To validate your design, deploy it and determine whether there are inherent issues. This requires testing OpsMgr’s basic functionality, including alerts, events, and notifications, using the OpsMgr consoles. Basic design validation should also include tuning alerts and documenting your processes. If you are running in an isolated environment, you will also need a domain controller, DNS, and a mail server for email notification. Basic design validation should require only a small percentage of the time within your POC.

Spend the majority of the POC time testing the complexities specific to your design or environment. Let’s say your design includes Tivoli or OpenView integration. This represents a complexity to test during the POC; you should deploy the other management software within your POC. Although this sounds like a lot of work, how would you know how the two systems will interact without any testing? The only other option is testing within the production environment, which we obviously do not recommend! (Before you decide to test in production, ask yourself how your boss would respond if your testing caused an outage in your production environment. We doubt he or she would be impressed with your testing methodology.)

Other examples of potential complexities are connected management groups or multihomed environments, highly redundant configurations, third-party management packs, and any requirements to create custom management packs. Your focus during POC testing should directly relate to the business requirements identified for your OpsMgr solution.

POC testing provides a safe method to effectively assess your design and make updates as required based on the results of your tests. Do not be surprised if your design changes based on the results of your POC tests.

Using POC environments also gives you the ability to configure production systems as multihomed, reporting to both the production and POC management groups. Multihoming a representative cross-section of production systems can provide strong insight into how Operations Manager will function in your production environment. This gives you a method to test changes to management packs, which you can then export and import into production.

Pilot

During the Pilot phase, you deploy your production hardware with Operations Manager 2007 and integrate it into your production environment with a limited scope, implementing the architecture you have designed.

Although you are deploying your production environment design, you will limit the number of servers to which you deploy the OpsMgr agent, or limit the number of management packs used. The Pilot phase provides a timeframe to identify how various management packs respond to the production systems you are monitoring. Out of the box, Operations Manager provides a limited number of alerts, but additional changes are often required to “tune” it to your particular environment. Initial tuning of OpsMgr may occur now, but it is also limited in scope. During the Pilot phase, you should test any POC exclusions identified within the actual production environment.

During the Pilot phase, track the amount of data gathered in the Operations database to determine whether your OpsMgr database has sufficient space on a per-server basis. You can check the amount of free space available on the Operations database using Microsoft’s SQL Server Management Studio. The Operations console also provides tracking for both database free space and log free space. To access this functionality, navigate in the Operations console to Monitoring -> Microsoft SQL Server -> Databases -> Database Free Space and Transaction Log Free Space.

Tip: Tracking Database Size and Growth

The SQL Database Space Report provides the ability to track the percentage of free space for trending purposes. This report is included within the SQL management pack.

Implementation

The Implementation phase moves from pilot into full production deployment. Two major methods are generally used for deploying Operations Manager 2007 during the Implementation phase:

  • Phased deployment—Using this approach, you will add in servers and management packs over time, allowing dedicated time for each server or management pack to tune alerts and events.

    The phased deployment approach takes a significant amount of time, but you will have the benefit of thoroughly understanding each management pack and the effect it has on the servers in your environment. Using a phased approach can also minimize risk.

  • Bulk deployment—The second approach is a bulk deployment of Operations Manager, which limits the notifications to the individual or group doing the OpsMgr deployment.

    If you have deployed all servers and all management packs and your notification groups are thoroughly populated, the resulting flood of alerts may annoy the recipients in the notification groups, and they may just ignore the alerts. The benefit of a bulk deployment with limited notification is that you can deploy the entire OpsMgr environment quickly.

With either deployment approach, time is required to tune Operations Manager for your specific environment. Tuning within OpsMgr is the process of fixing the problems about which OpsMgr is alerting, overriding the alerts for specific servers, or filtering those alerts.

Tip: Tuning Management Packs

Chapter 13, “Administering Management Packs,” covers the basic approaches to tuning. Use the processes discussed in that chapter when implementing each management pack.

Maintenance

Now that you have implemented Operations Manager, you are finished, right? Not exactly. OpsMgr is a monitoring product and requires maintenance to keep it working effectively within your environment.

Although you can design OpsMgr to provide responses to common error situations, operators and technical specialists should regularly monitor your management system. Part of regular maintenance for Operations Manager involves addressing alerts, responding to the notifications it provides, and watching the Operations console for events and alerts as they occur. The tasks included with the various management packs can provide an efficient manner to perform common maintenance tasks.

Like other systems, your OpsMgr server environment will require maintenance through deployments of software patches, service packs, antivirus updates, backups, and other regularly scheduled maintenance.

Another part of the Maintenance phase is maintaining management packs within OpsMgr. Management packs constantly change. Microsoft updates existing management packs as it identifies new features or bug fixes. Microsoft, third-party vendors, and other Operations Manager administrators create new management packs, and you can download these from the Internet. As part of the Maintenance phase, you need to consider additions, updates, or the removal of management packs.

Agents in OpsMgr also require maintenance. As you add new servers to your environment, deploy OpsMgr agents to monitor them. Likewise, as servers become obsolete, you should uninstall their agents. Agent software requires updates as you apply service packs to the management server.

In summary, within the Maintenance phase it is important to monitor, maintain, and update your OpsMgr infrastructure. Operations Manager environments are constantly evolving because the infrastructures they support are continually changing.

Sample Designs

This chapter contains a lot of guidance and information, which can be a bit overwhelming to absorb and translate into a design. However, many OpsMgr implementations fit into the same general guidelines. Taking a closer look at a sample organization and its Operations Manager environment design can give you some clues into your own organization’s design.

We will look at three sample designs to get an idea of how the process works when using System Center Capacity Planner 2007.

Single-Server OpsMgr Design

Eclipse is a 1000-user corporation headquartered in Frisco. A single Windows 2003 domain, eclipse.com, is set up and configured across the enterprise. Company headquarters has 100 Windows Server 2003 servers performing roles as file servers, DHCP servers, Active Directory domain controllers, DNS servers, SQL 2000 and 2005 database servers, and Exchange 2003 messaging servers.

Due to a recent spate of system failures and subsequent downtime, which could have been prevented with better systems management, Eclipse’s IT group looked at Operations Manager 2007 as a solution for providing much-needed systems management for its server environment.

Because Eclipse will be monitoring approximately 100 servers, IT decided to deploy a single management group consisting of an OpsMgr server running all OpsMgr components. The server chosen is a dual-processor P4 2.4GHz server with 2GB of RAM and redundant hardware options.

Eclipse created a single management group for all its servers, and it distributed OpsMgr 2007 agents throughout the server infrastructure. In addition, Eclipse deployed and configured the default management packs for the Base OS, DNS, DHCP, Active Directory, Exchange, and SQL Server.

Figure 4.13 shows the final design.

Figure 4.13. Eclipse single-server design.

Eclipse’s OpsMgr administrator feels that the company can easily scale its Operations Manager implementation to higher numbers of servers if it wants, because the hardware can accommodate increased numbers of monitored computers. Eclipse is mildly concerned with the single point of failure of its single OpsMgr server but has decided that it is most cost-effective to deploy a simple single-server management group.

Eclipse evaluated the Operations database retention and decided to use a retention period of four days, keeping the Operations database size down to approximately 3GB per Table 4.19. It wants to retain its reporting data for a total of one year, which, per Table 4.20, requires approximately 110GB of storage.
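As a rough illustration of how these estimates come together, the following sketch multiplies an assumed per-agent daily data volume by the number of monitored computers and the retention period. The per-agent rates used here (roughly 5MB per day to the Operations database and 3MB per day to the Data Warehouse) are illustrative assumptions, not the values behind Tables 4.19 and 4.20; use the chapter’s tables for actual planning.

```python
# Back-of-the-envelope OpsMgr database sizing.
# ASSUMPTION: the per-agent daily data volumes below are illustrative
# placeholders, not the figures behind Tables 4.19 and 4.20.

OPS_MB_PER_AGENT_PER_DAY = 5.0  # assumed Operations database growth per agent
DW_MB_PER_AGENT_PER_DAY = 3.0   # assumed Data Warehouse growth per agent

def estimate_db_sizes(agents, ops_retention_days, dw_retention_days=365):
    """Return estimated (operations_gb, warehouse_gb) for a management group."""
    ops_gb = agents * OPS_MB_PER_AGENT_PER_DAY * ops_retention_days / 1024
    dw_gb = agents * DW_MB_PER_AGENT_PER_DAY * dw_retention_days / 1024
    return ops_gb, dw_gb

# Eclipse single-server design: 100 agents, 4-day operational retention.
ops, dw = estimate_db_sizes(agents=100, ops_retention_days=4)
print(f"Operations DB ~{ops:.1f}GB, Data Warehouse ~{dw:.0f}GB")
# Prints roughly 2.0GB and 107GB -- the same ballpark as the 3GB and 110GB
# planning figures (the tables also account for fixed database overhead).
```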

Table 4.21 summarizes the design points and decisions.

Table 4.21. Eclipse Single-Server OpsMgr Design Summary

Design Point                               Decision
Monitored computers                        100
Management group                           1
Management group name(s)                   GROUP1
Management server(s)                       1
Operations database retention              4 days
Estimated Operations database size         3GB
Data Warehouse database retention          1 year
Estimated Data Warehouse database size     110GB

Single-Management Group Design

After using OpsMgr for some time, the Eclipse Corporation decides it needs redundancy and better performance for Operations Manager. The company also goes through a round of acquisitions, which increases the number of managed computers to 500 servers. Eclipse re-evaluates its Operations database retention and decides that the default period of seven days better meets its requirements, resulting in a database size of approximately 18GB per Table 4.19. Eclipse still wants to retain its reporting data for a total of one year, which with 500 servers requires approximately 550GB of storage based on the information in Table 4.20.
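Running the same hypothetical sizing helper against the revised design shows how the estimates scale with agent count and retention; the per-agent rates remain assumptions, and Tables 4.19 and 4.20 are the authoritative figures.

```python
# Eclipse revised design: 500 agents, default 7-day operational retention.
# Reuses estimate_db_sizes() from the earlier sketch (same assumed rates).
ops, dw = estimate_db_sizes(agents=500, ops_retention_days=7)
print(f"Operations DB ~{ops:.0f}GB, Data Warehouse ~{dw:.0f}GB")
# Prints roughly 17GB and 535GB -- close to the 18GB and 550GB estimates.
```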

The option chosen is a single OpsMgr management group named GROUP1 with a five-server design: an Operations database server, a Reporting server, a Data Warehouse server, an RMS (Root Management Server), and an additional management server. The servers chosen are all dual-processor P4 2.4GHz servers with 2GB of RAM and redundant hardware options. Figure 4.14 shows the diagram.

Figure 4.14. Eclipse multiserver design.

For the storage requirements, the Operations Database Server Component uses a dual-channel controller with a mirrored set of 72GB drives for the OS/logs and a RAID5 set of three 72GB drives for the database. The Reporting/Data Warehouse Server Component uses a dual-channel controller with a mirrored set of 72GB drives for the OS/logs and an external array with a total capacity of 2TB.

The dual management server configuration allows Eclipse to assign 250 of its managed computers to each management server. If one management server fails, the other can handle the full load of 500 agents, giving Eclipse the fault tolerance it needs.

In the event of a database server outage, the management servers and agents will buffer the operations data. The datacenter stocks standard spare parts and servers to be able to restore operations within 4 hours. The data warehouse will take longer to bring back to full operations, but this database is less mission-critical because its only impact is on report generation capabilities.

Table 4.22 summarizes the design points and decisions.

Table 4.22. Eclipse Single Management Group OpsMgr Design Summary

Design Point                               Decision
Monitored computers                        500
Management group                           1
Management group name(s)                   GROUP1
Management server(s)                       4
Operations database retention              7 days
Estimated Operations database size         18GB
Data Warehouse database retention          1 year
Estimated Data Warehouse database size     550GB

Multiple-Management Group Design

Odyssey is a large, 2000-user corporation with two major offices, in Plano and Carrollton. The Plano office hosts 500 servers, and the Carrollton location hosts 250. Most of the servers run Windows Server 2003, with a minority running Novell NetWare, Linux, and Windows 2000. Odyssey utilizes a single-domain Windows 2003 Active Directory.

Odyssey needs to deploy a robust server management system to increase its uptime levels and improve productivity across the enterprise. It chose Operations Manager 2007 to accomplish this task.

The early design sessions indicated that two OpsMgr management groups are required, one for each geographical location, due to the independent nature of the two sites. Connected management groups can then be set up to integrate them.

The Plano management group will manage up to 500 servers, and the Carrollton location will manage up to 250 servers. A connected management group (discussed in the “Multiple Management Group Architectures” section of this chapter) is used to integrate the two management groups, and both management groups use a single data warehouse in the Plano management group.

For the storage requirements, the combined database and reporting server uses a dual-channel controller with a mirrored set of 72GB drives for the OS/logs and a RAID5 set of eight 72GB drives for the databases. The total storage for the databases is approximately 500GB.
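The usable capacity behind these figures follows from standard RAID arithmetic: a RAID5 set loses one drive’s worth of capacity to parity, and a mirrored set loses half of its raw capacity. A minimal sketch:

```python
def raid_usable_gb(level, drives, drive_gb):
    """Usable capacity for simple RAID levels (ignores formatting overhead)."""
    if level == 1:   # mirrored set: half the raw capacity survives
        return drives * drive_gb / 2
    if level == 5:   # striping with parity: one drive's capacity is lost
        return (drives - 1) * drive_gb
    raise ValueError("unsupported RAID level in this sketch")

print(raid_usable_gb(5, 8, 72))  # 504 -- the "approximately 500GB" above
print(raid_usable_gb(1, 2, 72))  # 72.0 -- the mirrored OS/logs pair
```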

Odyssey makes extensive use of the Microsoft-provided management packs and purchases add-on management packs from third-party vendors to manage its disparate operating systems and application environments.

This design allows the organization to have a corporate view of all alerts within the entire company but still allows the regions to have full operational control over their servers and utilize the Operations Manager 2007 features to ensure uptime. The strategic placement of components allows for server consolidation and redundancy where needed.

Table 4.23 summarizes the design points and decisions.

Table 4.23. Odyssey Multiple Management Group Design Summary

Design Point                               Decision
Monitored computers                        750
Management groups                          2
Management group name(s)                   Plano, Carrollton
Management server(s)                       4
Operations database retention              7 days
Estimated Operations database size         11GB (Plano), 6GB (Carrollton)
Data Warehouse database retention          1 year
Estimated Data Warehouse database size     825GB (550GB + 275GB)

Summary

This chapter explained why it is important to develop a plan before implementing OpsMgr into your environment. We discussed assessment and design as a part of that plan, as well as some of the technical design considerations for the various components of an OpsMgr infrastructure. We also discussed the planning phases needed for an effective deployment of Operations Manager 2007 in your organization and looked at some sample designs. The next chapter discusses planning complex OpsMgr configurations.
