Chapter 18 Proactively Monitoring SQL Server 2008 with System Center Operations Manager 2007

System Center Operations Manager (OpsMgr) 2007, also known as SCOM, provides the best-of-breed approach to proactively monitoring and managing a SQL Server 2008 infrastructure. Through the use of monitoring and alerting components, OpsMgr helps to identify specific environmental conditions before they evolve into problems.

OpsMgr provides a timely view of important conditions in SQL Server 2008, as displayed in Figure 18.1, and intelligently links problems to knowledge provided in the monitoring rules. Critical events and known issues are identified and matched to technical reference articles in the Microsoft Knowledge Base for troubleshooting and quick problem resolution.

FIGURE 18.1 Monitoring SQL Server systems with OpsMgr 2007 Console.

The monitoring is accomplished using standard operating system components such as Windows Management Instrumentation (WMI), Windows event logs, and Windows performance counters, along with OpsMgr-specific components designed to perform synthetic transactions and track the health and availability of network services such as SQL Server 2008. In addition, OpsMgr provides a reporting feature that allows administrators to track problems and trends occurring on the network. Reports can be generated automatically, providing database administrators, network administrators, managers, and decision makers with a current and long-term historical view of environmental trends in SQL Server.

Note

System Center Operations Manager was originally developed by NetIQ and then purchased and released as Microsoft Operations Manager (MOM) 2000. OpsMgr was subsequently updated and released as MOM 2005. Recently, the product has been completely redesigned and was released as System Center Operations Manager 2007. OpsMgr 2007 contains powerful management capabilities and presents a fundamental change in the way systems are monitored. In addition to individual server monitoring, groups of systems can now be monitored together as a service with multiple interdependent and distributed components.

Overview of System Center Operations Manager

OpsMgr is a sophisticated monitoring system that effectively allows for large-scale management of mission-critical servers. Organizations with a medium to large investment in Microsoft technologies will find that OpsMgr has an unprecedented ability to keep on top of the tens of thousands of event log messages that occur on a daily basis. In its simplest form, OpsMgr performs two functions: processing monitored data and issuing alerts and automatic responses based on that data.

The model-based architecture of OpsMgr presents a fundamental shift in the way a network is monitored. The entire environment can be monitored as groups of hierarchical services with interdependent components. Microsoft, in addition to third-party vendors and a large development community, can leverage the functionality of OpsMgr components through customizable monitoring rules.

OpsMgr provides for several major pieces of functionality as follows:

Image   Management packs— Application-specific monitoring rules are provided within individual files called management packs. For example, Microsoft provides management packs for Windows server systems, SQL Server, Exchange, SharePoint, DNS, and DHCP, along with many other Microsoft technologies. Management packs are loaded with the intelligence and information necessary to properly troubleshoot and identify problems. The rules are dynamically applied to agents based on a custom discovery process provided within the management pack. Only applicable rules are applied to each managed server.

Image   Event monitoring rules— Management pack rules can monitor for specific event log data. This is one of the key methods of responding to conditions within the environment.

Image   Performance monitoring rules— Management pack rules can monitor for specific performance counters. This data is used for alerting based on thresholds or archived for trending and capacity planning.

Image   State-based monitors— Management packs contain monitors, which allow for advanced state-based monitoring and aggregated health rollup of services. Monitors also provide self-tuning performance threshold monitoring based on a two- or three-state configuration.

Image   Alerting— OpsMgr provides advanced alerting functionality by enabling email alerts, paging, short message service (SMS), instant messaging (IM), and functional alerting roles to be defined. Alerts are highly customizable, with the ability to define alert rules for all monitored components.

Image   Reporting— Monitoring rules can be configured to send monitored data to both the operations database for alerting and the reporting database for archiving.

Image   End-to-end service monitoring— OpsMgr provides service-oriented monitoring based on System Definition Model (SDM) technologies. This includes advanced object discovery and hierarchical monitoring of systems.
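As a rough illustration of the state-based monitors and health rollup described in the preceding list, the following Python sketch maps a counter value to a three-state health value and rolls several component states up to a single service state. The names (HealthState, threshold_state, rollup), the threshold values, and the worst-of rollup policy are assumptions made for this example; they are not taken from OpsMgr itself.

```python
from enum import IntEnum


class HealthState(IntEnum):
    """Three-state health model; a higher value means worse health."""
    HEALTHY = 0
    WARNING = 1
    CRITICAL = 2


def threshold_state(value, warning_at, critical_at):
    """Map a counter value to a state using two illustrative thresholds."""
    if value >= critical_at:
        return HealthState.CRITICAL
    if value >= warning_at:
        return HealthState.WARNING
    return HealthState.HEALTHY


def rollup(child_states):
    """Worst-of rollup: a parent is only as healthy as its worst child."""
    return max(child_states, default=HealthState.HEALTHY)


# Hypothetical service made of three monitored components.
cpu = threshold_state(72.0, warning_at=80.0, critical_at=95.0)
disk = threshold_state(88.0, warning_at=80.0, critical_at=95.0)
memory = threshold_state(40.0, warning_at=80.0, critical_at=95.0)
service_state = rollup([cpu, disk, memory])
print(service_state.name)
```

A two-state monitor is the same idea with the warning threshold removed.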

Processing Operational Data

OpsMgr proactively manages and monitors Windows networks, including a SQL Server infrastructure, through monitoring rules used for object discovery, Windows event log monitoring, performance data gathering, and application-specific synthetic transactions. Monitoring rules define how OpsMgr collects, handles, and responds to the information gathered. They handle incoming event data and allow OpsMgr to react automatically, either by responding to a predetermined problem scenario, such as a failed hard drive, with predefined corrective and diagnostic actions (for example, triggering an alert or executing a command or script), or by providing the operator with additional details based on what was happening at the time the condition occurred.

Generating Alerts and Responses

OpsMgr monitoring rules can generate alerts based on critical events, synthetic transactions, or performance thresholds and variances found through self-tuning performance trending. An alert can be generated by a single event or by a combination of events or performance thresholds. Alerts can also be configured to trigger responses such as email, pages, Simple Network Management Protocol (SNMP) traps, and scripts to notify you of potential problems. In brief, OpsMgr is completely customizable in this respect and can be modified to fit most alert requirements.
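The idea that an alert can be raised by a single event or by a combination of events and performance thresholds can be sketched in a few lines of Python. This is only a conceptual model, not the way OpsMgr rules are authored: the Observation class, the should_alert function, the event IDs, and the counter name are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Observation:
    """Data seen on one managed system during one evaluation window."""
    event_ids: set = field(default_factory=set)
    counters: dict = field(default_factory=dict)


def should_alert(obs, required_events=(), thresholds=None):
    """Alert only when every required event is present and every listed
    counter meets or exceeds its threshold."""
    thresholds = thresholds or {}
    if not required_events and not thresholds:
        return False  # a rule with no conditions never fires
    events_ok = all(e in obs.event_ids for e in required_events)
    counters_ok = all(
        obs.counters.get(name, 0.0) >= limit
        for name, limit in thresholds.items()
    )
    return events_ok and counters_ok


# Hypothetical event ID 9001 and counter name "cpu_percent".
obs = Observation(event_ids={9001}, counters={"cpu_percent": 97.0})
single_event = should_alert(obs, required_events=[9001])
combined = should_alert(obs, required_events=[9001],
                        thresholds={"cpu_percent": 95.0})
missing_event = should_alert(obs, required_events=[9001, 9002])
print(single_event, combined, missing_event)
```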

Outlining OpsMgr Architecture

OpsMgr is primarily composed of five basic components: the operations database, reporting database, Root Management Server, management agents, and Operations Console. These components make up a basic deployment scenario. Several optional components are also described in the following bulleted list; these components provide functionality for advanced deployment scenarios.

OpsMgr was specifically designed to be scalable and can be configured to meet the needs of any size company. This flexibility stems from the fact that all OpsMgr components can either reside on one server or can be distributed across multiple servers.

Each of these various components provides specific OpsMgr functionality. OpsMgr design scenarios often involve the separation of parts of these components onto multiple servers. For example, the database components can be delegated to a dedicated server, and the management server can reside on a second server.

The following list describes the different OpsMgr components:

Image   Operations database— The operations database stores the monitoring rules and the active data collected from monitored systems. This database has a 7-day default retention period.

Image   Reporting database— The reporting database stores archived data for reporting purposes. This database has a 400-day default retention period.

Image   Root Management Server— This is the first management server in the management group. This server runs the software development kit (SDK) and Configuration service and is responsible for handling console communication, calculating the health of the environment, and determining what rules should be applied to each agent.

Image   Management server— Optionally, an additional management server can be added for redundancy and scalability. Agents communicate with the management server to deliver operational data and pull down new monitoring rules.

Image   Management agents— Agents are installed on each managed system to provide efficient monitoring of local components. Almost all communication is initiated from the agent with the exception of the actual agent installation and specific tasks run from the Operations Console. Agentless monitoring is also available with a reduction of functionality and environmental scalability.

Image   Operations Console— The Operations Console is used to monitor systems, run tasks, configure environmental settings, author rules, subscribe to alerts, and generate and subscribe to reports.

Image   Web console— The Web console is an optional component used to monitor systems, run tasks, and manage maintenance mode from a web browser.

Image   Audit Collection Services— This is an optional component used to collect security events from managed systems; this component is composed of a forwarder on the agent that sends all security events, a collector on the management server that receives events from managed systems, and a special database used to store the collected security data for auditing, reporting, and forensic analysis.

Image   Gateway server— This optional component provides mutual authentication through certificates for nontrusted systems in remote domains or workgroups.

Image   Command shell— This optional component is built on PowerShell and provides full command-line management of the OpsMgr environment.

Image   Agentless Exception Monitoring— This component can be used to monitor Windows and application crash data throughout the environment and provides insight into the health of the productivity applications across workstations and servers.

Image   Connector Framework— This optional component provides a bidirectional web service for communicating, extending, and integrating the environment with third-party or custom systems.

Understanding How OpsMgr Stores Captured Data

OpsMgr itself utilizes two Microsoft SQL Server databases for all collected data. Both databases are automatically maintained through OpsMgr-specific scheduled maintenance tasks.

The operations database stores all the monitoring rules imported by management packs, along with the operational data collected from each monitored system. Data in this database is retained for 7 days by default. Data retention for the operations database is shorter than for the reporting database to improve the efficiency of the environment. This database must be installed as a separate component from OpsMgr but can physically reside on the same server, if needed.

The reporting database stores data for long-term trend analysis and is designed to grow much larger than the operations database. Data in the reporting database is stored in three states: raw data, hourly summary, and daily summary. The raw data is stored for only 14 days, whereas both daily and hourly data are stored for 400 days. This automatic summarization of data allows reports that span days or months to be generated very quickly.
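The summarization and retention behavior described above can be pictured with a small Python sketch. It is a conceptual model only, under the assumption that each summary row keeps a minimum, maximum, and average; the function names and the sample data are hypothetical, and the real grooming is performed by OpsMgr's own maintenance tasks. Only the 14-day and 400-day figures come from the text.

```python
from collections import defaultdict
from datetime import datetime, timedelta

RAW_RETENTION = timedelta(days=14)       # raw data, per the text
SUMMARY_RETENTION = timedelta(days=400)  # hourly and daily data, per the text


def summarize(samples, bucket):
    """Group (timestamp, value) samples into 'hour' or 'day' buckets and
    return {bucket_start: (minimum, maximum, average)}."""
    groups = defaultdict(list)
    for ts, value in samples:
        if bucket == "hour":
            key = ts.replace(minute=0, second=0, microsecond=0)
        else:
            key = ts.replace(hour=0, minute=0, second=0, microsecond=0)
        groups[key].append(value)
    return {k: (min(v), max(v), sum(v) / len(v)) for k, v in groups.items()}


def groom(rows, now, retention):
    """Keep only rows whose timestamp is within the retention window."""
    return {ts: row for ts, row in rows.items() if now - ts <= retention}


now = datetime(2009, 1, 31, 12, 0)
samples = [
    (datetime(2009, 1, 31, 10, 5), 10.0),
    (datetime(2009, 1, 31, 10, 35), 30.0),
    (datetime(2009, 1, 31, 11, 5), 50.0),
    (datetime(2009, 1, 1, 9, 0), 70.0),   # older than 14 days
]
hourly = summarize(samples, "hour")
daily = summarize(samples, "day")
raw_kept = groom(dict(samples), now, RAW_RETENTION)
daily_kept = groom(daily, now, SUMMARY_RETENTION)
print(len(raw_kept), len(daily_kept))
```

A report spanning months reads a few hundred daily rows instead of every raw sample, which is why the summarized data can be queried quickly.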

Determining the Role of Agents in System Monitoring

The agents are the monitoring components installed on each managed computer. They monitor the system based on the rules and business logic defined in each of the management packs. Management packs are dynamically applied to agents based on the various discovery rules included with each management pack.

Defining Management Groups

OpsMgr utilizes the concept of management groups to logically separate geographical and organizational boundaries. Management groups allow you to scale the size of OpsMgr architecture or politically organize the administration of OpsMgr.

At a minimum, each management group consists of the following components:

Image   An operations database

Image   An optional reporting database

Image   A Root Management Server

Image   Management agents

OpsMgr can be scaled to meet the needs of different sized organizations. For small organizations, all the OpsMgr components can be installed on one server with a single management group. In large organizations, on the other hand, the distribution of OpsMgr components to separate servers allows the organizations to customize and scale their OpsMgr architecture. Multiple management groups provide load balancing and fault tolerance within the OpsMgr infrastructure. Organizations can set up multiple management servers at strategic locations, to distribute the workload among them.

Note

The general rule of thumb with management groups is to start with a single management group and add more management groups only if they are absolutely necessary. Administrative overhead is reduced, and there is less need to re-create rules and perform other redundant tasks with fewer management groups.

Understanding How to Use OpsMgr

Using OpsMgr is relatively straightforward. The OpsMgr monitoring environment can be accessed through three interfaces: the Operations Console, the Web console, and the command shell. The Operations Console provides full monitoring of agent systems and administration of the OpsMgr environment, whereas the Web console provides access only to the monitoring functionality. The command shell provides command-line access to administer the OpsMgr environment.

Managing and Monitoring with OpsMgr

As mentioned in the preceding section, two methods are provided to configure and view OpsMgr settings. The first approach is through the Operations Console and the second is through the command shell.

In the Administration section of the Operations Console, you can easily configure the security roles, notifications, and configuration settings. In the Monitoring section of the Operations Console, you can easily monitor a quick “up/down” status, active and closed alerts, and overall environment health.

In addition, a web-based monitoring console can be run on any system that supports Microsoft Internet Explorer 6.0 or higher. This console can be used to view the health of systems, view and respond to alerts, view events, view performance graphs, run tasks, and manage maintenance mode of monitored objects.

Reporting from OpsMgr

OpsMgr management packs commonly include a variety of preconfigured reports to show information about the operating system or the specific application, such as SQL Server, they were designed to work with. The reports provide an effective view of systems and services on the network over a custom period, such as weekly, monthly, or quarterly. They can also help you monitor your networks based on performance data, which can include critical pattern analysis, trend analysis, capacity planning, and security auditing. Reports also provide availability statistics for distributed applications, servers, and specific components within a server.

The reports can be run on demand or at scheduled times. OpsMgr can also generate HTML-based reports that can be published to a web server and viewed from any web browser. Vendors can also create additional reports as part of their management packs.

Using Performance Monitoring

Another key feature of OpsMgr is the capability to monitor and track server performance. OpsMgr can be configured to monitor key performance thresholds through rules that are set to collect predefined performance data, such as memory and CPU usage over time. Rules can be configured to trigger alerts and actions when specified performance thresholds have been met or exceeded, allowing network administrators to act on potential performance issues. Performance data can be viewed from the OpsMgr Operations Console.

In addition, performance monitors can establish baselines for the environment and then alert the administrator when the counter subsequently falls outside the defined baseline envelope. Performance Monitoring with Operations Manager works in conjunction with SQL Server 2008’s Performance Studio. In essence, SQL Server Performance Studio can write performance data to the local Windows Server 2008 application and security logs. Operations Manager can then comb these logs and centralize data into a central warehouse for further analysis and reporting.
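The notion of a baseline envelope can be illustrated with a short Python sketch. It assumes, purely for illustration, that the envelope is the mean of past samples plus or minus a multiple of their standard deviation; OpsMgr's self-tuning thresholds use their own algorithm, and the function names and sample values here are hypothetical.

```python
from statistics import mean, stdev


def baseline_envelope(history, width=2.0):
    """Return (low, high) as mean +/- width * sample standard deviation."""
    center = mean(history)
    spread = stdev(history)
    return center - width * spread, center + width * spread


def outside_envelope(value, envelope):
    """True when a new sample falls outside the learned envelope."""
    low, high = envelope
    return value < low or value > high


# Hypothetical CPU utilization samples collected while learning the baseline.
history = [40.0, 42.0, 38.0, 41.0, 39.0]
envelope = baseline_envelope(history)
normal_sample = outside_envelope(41.5, envelope)
spike_sample = outside_envelope(75.0, envelope)
print(normal_sample, spike_sample)
```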

Using Active Directory Integration

Active Directory integration provides a way to install management agents on systems without environment-specific settings. When the agent starts, it retrieves the correct environmental settings, such as the primary and failover management servers, which are stored in Active Directory. The configuration of Active Directory integration provides advanced search and filter capabilities to fine-tune the dynamic assignment of systems.

Integrating OpsMgr with Non-Windows Devices

Network management is not a new concept. Simple management of various network nodes has been handled for quite some time through the use of the SNMP. Quite often, simple or even complex systems that utilize SNMP to provide for system monitoring are in place in an organization to provide for varying degrees of system management on a network.

OpsMgr can be configured to integrate with non-Windows systems through monitoring of syslog information, log file data, and SNMP traps. OpsMgr can also monitor TCP port communication and website transaction sequencing for information-specific data management.

Special connectors can be created to provide bidirectional information flows to other management products. OpsMgr can monitor SNMP traps from SNMP-supported devices as well as generate SNMP traps to be delivered to third-party network management infrastructures.

Integrating OpsMgr with Legacy Management Software

OpsMgr can also be configured to integrate with the existing SNMP-based network systems and management infrastructures already in place in an organization. Special connectors can be created to provide bidirectional information flows to other management products, and OpsMgr can both monitor SNMP traps from SNMP-supported devices and generate SNMP traps to be delivered to third-party network management infrastructures. In addition, OpsMgr can monitor live events on Unix systems using the syslog protocol.

Recently, the OpsMgr team released new extensions for cross-platform monitoring and management. Systems that can be monitored include HP-UX, Sun Solaris, Red Hat Enterprise Linux, and Novell SUSE Linux Enterprise. Currently, this technology is still in beta.

Exploring Third-Party Management Packs

Software and hardware developers can subsequently create their own management packs to extend OpsMgr’s management capabilities. These management packs extend OpsMgr’s management capabilities beyond Microsoft-specific applications. Each management pack is designed to contain a set of rules and product knowledge required to support its respective products. Currently, management packs have been developed for APC, Cisco, Citrix, Dell, F5, HP, IBM, Linux, Oracle, Solaris, UNIX, and VMware, to name a few. A complete list of management packs can be found at the following Microsoft site:

http://www.microsoft.com/technet/prodtechnol/mom/catalog/catalog.aspx

Understanding OpsMgr Component Requirements

Each OpsMgr component has specific design requirements, and a good knowledge of these factors is required before beginning the design of OpsMgr. Hardware and software requirements must be taken into account, as well as factors involving specific OpsMgr components, such as the Root Management Server, gateway servers, service accounts, mutual authentication, and backup requirements.

Exploring Hardware Requirements

Having the proper hardware for OpsMgr to operate on is a critical component of OpsMgr functionality, reliability, and overall performance. Nothing is worse than overloading a brand-new server only a few short months after its implementation. The industry standard generally holds that any production servers deployed should remain relevant for three to four years following deployment. Stretching beyond this time frame might be possible, but the ugly truth is that hardware investments are typically short term and need to be replaced often to ensure relevance. Buying a less expensive server might save money in the short term but could potentially increase costs associated with downtime, troubleshooting, and administration. That said, the following are the Microsoft-recommended minimum requirements for any server running an OpsMgr 2007 server component:

Image   1.8GHz+ Pentium or compatible processor

Image   20GB of free disk space

Image   2GB of random access memory (RAM)

These recommendations apply only to the smallest OpsMgr deployments and should be seen as minimum levels for OpsMgr hardware. Future expansion and relevance of hardware should be taken into account when sizing servers for OpsMgr deployment.

Determining Software Requirements

OpsMgr components can be installed on either 32-bit or 64-bit versions of Windows Server 2008, Windows Server 2003 R2, or Windows Server 2003 SP1. The database for OpsMgr must be run on a Microsoft SQL Server 2005 (Standard or Enterprise SP1 or above) server. The database can be installed on the same server as OpsMgr or on a separate server, a concept that is discussed in more detail in following sections.

OpsMgr itself must be installed on a member server in a Windows Active Directory domain. It is commonly recommended to keep the installation of OpsMgr on a separate server or set of dedicated member servers that do not run any other applications that could interfere in the monitoring and alerting process.

A few other factors critical to the success of an OpsMgr implementation are as follows:

Image   DNS must be installed to utilize required mutual authentication between domain members and management servers.

Image   Microsoft .NET Framework 2.0 and 3.0 must be installed on the management server and the reporting server.

Image   Client certificates must be installed in environments to facilitate mutual authentication between nondomain members and management servers.

Image   SQL Reporting Services must be installed for an organization to be able to view and produce custom reports using OpsMgr’s reporting feature.

OpsMgr Backup Considerations

The most critical piece of OpsMgr, the SQL databases, should be backed up regularly using standard backup software that can effectively perform online backups of SQL databases. If integrating such specialized backup utilities into an OpsMgr deployment is not possible, it becomes necessary to leverage the built-in backup functionality found in SQL Server, such as the SQL Server backup utility included in SQL Server Management Studio.

Deploying OpsMgr Agents

OpsMgr agents are deployed to all managed servers through OpsMgr's built-in deployment functionality, or by using software distribution mechanisms such as Active Directory GPOs or System Center Configuration Manager 2007. Installation through the Operations Console uses the fully qualified domain name (FQDN) of the computer. When searching for systems through the Operations Console, you can use wildcards to locate a broad range of computers for agent installation. Certain situations, such as monitoring across firewalls, can require the manual installation of these components.

Understanding Advanced OpsMgr Concepts

OpsMgr’s simple installation and relative ease of use often belie the potential complexity of its underlying components. This complexity can be managed with the right amount of knowledge of some of the advanced concepts of OpsMgr design and implementation.

Understanding OpsMgr Deployment Scenarios

As previously mentioned, OpsMgr components can be divided across multiple servers to distribute load and ensure balanced functionality. This separation allows OpsMgr servers to come in four potential “flavors,” depending on the OpsMgr components held by those servers. The four OpsMgr server types are as follows:

Image   Operations database server— An operations database server is simply a member server with SQL Server 2005 or above installed for the OpsMgr operations database. No other OpsMgr components are installed on this server. The SQL Server component can be installed with default options and with the system account used for authentication. Data in this database is kept for 7 days by default.

Image   Reporting database server— A reporting database server is simply a member server with SQL Server 2005 or above and SQL Server Reporting Services installed. This database stores data collected through the monitoring rules for a much longer period than the operations database and is used for reporting and trend analysis. This database requires significantly more drive space than the operations database server. Data in this database is kept for 400 days (roughly 13 months) by default.

Image   Management server— A management server is the communication point for both management consoles and agents. Effectively, a management server does not have a database and is often used in large OpsMgr implementations that have a dedicated database server. Often, in these configurations, multiple management servers are used in a single management group to provide for scalability and to address multiple managed nodes.

Image   All-in-one server— An all-in-one server is effectively an OpsMgr server that holds all OpsMgr roles, including that of the databases. Subsequently, single-server OpsMgr configurations use one server for all OpsMgr operations.

Multiple Management Groups

As previously defined, an OpsMgr management group is a logical grouping of monitored servers that are managed by a single OpsMgr SQL database, one or more management servers, and a unique management group name. Each management group established operates completely separately from other management groups, although they can be configured in a hierarchical structure with a top-level management group able to see “connected” lower-level management groups.

The concept of connected management groups allows OpsMgr to scale beyond artificial boundaries and also gives a great deal of flexibility when combining OpsMgr environments. However, certain caveats must be taken into account. Because each management group is an island in itself, each must subsequently be manually configured with individual settings. In environments with a large number of customized rules, for example, such manual configuration would create a great deal of redundant work in the creation, administration, and troubleshooting of multiple management groups.

Deploying Geographic-Based Management Groups

Based on the factors outlined in the preceding section, it is preferable to deploy OpsMgr in a single management group. However, in some situations an organization needs to divide its OpsMgr environment into multiple management groups. The most common reason for division of OpsMgr management groups is division along geographic lines. In situations in which wide area network (WAN) links are saturated or unreliable, it might be wise to separate large “islands” of WAN connectivity into separate management groups.

Simply being separated across slow WAN links is not enough reason to warrant a separate management group, however. For example, small sites with few servers would not warrant the creation of a separate OpsMgr management group, with the associated hardware, software, and administrative costs. However, if many servers exist in a distributed, generally well-connected geographical area, that might be a case for the creation of a management group. For example, an organization could be divided into several sites across the United States but decide to divide the OpsMgr environment into separate management groups for East Coast and West Coast, to roughly approximate their WAN infrastructure.

Smaller sites that are not well connected but are not large enough to warrant their own management group should have their event monitoring throttled to avoid being sent across the WAN during peak usage times. The downside to this approach, however, is that the reaction time to critical event response is increased.

Deploying Political or Security-Based Management Groups

The less common method of dividing OpsMgr management groups is by political or security lines. For example, it might become necessary to separate financial servers into a separate management group to maintain the security of the finance environment and allow for a separate set of administrators.

Politically, if administration is not centralized within an organization, management groups can be established to separate OpsMgr management into separate spheres of control. This would keep each OpsMgr management zone under separate security models.

As previously mentioned, a single management group is the most efficient OpsMgr environment and provides for the least amount of redundant setup, administration, and troubleshooting work. Consequently, artificial OpsMgr division along political or security lines should be avoided, if possible.

Sizing the OpsMgr Database

Depending on several factors, such as the type of data collected, the length of time that collected data will be kept, or the amount of database grooming that is scheduled, the size of the OpsMgr database will grow or shrink accordingly.

It is important to monitor the size of the database to ensure that it does not increase well beyond the bounds of acceptable size. OpsMgr can be configured to monitor itself, supplying advance notice of database problems and capacity thresholds. This type of strategy is highly recommended because OpsMgr could easily collect event information faster than it could get rid of it.

The size of the operations database can be estimated through the following formula:

(Number of agents x 5MB x retention days) + 1024MB overhead = estimated database size

For example, an OpsMgr environment monitoring 1,000 servers with the default 7-day retention period will have an estimated 35GB operations database.

(1000 * 5 * 7) + 1024 = 36024 MB

The size of the reporting database can be estimated through the following formula:

(Number of agents x 3MB x retention days) + 1024MB overhead = estimated database size

The same environment monitoring 1,000 servers with the default 400-day retention period will have an estimated 1.1TB reporting database.

(1000 * 3 * 400) + 1024 = 1201024 MB
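The two sizing formulas above translate directly into a few lines of Python, which can be handy for trying out different agent counts and retention periods. The function and variable names are illustrative; the per-agent figures (5MB and 3MB per day) and the 1024MB overhead are the ones given in the formulas.

```python
OVERHEAD_MB = 1024  # fixed overhead used in both formulas


def estimate_db_size_mb(agents, mb_per_agent_per_day, retention_days):
    """Estimated database size in MB, following the chapter's formula."""
    return agents * mb_per_agent_per_day * retention_days + OVERHEAD_MB


# The 1,000-server example from the text.
ops_mb = estimate_db_size_mb(1000, 5, 7)          # operations database
reporting_mb = estimate_db_size_mb(1000, 3, 400)  # reporting database

print(f"Operations database: {ops_mb} MB (about {ops_mb / 1024:.1f} GB)")
print(f"Reporting database: {reporting_mb} MB "
      f"(about {reporting_mb / 1024 / 1024:.2f} TB)")
```

These figures are planning estimates only; actual growth depends on the management packs loaded and the volume of data each agent collects.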

Defining Capacity Limits

As with any system, OpsMgr includes some hard limits that should be taken into account before deployment begins. Surpassing these limits could be cause for the creation of new management groups and should subsequently be included in a design plan. These limits are as follows:

Image   Operations database— OpsMgr operates through a principle of centralized, rather than distributed, collection of data. All event logs, performance counters, and alerts are sent to a single centralized database, and subsequently there can only be a single operations database per management group. The use of a backup and high-availability strategy for the OpsMgr database is, therefore, highly recommended, to protect it from outage. It is recommended to keep this database under 50GB to improve efficiency and reduce alert latency.

Image   Management servers— OpsMgr does not have a hard-coded limit of management servers per management group. However, it is recommended to keep the environment between three and five management servers. Each management server can support approximately 2,000 managed agents.

Image   Gateway servers— OpsMgr does not have a hard-coded limit of gateway servers per management group. However, it is recommended to deploy a gateway server for every 200 nontrusted domain members.

Image   Agents— Each management server can theoretically support up to 2,000 monitored agents. In most configurations, however, it is wise to limit the number of agents per management server, although the levels can be scaled upward with more robust hardware, if necessary.

Image   Administrative consoles— OpsMgr does not limit the number of instances of the Web and Operations Consoles; however, going beyond the suggested limit might introduce performance and scalability problems.
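
The guidelines above lend themselves to a rough planning calculation. The following Python sketch is an illustration only: the function name, the returned dictionary, and the warning messages are assumptions made for this example, and the thresholds are the planning figures quoted in this section rather than limits enforced by OpsMgr.

```python
import math

# Planning figures quoted in the text; they are guidelines, not hard limits.
AGENTS_PER_MANAGEMENT_SERVER = 2000
NONTRUSTED_MEMBERS_PER_GATEWAY = 200
MAX_RECOMMENDED_MANAGEMENT_SERVERS = 5
OPERATIONS_DB_RECOMMENDED_MAX_GB = 50


def plan_management_group(agents, nontrusted_members, estimated_ops_db_gb):
    """Return rough server counts and design warnings for one management group."""
    # Minimum count needed for agent capacity; redundancy may call for more.
    management_servers = max(1, math.ceil(agents / AGENTS_PER_MANAGEMENT_SERVER))
    gateway_servers = math.ceil(nontrusted_members / NONTRUSTED_MEMBERS_PER_GATEWAY)

    warnings = []
    if management_servers > MAX_RECOMMENDED_MANAGEMENT_SERVERS:
        warnings.append("More than five management servers are needed; "
                        "consider an additional management group.")
    if estimated_ops_db_gb > OPERATIONS_DB_RECOMMENDED_MAX_GB:
        warnings.append("Estimated operations database exceeds 50GB; "
                        "shorten retention or split the management group.")

    return {
        "management_servers": management_servers,
        "gateway_servers": gateway_servers,
        "warnings": warnings,
    }


if __name__ == "__main__":
    print(plan_management_group(agents=1000, nontrusted_members=450,
                                estimated_ops_db_gb=35.2))
```

The management server count returned here reflects agent capacity only. An environment that needs failover between management servers would add servers beyond this minimum.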

Defining System Redundancy

In addition to the scalability built in to OpsMgr, redundancy is built in to the components of the environment. Knowing how to place OpsMgr components correctly is essential to deploying that redundancy effectively.

Having multiple management servers deployed across a management group allows an environment to achieve a certain level of redundancy. If a single management server experiences downtime, another management server within the management group takes over the responsibilities for the monitored servers in the environment. For this reason, it is wise to include multiple management servers in an environment if high uptime is a priority.

The first management server in the management group is called the Root Management Server (RMS). Only one Root Management Server can exist in a management group, and it hosts the software development kit (SDK) and Configuration services. All OpsMgr consoles communicate with the Root Management Server, so its availability is critical. In large-scale environments, the Root Management Server should leverage Microsoft Clustering technology to provide high availability for this component.

Because there can be only a single OpsMgr database per management group, the database is a single point of failure and should be protected from downtime. Utilizing Windows Server 2008 clustering or third-party fault-tolerance solutions for SQL databases helps to mitigate the risk involved with the OpsMgr database.

Securing OpsMgr

Security has evolved into a primary concern that can no longer be taken for granted. The inherent security in Windows Server 2008 is only as good as the services that have access to it; therefore, it is wise to perform a security audit of all systems that access information from servers. This concept holds true for management systems as well because they collect sensitive information from every server in an enterprise, including potentially sensitive event logs that could be used to compromise a system. Consequently, securing the OpsMgr infrastructure should not be taken lightly.

Securing OpsMgr Agents

Each server that contains an OpsMgr agent and forwards events to management servers has specific security requirements. Server-level security should be established and should include provisions for OpsMgr data collection. All traffic between OpsMgr components, such as the agents, management servers, and database, is encrypted automatically for security, so the traffic is inherently secured.

In addition, environments with high security requirements should investigate the use of encryption technologies such as IPSec to further protect the event data sent between agents and OpsMgr servers against eavesdropping.

OpsMgr uses mutual authentication between agents and management servers. This means that the agent must reside in the same forest as the management server. If the agent is located in a different forest or workgroup, client certificates can be used to establish mutual authentication. If an entire nontrusted domain must be monitored, the gateway server can be installed in the nontrusted domain, agents can establish mutual authentication to the gateway server, and certificates on the gateway and management server are used to establish mutual authentication. In this scenario, you can avoid needing to place a certificate on each nontrusted domain member.
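
The three cases described above amount to a simple decision rule. The following Python sketch restates that rule for planning purposes; the enumeration and function names are hypothetical and exist only for this illustration, and the returned strings summarize the text rather than any OpsMgr setting.

```python
from enum import Enum


class AgentLocation(Enum):
    """Where a monitored server sits relative to the management server."""
    SAME_FOREST = "same forest as the management server"
    OTHER_FOREST_OR_WORKGROUP = "different forest or workgroup"
    ENTIRE_NONTRUSTED_DOMAIN = "member of a nontrusted domain monitored as a whole"


def authentication_approach(location):
    """Return the mutual authentication approach described in the text."""
    if location is AgentLocation.SAME_FOREST:
        return "Built-in mutual authentication; no certificates required."
    if location is AgentLocation.OTHER_FOREST_OR_WORKGROUP:
        return "Install a client certificate on the agent."
    if location is AgentLocation.ENTIRE_NONTRUSTED_DOMAIN:
        return ("Install a gateway server in the nontrusted domain; place "
                "certificates only on the gateway and the management server.")
    raise ValueError(f"Unknown location: {location!r}")


if __name__ == "__main__":
    for location in AgentLocation:
        print(f"{location.value}: {authentication_approach(location)}")
```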

Understanding Firewall Requirements

OpsMgr servers that are deployed across a firewall have special considerations that must be taken into account. Port 5723, the default port for OpsMgr communications, must specifically be opened on a firewall to allow OpsMgr to communicate across it. The following describes communication for other OpsMgr components:

•   Operations Console to RMS: TCP 5724

•   Operations Console to Reporting Server: TCP 80

•   Web console to Web console server: TCP 51908, 445

•   Agent to Root Management Server: TCP 5723

•   ACS forwarder to ACS collector: TCP 51909

•   Agentless management: Remote Procedure Call (RPC)

•   Management server to databases: OLEDB TCP 1433
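
After the firewall rules are in place, a quick reachability test from the source system can confirm that the listed TCP ports are open. The following Python sketch is one way to do this using only the standard library; the dictionary and function names are assumptions made for this example, and the host name in the usage section is a placeholder. Agentless management is omitted because RPC does not use a single fixed port.

```python
import socket

# TCP ports for each communication path, as listed in the text.
OPSMGR_TCP_PORTS = {
    "Operations Console to RMS": [5724],
    "Operations Console to Reporting Server": [80],
    "Web console to Web console server": [51908, 445],
    "Agent to Root Management Server": [5723],
    "ACS forwarder to ACS collector": [51909],
    "Management server to databases": [1433],
}


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_path(host, path_name):
    """Test every port for one communication path; return {port: reachable}."""
    return {port: tcp_port_open(host, port) for port in OPSMGR_TCP_PORTS[path_name]}


if __name__ == "__main__":
    # "rms01.example.com" is a placeholder; substitute a real server name.
    print(check_path("rms01.example.com", "Agent to Root Management Server"))
```

Run the check from the system that initiates the connection (for example, from an agent-managed server when testing port 5723), because firewall rules are typically direction-specific.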

Outlining Service Account Security

In addition to the aforementioned security measures, security of an OpsMgr environment can be strengthened by the addition of multiple service accounts to handle the different OpsMgr components. For example, the Management Server Action account and the SDK/Configuration service account should be configured to use separate credentials, to provide an extra layer of protection in the event that one account is compromised.

•   Management Server Action account: The account responsible for collecting data and running responses from management servers.

•   SDK and Configuration service account: The account that writes data to the operations database; this service is also used for all console communication.

•   Local Administrator account: The account used during the agent push installation process. To install the agent, local administrative rights are required.

•   Agent Action account: The credentials the agent will run as. This account can run under a built-in system account, such as Local System, or a limited domain user account for high-security environments.

•   Data Warehouse Write Action account: The account used by the management server to write data to the reporting data warehouse.

•   Data Warehouse Reader account: The account used to read data from the data warehouse when reports are executed.

•   Run As accounts: The specific accounts used by management packs to facilitate monitoring. These accounts must be manually created and delegated specific rights as defined in the management pack documentation. These accounts are then assigned as Run As accounts used by the management pack to achieve a high degree of security and flexibility when monitoring the environment.
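
When documenting a deployment, it can be useful to confirm that no two OpsMgr roles share the same credentials. The following Python sketch checks a planned role-to-account mapping for reuse. The function name is an assumption, and the account names in the usage section are hypothetical examples, not OpsMgr defaults.

```python
from collections import defaultdict


def find_shared_credentials(assignments):
    """Return {account: [roles]} for every account assigned to more than one role.

    `assignments` maps an OpsMgr role name to the account planned for it.
    Account names are compared case-insensitively.
    """
    roles_by_account = defaultdict(list)
    for role, account in assignments.items():
        roles_by_account[account.lower()].append(role)
    return {account: roles for account, roles in roles_by_account.items()
            if len(roles) > 1}


if __name__ == "__main__":
    # Hypothetical account names, used only for illustration.
    planned = {
        "Management Server Action account": r"COMPANYABC\svc-omaction",
        "SDK and Configuration service account": r"COMPANYABC\svc-omaction",
        "Data Warehouse Write Action account": r"COMPANYABC\svc-omdwwrite",
        "Data Warehouse Reader account": r"COMPANYABC\svc-omdwread",
    }
    for account, roles in find_shared_credentials(planned).items():
        print(f"{account} is shared by: {', '.join(roles)}")
```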

Exploring the SQL Server Management Pack

When imported, the SQL Server management pack automatically discovers the following objects on managed SQL Server systems in the management group:

•   SQL Server 2008 Database Engine

•   SQL Server 2008 Analysis Services

•   SQL Server 2008 Reporting Services

•   SQL Server 2008 Integration Services

•   SQL Server 2008 Distributor

•   SQL Server 2008 Publisher

•   SQL Server 2008 Subscriber

•   SQL Server 2008 DB

•   SQL Server 2008 Agent

•   SQL Server 2008 Agent Jobs

•   SQL Server 2008 DB File Group

•   SQL Server 2008 DB File

•   SQL Server 2008 Transaction Log File

As you can see, OpsMgr finds many of the components associated with a SQL Server and not just the server itself. Availability statistics of each component can be calculated independently or together as a group. For example, an availability report can be scheduled for a single database on a server or the entire server. This type of discovery also allows each component to be placed into maintenance mode independently of other components on the server. For example, a single database can be placed into maintenance mode to prevent alerts from being generated when the database is worked on or repaired while other databases on the server are still being monitored.

In addition to basic monitoring of SQL Server-related events and performance data, the SQL Server management pack provides advanced monitoring through custom scripts associated with rules in the management pack. The following rules are specific to SQL Server monitoring. Each rule can be customized for the environment or even for a specific server being monitored.

•   Block Analysis: When an SPID is blocked for more than one minute, an alert is generated. This detection can be configured through the Blocking SPIDs monitor associated with the SQL 2008 DB Engine object. Alert details include the blocked SPID, blocked by SPID, program name, block duration, login name, database name, and resources.

•   Database Configuration: Monitors SQL Server-specific configurable options such as Auto Close, Auto Create Statistics, Auto Shrink, Auto Update, DB Chaining, and Torn Page Detection. This detection can be configured through the corresponding configuration monitors associated with the SQL 2008 DB object.

•   Database Health: Tracks the availability and current state of databases on SQL Servers in the environment. This detection can be configured through the Database Status monitor associated with the SQL Server 2008 DB object.

•   Database and Disk Space: The free space within databases and transaction logs is monitored. An alert is generated when predefined thresholds are exceeded or a significant change in size is detected. This detection can be configured through the corresponding performance monitors associated with the SQL Server 2008 DB object.

•   Replication Monitoring: The whole SQL Server replication topology is monitored to indicate overall health, and alerts are raised on replication failures.

•   Backups: Backup activity, including both failed and successful backups, is captured and presented.

•   Jobs: Agent jobs that run for more than 60 minutes generate an alert by default. This detection can be configured through the Long Running Jobs performance monitor associated with the SQL 2008 Agent object. Other job-related items, such as failed SQL Server Agent jobs, job corruption, and SQL Server Mail, are also monitored and alerted upon.

•   Security Monitoring: Tracks security and audit events such as license compliance, shutdowns, configuration issues, collection of audit data, denied administration functions, and both successful and failed logons.

•   Service Pack Compliance: The current service pack level can be monitored by configuring the Service Pack Compliance configuration monitor associated with the SQL Server 2008 DB Engine object. An alert is generated when a server is not at the required service pack level.
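
To make the threshold behavior of the Block Analysis and Jobs rules concrete, the following Python sketch applies the two default thresholds quoted above (one minute for blocked SPIDs, 60 minutes for Agent jobs) to sample data. It is only a model of the logic and not the script the management pack runs; the class names, field names, and sample values are all hypothetical.

```python
from dataclasses import dataclass
from datetime import timedelta

# Default thresholds quoted in the text; both are configurable in OpsMgr.
BLOCKING_THRESHOLD = timedelta(minutes=1)
LONG_RUNNING_JOB_THRESHOLD = timedelta(minutes=60)


@dataclass
class BlockedSession:
    blocked_spid: int
    blocked_by_spid: int
    database_name: str
    block_duration: timedelta


@dataclass
class AgentJob:
    name: str
    run_duration: timedelta


def blocking_alerts(sessions, threshold=BLOCKING_THRESHOLD):
    """Return the sessions that have been blocked for longer than the threshold."""
    return [s for s in sessions if s.block_duration > threshold]


def long_running_job_alerts(jobs, threshold=LONG_RUNNING_JOB_THRESHOLD):
    """Return the names of jobs that have run for longer than the threshold."""
    return [j.name for j in jobs if j.run_duration > threshold]


if __name__ == "__main__":
    # Sample values invented for illustration.
    sessions = [
        BlockedSession(61, 55, "Sales", timedelta(seconds=90)),
        BlockedSession(62, 55, "Sales", timedelta(seconds=30)),
    ]
    jobs = [
        AgentJob("Nightly backup", timedelta(minutes=75)),
        AgentJob("Index rebuild", timedelta(minutes=20)),
    ]
    print(blocking_alerts(sessions))
    print(long_running_job_alerts(jobs))
```

Passing a different `threshold` value to either function mirrors what an override on the corresponding monitor would accomplish for a specific server.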

Within the Monitoring area of the Operations Console, the following views are available to assist with monitoring the environment:

•   Alerts view

•   Computers view

•   Database Free Space Performance

•   Transaction Log Free Space Performance

•   Database State

•   Agent Health State

•   Database Engine Health State

•   Analysis Services State

•   Database Engines State

•   Integration Services State

•   Reporting Services State

•   Database Mirroring State

•   Server Resource Utilization

•   SQL Agent Job State

•   SQL Agent State

The SQL Server management pack also includes several default reports to help identify SQL Server-specific trends:

•   SQL Broker Performance

•   SQL Server Database Counters

•   SQL Server Configuration

•   SQL Server Lock Analysis

•   SQL Server Service Pack

•   SQL User Activity

•   Top Five Deadlocked Databases

•   User Connections by Day

•   User Connections by Peak Hours

•   SQL Database Space Report

The latest versions of management packs should always be used because they include many improvements and updates over the release code.

Downloading and Extracting the SQL Server 2008 Management Pack

As previously mentioned, management packs contain intelligence about specific applications and services and include troubleshooting information specific to those services. The SQL Server 2008 Management Pack is required to monitor a SQL Server 2008 infrastructure proactively and effectively.

To install the SQL 2008 Management Pack on an OpsMgr management server, first download it from the Microsoft downloads page at www.microsoft.com/technet/prodtechnol/mom/catalog/catalog.aspx?vs=2007.

After the download is complete, follow these steps to extract the SQL Server 2008 Management Pack files on the OpsMgr management server:

1.   Double-click on the downloaded executable.

2.   Select I Agree on the license agreement page and click Next to continue.

3.   Select a location to which to extract the management pack and then click Next.

4.   Click Next again to start extracting the files.

5.   Click Close when the file extraction is complete.

Importing the SQL Server 2008 Management Pack File into OpsMgr 2007

After extracting the management pack, follow these steps to import the management pack files directly through the OpsMgr console:

1.   From the OpsMgr Console, navigate to the Administration node.

2.   Click the Import Management Packs link.

3.   From the Select Management Packs to Import dialog box, browse to the location where the files were extracted and select all of them. Click Open.

4.   From the Import Management Packs dialog box, shown in Figure 18.2, click Import.

FIGURE 18.2 Beginning the SQL Management Pack import process.

image

5.   Click Close when finished.

Note

When managing a Windows infrastructure, it is a best practice to download more than just the SQL Server Management Pack. Other management packs that should be downloaded and installed include the Windows Server 2003/2008 Base Operating System Management Pack and the Windows Server 2003/2008 Active Directory Management Pack.

Installing the OpsMgr Agent on the SQL Server

Installation of OpsMgr agents on SQL Server can be automated from the OpsMgr console. To initiate the process of installing agents, follow these steps:

1.   From the OpsMgr 2007 Console, click the Monitoring node.

2.   Click the Required: Configure Computers and Devices to Manage link.

3.   From the Computer and Device Management Wizard, shown in Figure 18.3, select Next to start the process of deploying agents.

FIGURE 18.3 Deploying agents to SQL Servers.

image

4.   In the Auto or Advanced dialog box, select Automatic Computer Discovery or experiment by doing a selective search. Note that Automatic Computer Discovery can take a while and have a network impact. Click Next to continue.

5.   Enter a service account to perform the search; it must have local administrator rights on the servers where the agents will be installed. Alternatively, you can choose to use the Action account. Click Discover to continue.

6.   After discovery, a list of discovered servers is displayed, as shown in Figure 18.4. Check the boxes next to the servers where the agents will be installed and click Next.

FIGURE 18.4 Selecting servers to deploy the agents to.

image

7.   On the summary page, leave the defaults and click Finish.

8.   Click Close when complete.

After completing the installation, you might need to wait a few minutes before the information from the agents is sent to the console.

Monitoring SQL Functionality and Performance with OpsMgr

After the management pack is installed for SQL Server and the agent has been installed and is communicating, OpsMgr consolidates and reacts to every event and performance counter sent to it from the SQL Server. This information is reflected in the OpsMgr operations console, as shown in Figure 18.5.

FIGURE 18.5 Monitoring SQL functionality in the OpsMgr 2007 console.

image

For more information on OpsMgr 2007, see the Microsoft website at www.microsoft.com/opsmgr.

Summary

The built-in monitoring tools provide a limited amount of proactive monitoring by allowing you to configure events as necessary to alert operators. Built-in monitoring tools also provide a historical analysis through logs, greatly assisting the troubleshooting process.

System Center Operations Manager 2007 is an ideal monitoring and management platform for a SQL Server farm and has proven its value in proactively identifying potential server issues before they degrade into server downtime. OpsMgr for SQL Server provides the built-in reliability of the OS and allows for greater control over a large, distributed server environment. In addition, proper understanding of OpsMgr components, their logical design and configuration, and other OpsMgr placement issues can help an organization to fully realize the advantages that OpsMgr can bring to a SQL Server 2008 environment.

Best Practices

The following are best practices from this chapter:

•   Examine the use of System Center Operations Manager 2007 for monitoring SQL Servers.

•   Install the updated SQL 2008 Management Pack into the OpsMgr management group.

•   Take future expansion and relevance of hardware into account when sizing servers for OpsMgr deployment.

•   Keep the installation of OpsMgr on a dedicated server or set of dedicated member servers that do not run any other applications.

•   Use SQL Server Reporting Services to produce custom reports using OpsMgr’s reporting feature.

•   Start with a single management group and add additional management groups only if they are absolutely necessary.

•   Use a dedicated service account for OpsMgr.

•   Monitor the size of the OpsMgr database to ensure that it does not increase beyond the bounds of acceptable size.

•   Archive collected data.

•   Modify the grooming interval to aggressively address space limitations and keep the database consistent.

•   Configure OpsMgr to monitor itself.

•   Satisfy regulatory compliance by leveraging OpsMgr’s Audit Collection Services (ACS) for centralizing and auditing SQL Server events.
