A book on Configuration Manager troubleshooting would not be complete without reference to Disaster Recovery (DR). What do you do when all else fails? Many administrators will never have to recover Configuration Manager in this way. However, they must be prepared for the worst-case scenario. It's a very important aspect of our job.
So what is DR? In simple terms, it is the ability to recover a service from catastrophic failure in the least possible time with minimal data loss. A Disaster Recovery Plan (DRP), sometimes known as a Business Continuity Plan, documents the procedures and policies required to recover services. You (or one of your team) are responsible for the Configuration Manager DRP.
So what has to be done? What does DR mean in relation to Configuration Manager? Infrastructures vary across organizations. Some have large environments with a Central Administration Site and several Primary Sites. Recovery techniques for these organizations may differ from those for organizations with a single Primary Site. In this chapter, we will discuss DR solutions. The chapter is not meant to be a comprehensive walk-through for implementing a DR solution. Rather, it will give you an overview of what is required.
Make no mistake. Recovering from a Configuration Manager failure is a complex process. You must be skilled with the product and the integrated components. The process must be well planned in advance. All the information you need should already be available. The next section describes some of the items you should consider.
As a Configuration Manager administrator, you should document your environment thoroughly. Of course, this isn't just part of a DR process. It's just common sense. In reality, however, environments are not always documented this well.
A typical table for a single Primary Site might look like the following (note that the specifications are examples, not recommendations):
| | SERV01 | SERV02 |
|---|---|---|
| Physical or virtual | Virtual | Virtual |
| Server specification | 16 GB RAM 2vCPU | 4 GB RAM 2vCPU |
| High availability | Yes, at Hypervisor level | No |
| Role(s) | Primary Site Server, Management Point, Database, Software Update Point, Reporting Services Point, Intune Connector | Management Point, Distribution Point, Software Update Point |
| Configuration Manager version (including SPs and CUs) | Configuration Manager 2012 R2 SP1 (5.00.8239.1000) | |
| Site code | P01 | P01 |
| Operating system | Windows Server 2012 R2 (6.3.9600) | Windows Server 2012 R2 (6.3.9600) |
| Drive partitions (examples) | C: 80GB (OS), E: 80GB (Program files), F: 80GB (Database), L: 30GB (Log files), T: 30GB (Temp DB) | C: 80GB (OS), E: 80GB (Program files), F: 200GB (Content Library) |
| Domain | | |
| Configuration Manager installation folder | | |
| SQL Server information | Local SQL Server 2012 SP2 CU4 (11.0.5569.0) | |
You may have spotted a reference to High Availability (HA) in the table. Configuration Manager is not a real-time product and a certain amount of downtime can be tolerated in most cases. However, it is still beneficial to build as much redundancy into the solution as possible to try and eliminate, or at least minimize, the requirement for DR. HA is discussed later in this chapter.
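One way to keep the server documentation table current is to hold it as structured data and generate the table from that data. The following is a minimal sketch in Python, not part of Configuration Manager itself; the `ServerRecord` class, its field names, and the `render_table` function are hypothetical names chosen for illustration, and only a subset of the table rows is modeled.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ServerRecord:
    """One server column of the documentation table (illustrative field names)."""
    name: str
    virtual: bool
    specification: str
    high_availability: str
    roles: List[str] = field(default_factory=list)
    site_code: str = ""
    operating_system: str = ""


def render_table(servers):
    """Render the records as a pipe table with one column per server."""
    rows = [
        ("Physical or virtual", lambda s: "Virtual" if s.virtual else "Physical"),
        ("Server specification", lambda s: s.specification),
        ("High availability", lambda s: s.high_availability),
        ("Role(s)", lambda s: ", ".join(s.roles)),
        ("Site code", lambda s: s.site_code),
        ("Operating system", lambda s: s.operating_system),
    ]
    # Header row with an empty first cell, then the separator row.
    lines = ["| | " + " | ".join(s.name for s in servers) + " |"]
    lines.append("|---" * (len(servers) + 1) + "|")
    for label, getter in rows:
        cells = " | ".join(getter(s) for s in servers)
        lines.append(f"| {label} | {cells} |")
    return "\n".join(lines)


# Example values taken from the table above (examples, not recommendations).
servers = [
    ServerRecord("SERV01", True, "16 GB RAM 2vCPU", "Yes, at Hypervisor level",
                 ["Primary Site Server", "Management Point"], "P01",
                 "Windows Server 2012 R2"),
    ServerRecord("SERV02", True, "4 GB RAM 2vCPU", "No",
                 ["Management Point", "Distribution Point"], "P01",
                 "Windows Server 2012 R2"),
]
print(render_table(servers))
```

Keeping the records in a file under version control would also leave a history of when the environment changed, which may help when the documentation has to be trusted during a recovery.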
A DRP details everything you may need to recover the service after a catastrophic failure. The DRP will include at least the following items:
The DRP should be live and should be updated whenever major changes are made to the Configuration Manager environment.
Note that DR testing is not easy with Configuration Manager. As restored servers generally need to have the same name as the original server, it is not possible to test in production. The only way to test DR properly is to duplicate the production environment as best you can on an isolated network.
When you carry out your DR tests, you should record how long it takes for full recovery. This is an important piece of information to be able to share with management when you are looking for approval of your DRP.
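Recording those timings can be as simple as wrapping each stage of the test in a timer. The sketch below is one possible approach in Python; the `RecoveryLog` class is a hypothetical helper, and the step names are placeholders rather than a prescribed recovery sequence.

```python
import time
from contextlib import contextmanager


class RecoveryLog:
    """Collects the elapsed time of each named DR test step (hypothetical helper)."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so the log can be exercised without waiting.
        self._clock = clock
        self.steps = []  # list of (step name, elapsed seconds)

    @contextmanager
    def step(self, name):
        """Time everything that happens inside the 'with' block."""
        start = self._clock()
        try:
            yield
        finally:
            self.steps.append((name, self._clock() - start))

    def total_seconds(self):
        return sum(elapsed for _, elapsed in self.steps)

    def report(self):
        lines = [f"{name}: {elapsed / 60:.1f} min" for name, elapsed in self.steps]
        lines.append(f"Total recovery time: {self.total_seconds() / 60:.1f} min")
        return "\n".join(lines)


if __name__ == "__main__":
    log = RecoveryLog()
    # Placeholder step names; replace them with the stages in your own DRP.
    with log.step("Restore site database"):
        pass  # perform the step on the isolated test network, then leave the block
    with log.step("Run site recovery"):
        pass
    print(log.report())
```

The resulting report gives a per-step breakdown as well as the total, so it shows where the recovery time is actually spent.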