Speaking of putting your company at risk, if you haven't considered a major disaster or defined a disaster recovery plan, you are doing just this. Disasters come in many flavors. Perhaps it is just a hardware failure of a single server, a power outage, a fire, a tornado, an earthquake, or a terrorist act. Very often a disaster recovery (DR) plan will be devised, but never tested. The mandate we would put on you would be to devise a highly efficient disaster recovery plan in support of your high availability needs and then test it completely! Make sure it considers all aspects of completely being able to come up in an alternate location as smoothly, quickly, and with as little data loss as possible. Defining a disaster recovery plan can be a tedious job because of the potential complexities of your high availability configuration. But, keep in mind, a disaster recovery plan is part of your high availability configuration (or at least it should be). It should also be part of your decision process of picking the high availability solution in the first place.
In general, there will be a handful of things that need to be put together (defined and executed upon). These are
1. | Create a disaster recovery execution tasks/run book. This will include all steps to take to recover from a disaster and cover all system components that need to be recovered. |
2. | Arrange for or procure a server/site to recover to. This should be a configuration that can house what is needed to get you back online. |
3. | Guarantee that a complete database backup/recovery mechanism is in place (including offsite/alternate site archive and retrieval of databases) |
4. | Guarantee that an application backup/recovery mechanism is in place (like COM+ apps, web services, other application components, and so on). |
5. | |
6. | Make sure you can completely configure and open up network/communication lines. This would also include routers configured properly, IP addresses made available, and so on. |
7. | Train your support personnel on all elements of recovery. You can never know enough ways to recovery a system. And, it seems that a system never recovers the same way twice. |
8. | Plan and execute an annual or bi-annual disaster recovery simulation. The one or two days that you do this will pay for itself a hundred times over if a disaster actually occurs. And, remember, disasters come in many flavors. |
Many organizations have gone to the concept of having “hot” alternate sites available via stretch clustering or log shipping techniques. Costs can be high for some of these advanced and highly redundant solutions.
On the more practical side (dollar-wise), if you create some very solid, time-tested mechanisms for re-creating your SQL Server environment, they will serve you well when you need them most. The things to focus on for disaster recovery are
Always generate scripts for as much as possible (anything that was created using a wizard, enterprise manager, and so on). These scripts will save your rear end (can I say that?). These should include
Complete replication buildup/breakdown scripts
Complete database creation scripts (DB, tables, indexes, views, and so on)
Complete SQL login, database user IDs, and password scripts (including roles and other grants)
Linked/remote server setup (linked servers, remote logins)
Log shipping setup (source, target, and monitor servers)
Any custom SQL Agent tasks
Backup/restore scripts
Potentially other scripts, depending on what you have built on SQL Server
Make sure you document all aspects of SQL database maintenance plans that are being used. This includes frequencies, alerts, email addresses being notified when errors occur, backup file/device locations, and so on.
Document all hardware/software configurations used:
Leverage the sqldiag.exe for this (described in the next section)
Record what accounts were used to start up the SQL Agent service for an instance and MS distributed transaction coordinator (MS DTC) service. This is especially important if using distributed transactions and data replication.
My favorite set of SQL Server information that I script and record for a SQL Server instance are
select @@SERVERNAME— Will provide the full network name of the SQL Server and instance.
select @@SERVICENAME— Will provide the registry key under which Microsoft SQL Server is running
select @@VERSION— Provides the date, version, and processor type for the current installation of Microsoft SQL Server
exec sp_helpserver— Provides the server name, the server's network name, the server's replication status, the server's identification number, collation name, and time-out values for connecting to, or queries against, linked servers
exec sp_helplogins— Provides information about logins and the associated users in each database
exec sp_linkedservers— Returns the list of linked servers defined in the local server
exec sp_helplinkedsrvlogin— Provides information about login mappings defined against a specific linked server used for distributed queries and remote stored procedures
exec sp_helpremotelogin— Provides information about remote logins for a particular remote server, or for all remote servers, defined on the local server
exec sp_server_info— Returns a list of attribute names and matching values for Microsoft SQL Server
exec sp_helpdb dbnamexyz— Provides information about a specified database or all databases. This includes the database allocation names, sizes, and locations.
exec sp_spaceused— Provides the actual database usage information of both data and indexes in a database.
Get the current SQL Server configuration values by running sp_configure (from Query Analyzer). Here are the results of running sp_configure with show advanced option specified:
USE master EXEC sp_configure 'show advanced option', '1' RECONFIGURE go EXEC sp_configure Go name minimum maximum config_value run_value ------------------------- ----------- ----------- ------------ ----------- affinity mask -2147483648 2147483647 0 0 allow updates 0 1 0 0 awe enabled 0 1 0 0 c2 audit mode 0 1 0 0 cost threshold for parallelism 0 32767 5 5 Cross DB Ownership Chaining 0 1 0 0 cursor threshold -1 2147483647 -1 -1 default full-text language 0 2147483647 1033 1033 default language 0 9999 0 0 fill factor (%) 0 100 0 0 index create memory (KB) 704 2147483647 0 0 lightweight pooling 0 1 0 0 locks 5000 2147483647 0 0 max degree of parallelism 0 32 0 0 max server memory (MB) 4 2147483647 2147483647 2147483647 max text repl size (B) 0 2147483647 65536 65536 max worker threads 32 32767 255 255 media retention 0 365 0 0 min memory per query (KB) 512 2147483647 1024 1024 min server memory (MB) 0 2147483647 0 0 nested triggers 0 1 1 1 network packet size (B) 512 65536 4096 4096 open objects 0 2147483647 0 0 priority boost 0 1 0 0 query governor cost limit 0 2147483647 0 0 query wait (s) -1 2147483647 -1 -1 recovery interval (min) 0 32767 0 0 remote access 0 1 1 1 remote login timeout (s) 0 2147483647 20 20 remote proc trans 0 1 0 0 remote query timeout (s) 0 2147483647 600 600 scan for startup procs 0 1 1 1 set working set size 0 1 0 0 show advanced options 0 1 1 1 two digit year cutoff 1753 9999 2049 2049 user connections 0 32767 0 0 user options 0 32767 0 0
Disk configurations, sizes, and current size availability (use standard OS directory listing commands on all disk volumes that are being used).
Capture the sa login password and the OS Administrator password so that anything can be accessed and anything can be installed (or re-installed).
Document all contact information for your vendors:
Microsoft support services contacts (for example, list all support agreements such as Premier Product support services)
Storage vendor contact info
Hardware vendor contact info
Offsite storage contact info (to get your archived copy fast)
Network/telecom contact info
Your CTO, CIO, and other senior management contact
CD-ROMs available for everything (SQL Server, service packs, operating system, utilities, and so on)
One good way to get a complete environmental picture is to run the SQLDIAG.exe program provided with SQL Server 2000 on your production box (that you would have to re-create on an alternate site if a disaster occurs). It can be found in the “Binn” directory where all SQL Server executables are (C:Program FilesMicrosoft SQL ServerMSSQL$YourSqlInstanceBinn). It will show you how the server is configured, all hardware and software components (and their versions), memory sizes, CPU types, operating system version and build info, paging file info, environment variables, and so on. If you run this on your production server periodically, it will serve as a good environment documentation to supplement your disaster recovery plan. Open a DOS command prompt and change directory to the SQL Server BINN directory. Then, at the command prompt run SQLDIAG.exe.
C:Program FilesMicrosoft SQL ServerMSSQL$SQLInstanceBinn> sqldiag
The results will be written into a text file named SQLDIAG.TXT at the root (c: by default or the SQL Server log directory) and contain a verbose snapshot of everything that relates to SQL Server (in one way or another). The following is an abbreviated sample of what you will see:
[System Summary] Registry Information -------------------- SOFTWAREMicrosoftMSSQLServerClientConnectTo: ----------------------------------------------- DSQUERY: DBNETLIB DBARCH6000CORE650NET: DBMSSOCN,DBARCH6000CORE650NET \DBARCH6000CORE650NET: DBMSSOCN,\DBARCH6000CORE650NET SOFTWAREMicrosoftMSSQLServerClientDB-Lib: -------------------------------------------- AutoAnsiToOem: ON UseIntlSettings: ON SOFTWAREMicrosoftMSSQLServerSetup: ------------------------------------ Version Information ------------------- ntwdblib.dll: 8.00. 194 ssmsad60.dll: N/A ssmsde60.dll: N/A ssmsrp60.dll: N/A ssmsso60.dll: N/A ssmssp60.dll: N/A ssmsvi60.dll: N/A ssnmpn60.dll: N/A dbmsadsn.dll: 08.00.9030 dbmsdecn.dll: N/A dbmsrpcn.dll: 08.00.9030 dbmssocn.dll: 7.00.819 dbmsspxn.dll: 7.00.819 dbmsvinn.dll: 08.00.9030 dbnmpntw.dll: 08.00.9030 sqlsrv32.dll: 3.81.9042 Item Value OS Name Microsoft Windows 2000 Professional Version 5.0.2195 Service Pack 4 Build 2195 OS Manufacturer Microsoft Corporation System Name SQLDB01 System Manufacturer XYZ System Model XYZ System Type X86-based PC Processor x86 Family 6 Model 9 Stepping 5 GenuineIntel ~1594 Mhz BIOS Version Insyde Software MobilePRO BIOS Version 4.00.00 Windows Directory C:WIN2K System Directory C:WIN2Ksystem32 Boot Device DeviceHarddisk0Partition1 Locale United States User Name C81124-AAdministrator Time Zone Pacific Daylight Time Total Physical Memory 1,048,048 KB Available Physical Memory 703,756 KB Total Virtual Memory 3,569,232 KB Available Virtual Memory 2,961,300 KB Page File Space 2,521,184 KB Page File C:pagefile.sys [Environment Variables] [Services] [Startup Programs] so on..
Again, we suggest that this be done on a regular basis and compared with prior executions to guarantee that you know exactly what you have to have in place in case of disaster.
Plan and execute a complete disaster recovery (DR). This is serious business and many companies around the globe set aside a few days a year to do this exact thing. This test should include:
Simulate a disaster.
Record all actions that get taken.
Time all events from start to finish. Sometimes this means there is someone standing around with a stopwatch.
Hold a post-mortem following the DR simulation.
Many companies tie the results of the DR simulation to the IT group's salaries (their raise percentage). This is more than enough motivation for IT to get this drill right and to perform well. And, it is critical to correct any failures or issues that occurred. The next time might not be a simulation.
3.22.77.63