Distributed Replay
Distributed Replay is a utility supplied with SQL Server that offers you the ability to replay traces on one or more clients and dispatch the events to a target server. This is useful for scenarios in which you are testing the impact of software updates, such as OS-level and SQL Server–level service packs; testing performance tuning; load testing; and consolidation planning.
In older versions of SQL Server, SQL Trace and its GUI, Profiler, were commonly used to both generate traces and replay them. This had limitations, however, such as the overhead to capture the trace and the inability to replay the trace on multiple servers. For this reason, SQL Trace and Profiler have been deprecated for the Database Engine, and it is recommended that you only use Profiler for tracing Analysis Services activity. Instead, the recommendation is that you use Extended Events (discussed in Chapter 19) to capture traces with less overhead and then use Distributed Replay to replay them.
Note For the demonstrations in this chapter, we will be using four servers, named Controller, Client1, Client2, and Target. Each server has a default instance of SQL Server installed, and all servers are part of the PROSQLADMIN domain. The Distributed Replay Controller feature is installed on the controller instance and the Distributed Replay Client is installed on the Client1 and Client2 instances. Although the Distributed Replay Client should be installed on the target instance for application compatibility testing, in our scenarios, we look at performance testing, while simulating concurrent activity, and therefore the Distributed Replay Client is not installed on the target, since it is not recommended for these purposes. The Distributed Replay Administration Tool is installed as part of the Management Tools feature, so this feature has been installed on the controller.
Distributed Replay Concepts
To harness the power of Distributed Replay, it is important that you understand its concepts, such as the controller, the clients, and the target servers. It is also important to understand the architecture of Distributed Replay. These topics are discussed in the following sections.
The Distributed Replay Controller sits at the heart of the Distributed Replay infrastructure and is used to orchestrate the activity on each of the Distributed Replay clients. When the Distributed Replay Controller service starts, it pulls its settings from the DReplayController.config file, so you may need to edit this file to configure logging options. We discuss this later in the chapter. There is only one controller in a Distributed Relay topology.
The Distributed Replay client(s) are the server(s) you use to replay the workloads. When the Distributed Replay Client service starts, it pulls its configuration from the DReplayClient.config file, so you may need to edit this file to configure the logging options and folders where the trace results and intermediate files will be stored. This is discussed later in the chapter. You can configure multiple clients in a Distributed Replay topology with support for a maximum of 16 clients.
The target is the instance in which the trace is replayed. There is always one target in a Distributed Replay topology. Because you can have a ratio of multiple clients to one target, you can use Distributed Replay as a tool for load testing with commodity hardware or to simulate concurrent workloads.
The Distributed Replay Administration Tool is a command-line tool that allows you to prepare and replay traces. The executable is called DReplay.exe; it relies on the DReplay.exe.preprocess.config file to obtain the configuration it needs to process intermediate trace files and the DReplay.exe.replay.config file to obtain the configuration it needs to replay the trace. This means that you may need to edit each of these configuration files to specify settings, such as whether or not system activity should be included and specify query timeout values, which we discussed later in this chapter.
Distributed Replay Architecture
The diagram in Figure 20-1 gives you an overview of the Distributed Replay components and how the replay process works across the preprocess and replay phases. In the preprocess phase, an intermediate file is created in the working directory of the controller. In the Replay phase, dispatch files are created at the clients before the events are dispatched to the target.
Figure 20-1. Distributed Replay architecture
Configuring the Environment
Before we begin to replay traces, we need to configure the Distributed Replay controller, the Distributed Replay clients and the Distributed Replay Administration Tool. These activities are discussed in the following sections.
Configuring the Controller
You can find the DReplayController.config file within the ToolsDReplayController folder, which resides in the 32-bit shared directory of SQL Server. Therefore, if SQL Server is installed with default paths, the fully qualified path is C:Program Files (x86)Microsoft SQL Server120ToolsDReplayControllerDReplayController.config. You can use this configuration file to control the logging level. These are the possible options:
The default contents of the DReplayController.config file are shown in Listing 20-1.
Because the service pulls this logging level from the configuration file at the point when the service starts, if you change the logging level after starting the service, then you need to restart the service. The log itself is created when the service starts, and it include startup information, such as the service account being used. You can find the log in the DReplayControllerLog folder named DReplay Controller Log_<Uniquifier>. A new log is generated each time the service starts.
If you are running the Distributed Replay Client service under a different service account than the Distributed Replay Controller service, then you also need to configure DCOM (Distributed Component Object Model) permissions on the Distributed Replay Controller service. You can do this via Component Services, which can be found in Administrative Tools in Windows.
Once Component Services has been invoked, you need to drill through Console Root | Component Services | Computers | My Computer | DCOM Config and then select Properties from the DReplayController context menu. This causes the Properties dialog box to be invoked. From here, you should navigate to the Security tab, which is shown in Figure 20-2.
Figure 20-2. The Security tab
Now use the Edit button in the Launch And Activation Permissions section to launch the Permissions dialog box. Here, use the Add button to add the service account of the Distributed Replay Client service before granting it the Local Activation and Remote Activation privileges, as illustrated in Figure 20-3.
Figure 20-3. The Permissions dialog box (Launch and Activation permissions)
You now need to repeat this process in the Access Permissions section to grant the service account the Local Access and Remote Access privileges, as shown in Figure 20-4.
Figure 20-4. The Permissions dialog box (Access Permissions)
You also need to ensure that the Distributed Replay Client service account is added to the Distributed COM Users Windows Group on the server running the Distributed Replay Controller. Once you have applied these changes, you need to restart both the Distributed Replay Controller service and the Distributed Replay Client service(s) for the changes to take effect.
Tip In regard to firewall configuration, the controller and the client communicate using port 135 and dynamic ports; therefore, you must ensure that these are open between the servers. Opening the dynamic port range can violate some organization’s security best practice, however. The workaround is to configure Windows Firewall so it allows the Distributed Replay executables to communicate through any port. This poses its own issues, however, because some corporate firewalls are not configured to offer this functionality, meaning that even though Windows Firewall does allow the traffic through, the packets may be dropped at the corporate firewall level. For more information on configuring Windows Firewall, please refer to Chapter 4.
Configuring Clients
You can find the DReplayClient.config file in the ToolsDReplayClient folder, which resides in the 32-bit shared directory of SQL Server. Therefore, if SQL Server is installed with default paths, the fully qualified path would be C:Program Files (x86)Microsoft SQL Server120ToolsDReplayControllerDReplayClient.config. You can use this configuration file to configure the settings detailed in Table 20-1.
Table 20-1. DReplayClient.config Options
Listing 20-2 displays an edited version of the DReplayClient.config file, which has been configured to point to our controller server and the Working and Results folders.
Tip You must start the Distributed Replay Controller service before you start the Distributed Replay Client service on each client.
Because we are using two clients, we need to repeat these activities on each. In our environment, these clients are the servers, Client1 and Client2.
Tip In regards to firewall configuration, the client and the target communicate using the SQL Server ports. In a standard configuration, this means using TCP 1433 and UDP 1434 for the browser service, if you are using a named instance. If you are using nonstandard ports, configure the firewall accordingly.
Configuring the Replay
The replay is started using the Administration tool, which relies on two configuration files, DReplay.exe.preprocess.config and DReplay.exe.replay.config. The first of these configuration files controls the building of the intermediate files and the second controls replay options and output options. Both files are located within the 32-bit shared directory, so with a default installation, the fully qualified paths are C:Program Files (x86)Microsoft SQL Server120ToolsBinnDReplay.exe.preprocess.config and C:Program Files (x86)Microsoft SQL Server120ToolsBinnDReplay.exe.replay.config, respectively. Table 20-2 details the options you can configure in DReplay.exe.preprocess.config.
Table 20-2. DReplay.exe.preprocess.config Options
Option |
Description |
---|---|
IncSystemSession |
Specifies if activity captured from system sessions should be included in the replay. |
MaxIdleTime |
Specifies a limit for the amount of idle time in seconds.
|
The default contents of the DReplay.exe.preprocess.config file are shown in Listing 20-3.
Table 20-3 details the options you can configure within DReplay.exe.replay.config.
Table 20-3. DReplay.exe.replay.config Options
Option |
Category |
Description |
---|---|---|
Server |
Replay options |
The serverinstance name of the target server. |
SequencingMode |
Replay options |
Specifies the mode to be used for scheduling events. The possible options are
|
StressScaleGranularity |
Replay options |
When the SequenceMode is set to Stress, StressScaleGranularity determines how to scale activity on a SPID.
|
ConnectTimeScale |
Replay options |
A percentage value that indicates if connection time should be reduced during the replay when the SequenceMode is Stress. 100 indicates 100 percent of connection time is included. Lower values reduce the simulated connection times accordingly. |
ThinkTimeScale |
Replay options |
A percentage value that indicates if the user thinks time should be reduced during the replay when SequenceMode is Stress. 100 indicates 100 percent of think time is included, so transactions replay at the speed they were captured. Specifying lower values reduces the interval. |
UseConnectionPooling |
Replay options |
Specifies if connection pooling should be used on the clients. A connection pool is a cache of connections that subsequent connection can reuse. |
HealthmonInterval |
Replay options |
When SequenceMode is set to Synchronization, HealthmonInterval determines how often the health monitor runs, specified in seconds. -1 indicates that the health monitor is disabled. |
QueryTimeout |
Replay options |
Specifies the query timeout value in seconds. -1 indicates that it is disabled. |
ThreadsPerClient |
Replay Options |
Specifies the number of threads to use for the replay on each client. |
RecordRowCount |
Output Options |
Specifies if you should include a row count for each results set. |
RecordResultSet |
Output Options |
Specifies if you should save the contents of each record set. |
Listing 20-4 shows an example of the DReplay.exe.replay.config file, which has been modified for our environment to allow us to perform a performance test by simulating multiple connections.
Working with Distributed Replay
Now that the Distributed Replay utility is configured, we create a trace using Extended Events before we synchronize the target. We then use Distributed Replay to replay the trace in order to test performance tweaks, simulating concurrent activity on the two clients. Before we do this, however, we create the Chapter20 database on the controller using the script in Listing 20-5.
Because we replay a trace that includes DML statements, we need to synchronize our target database immediately before we start capturing the trace to ensure that the IDs align. We perform this task by backing up the Chapter20 database on the controller and restoring it on the target. We can achieve this by using the script in Listing 20-6.
Tip Remember to change the file locations to match your own configuration before running the script.
Ideally, the database should have the same DatabaseID on each server, but if this is not feasible, then ensure that the Database_name action is captured in the event session for mapping purposes. You should also ensure that any logins contained within the trace are created on the Target server, with the same permissions and the same default database. If you fail to do this, you get replay errors.
Since our demonstration is for testing performance enhancements, you may wish to take this opportunity to create appropriate indexes on the Customers and Addresses tables in the synchronized database. You can find a script including index suggestions that are appropriate for our trace workload in Listing 20-7.
Creating a Trace
Let’s now create our Extended Event session and start the trace. You should configure the Event session to capture the events, event fields, and actions detailed in Table 20-4.
Table 20-4. Events and Event Fields
Event |
Event Fields |
Actions |
---|---|---|
assembly_load |
- |
|
attention |
- |
|
begin_tran_completed |
statement |
|
begin_tran_starting |
statement |
|
commit_tran_completed |
statement |
|
commit_tran_starting |
statement |
|
cursor_close |
- |
|
cursor_execute |
- |
|
cursor_implicit_conversion |
- |
|
cursor_open |
- |
|
cursor_prepare |
- |
|
cursor_recompile |
- |
|
cursor_unprepare |
- |
|
database_file_size_change |
database_name |
|
dtc_transaction |
- |
|
exec_prepared_sql |
- |
|
existing_connection |
|
|
login |
|
|
logout |
- |
|
prepare_sql |
- |
|
promote_tran_completed |
- |
|
promote_tran_started |
- |
|
rollback_tran_completed |
statement |
|
rollback_tran_started |
statement |
|
rpc_completed |
|
|
rpc_starting |
data_stream |
|
save_tran_completed |
statement |
|
save_tran_started |
statement |
|
server_memory_change |
- |
|
sql_batch_completed |
batch_text |
|
sql_batch_starting |
batch_text |
|
sql_transaction |
- |
|
trace_flag_changed |
- |
|
unprepare_sql |
- |
|
So these are the events, event fields, and actions that we include in our session definition (see Listing 20-8).
Tip If you do not include all recommended events, event fields, and actions, then errors may be thrown when you convert the .xel file to a .trc file. A workaround here is to use trace flag -T28 when running ReadTrace in order to ignore the RML requirements. This can lead to unpredictable results, however, and is not advised.
To generate activity to be traced, depending on your scenario, you may wish to capture real user activity; to do so, you need to use a tool, such as SQLStress, or script activity. In our case, we script activity to be traced (see Listing 20-9).
We can now stop our trace by using the command in Listing 20-10.
Replaying the Trace
Now that we have captured a trace, we convert it to a .trc file and then preprocess it before replaying it by using the Distributed Replay Administration Tool from the command line.
In order to use our .xel file with Distributed Replay, we first need to convert it to a .trc file. We do this with the help of the readtrace.exe command-line tool, which uses RML (Replay Markup Language) to convert the data. ReadTrace ships as part of Microsoft’s RML Utilities for SQL Server toolset, which you can download from www.microsoft.com/en-gb/download/details.aspx?id=4511. The following example assumes that you have installed this toolkit.
Here, we convert our trace file by navigating to the C:Program FilesMicrosoft CorporationRMLUtils folder and running ReadTrace.exe with the arguments detailed in Table 20-5.
Table 20-5. ReadTrace Arguments
Argument |
Description |
---|---|
-I |
The fully qualified file name of the .xel file to convert. |
-O |
The folder to output the results to. This includes the log file as well as the .trc file. |
-a |
Prevent analytical processing. |
-MS |
Mirror to a single .trc file, as opposed to separate files for each SPID. |
Tip You can find a full description of all ReadTrace arguments in the RML Utilities for SQL Server help file.
We run ReadTrace by using the command in Listing 20-11.
Tip Remember to change the filename and path of the input and output files before running this script. Even if you have used the same location, your .xel file name will include a different uniquifier.
The Administration Tool is a command line utility, which can be run with the options detailed in Table 20-6.
Table 20-6. Administration Tool Options
Option |
Description |
---|---|
Preprocess |
Prepares the trace data by creating the intermediate files |
Replay |
Dispatches the trace to the clients and begins the replay |
Status |
Displays the controller’s current status |
Cancel |
Cancels the current operation |
When run with the preprocess option, the Administration Tool accepts the arguments detailed in Table 20-7.
Table 20-7. Preprocess Arguments
Argument |
Full Name |
Description |
---|---|---|
-m |
Controller |
The name of the server hosting the Distributed Replay Controller. |
-i |
input_trace_file |
The fully qualified file name of the trace file. If there are rollover files, then specify a comma-separated list. |
-d |
controller_working_dir |
The folder where intermediate files are stored. |
-c |
config_file |
The fully qualified file name of the DReplay.exe.preprocess.config configuration file. |
-f |
status_interval |
The frequency with which status messages are displayed, specified in milliseconds. |
To preprocess our trace file, we can use the command in Listing 20-12. This process creates an intermediate file, which can then be dispatched to the clients, ready for replay.
Tip Change the file paths to match your configuration before you run this script.
Starting the Replay
You can start the replay using the Distributed Replay Administration Tool. The arguments accepted when the tool is used with the replay option are detailed in Table 20-8.
Table 20-8. Replay Arguments
Argument |
Full Name |
Description |
---|---|---|
-m |
Controller |
The name of the server hosting the Distributed Replay Controller. |
-d |
controller_working_dir |
The folder where intermediate files are stored. |
-o |
output |
Specifies that client’s replay activity should be captured and saved to the Results directory. |
-s |
target_server |
The serverinstance name of the Target server. |
-w |
clients |
A comma-separated list of clients. |
-c |
config_file |
The fully qualified name of the DReplay.exe.replay.config configuration file. |
-f |
status_interval |
The frequency at which to display the status, specified in seconds. |
Therefore, to replay the trace, using Client1 and Client2, against our Target server, we use the command in Listing 20-13.
Tip The first line of the output may indicate that no events have been dispatched. This is not an issue—it just means that the event dispatch has not yet started.
Summary
Distributed Replay provides a mechanism to replay traces captured with either Profiler or Extended Events. Unlike its predecessor, Profiler, which is deprecated for use with the Database Engine, Distributed Replay can replay the workload from multiple servers, which allows you to perform load testing and simulate multiple concurrent connections.
The controller is the server running the Distributed Replay Controller service, which synchronize the replay and can be configured to work in two different modes; stress and synchronization. In stress mode, the controller fires the events as quickly as possible, while in synchronization mode, it fires the events in the order in which they were captured.
The clients are the servers running the Distributed Replay Client service, which replays the trace. Distributed Replay supports a maximum of 16 clients. The target is the instance to which the events are dispatched by the clients.
Although it is possible to replay a trace captured in Profiler, Extended Events use less resources and provide more flexibility. Therefore, consider using this method to capture the trace. In order to replay an Extended Event session with Distributed Replay, however, you first need to convert the .xel file to a .trc file. You can do this using RML Utilities, which are available for download at www.microsoft.com.
The Distributed Replay Administration Tool is a command-line tool that is used to both preprocess and run the trace. When run in preprocess mode, it creates an intermediate file. When run in replay mode, it generates dispatch files on the clients and uses these files to dispatch the events to the target.
3.144.31.163