CHAPTER 16

image

Database Tuning

Citius, Altius, Fortius—Swifter, Higher, Stronger

—Motto of the Olympic Games

Database tuning can be a complex exercise, but it can be facilitated by a systematic approach. This chapter describes a systematic five-step approach to performance tuning. It also presents the most important tools provided by Oracle to help with performance tuning; Statspack is given equal time because tools such as Automatic Workload Repository (AWR) and Automatic Database Diagnostic Monitor (ADDM) require licenses and are not available at most sites.1 In particular, you learn a powerful method of mining the Statspack and AWR repositories for data on performance trends. This data can be graphed by using tools such as Microsoft Excel.

Of course, I can only give you a brief introduction to the vast topic of database tuning. If you want to continue your study of this topic, the best book you will every find is Database Performance Tuning Guide in the Oracle documentation set. I recommend that you do the exercises listed at the end of the chapter—they will familiarize you with the most common tools for database tuning. The best way to learn is by doing.

Image Tip  The SQL scripts in this chapter use blank lines for extra readability. When executing them in SQL*Plus, use the command set sqlblanklines on to prevent them from erroring out.

Using a Systematic Five-Step Tuning Method

Once upon a time, a database-tuning expert saw a drunken man on his hands and knees searching for something under a bright streetlight. “Have you lost something?” he asked the drunk.

“My keys,” said the drunk.

The performance expert offered to help, and he too got on his hands and knees to search for the lost keys. After concluding that no keys were to be found under the streetlight, he began questioning the drunk. “When was the last time you remember seeing your keys?” he asked.

“At the bar,” said the drunk.

“Then why are you searching here?”

“Because it is so much brighter here!”

The story is fictional, and the moral is that you must concentrate your efforts in the appropriate place. You shouldn’t focus your exertion wherever it is most convenient to do so—the problem might not be in the Oracle database at all.

Here is an example of a case where poor performance was reported but the database was not the problem. The output shown in Listing 16-1 was produced by using the dbms_monitor.session_trace:enable procedure to trace a poorly performing batch process and then using the tkprof utility to summarize the trace data. The session was traced for about 40 minutes, and the data shows that the Oracle database spent only 140.08 seconds executing SQL queries. The bulk of the time—2,138.67 seconds—was spent waiting for further instructions from the program. This clearly showed that the database was not the problem.

In his book The Art and Science of Oracle Performance Tuning, Christopher Lawson outlines a systematic five-step method for solving any performance-tuning problem:2

  1. Define the problem. This requires patient listening, skillful questioning, and even careful observation. “The database is slow” is an example of a poorly defined problem. “The database is slow between 10 a.m. and 11 a.m. every day” is more precise. “This report now takes twice as long as it used to take only a week ago” is another example of a precisely defined problem. Ask the user for the history of the problem. Ask what previous efforts have been made to solve the problem. Ask what changed recently in the environment—for example, software or hardware upgrades. Ask whether all users are affected or only some. Ask whether the problem occurs at specific times of the day or week. Ask whether all parts of the application are equally affected or just certain parts.
  2. Investigate the problem, and collect as much pertinent evidence as possible. Examples include Statspack reports, graphs of database workload and DB time, and session traces. Study the environment, and find out what may be affecting the database—for example, other databases and applications may be competing for resources with your database.
  3. Analyze the data collected in the second step, and isolate the cause of the performance problem. This is often the most challenging part of the performance-tuning exercise. If the cause of the problem is not found, go back to step 1 or step 2.
  4. Solve the problem by creating a solution that addresses the root cause. Solutions are not always obvious, and, therefore, this step may require a great deal of ingenuity and creativity.
  5. Implement the solution in a safe and controlled manner. Conduct an appropriate level of testing. Obtain “before” and “after” measurements, if possible, in order to quantify the performance improvement. If the expected improvement is not obtained, go back to step 2.

Oracle Database versions may change, and software tools may change, but the five performance-tuning steps never change. A problem may be simple and require only a few minutes of your time, or it may be tremendously complex and require weeks of your time, but the five steps always remain the same. A simple example follows; you will be asked to duplicate it in the “Exercises” section at the end of the chapter:

  1. The users told the DBA that the problem was restricted to certain parts of the application and that there had not been a problem the previous day. On questioning the application developers, the DBA learned that there had been an application upgrade the previous night.
  2. The DBA used the spreport.sql script to generate Statspack reports for a one-hour period for the previous day and the current day. The DBA also traced a number of database sessions by using the DBMS_MONITOR.SET_SQL_TRACE procedure.
  3. Examination of the Statspack report for the current day showed large numbers of enq: TM - contention events; sessions were waiting, trying to lock an entire table. These events were not observed in the Statspack report for the previous day. The table in question was identified by searching through the trace files.
  4. The enq: TM - contention event indicates an unindexed foreign key. If a foreign key is not indexed, Oracle must lock the entire child table when a record is deleted from the parent table in the foreign-key relationship. The DBA queried the DBA_CONSTRAINTS, DBA_CONS_COLUMNS, and DBA_IND_COLUMNS views to identify the missing index and brought this to the attention of the developers.
  5. All that was left to do was to implement the solution in a safe and controlled manner. The developers followed the necessary change-control procedures before creating the necessary index.

Image Tip  A good question to ask during the investigation phase is, “What changed?”

Analyzing DB Time

The best way to analyze database performance is to find out where the database is spending time. This is powerfully summarized by the Statspack and AWR reports. To demonstrate this, I used the Swingbench tool3 to perform large numbers of transactions in my test database on my laptop. I then used the spreport.sql and awrrpt.sql scripts to generate the Statspack and AWR reports for a one-hour period; both scripts can be found in ORACLE_HOME dbmsadmin. Listing 16-2 shows the first page of the Statspack report, and Listing 16-3 shows the first page of the corresponding AWR report.

The most important part of the Statspack report is the Top 5 Timed Events section. It is preceded by details concerning the workload processed by the database. In Listing 16-2, you can see that the DB time was 56.77 minutes during an elapsed interval of 59.92 minutes. DB time is the sum of the response times of all database calls made by the connected sessions; it can be much larger than the clock time if multiple sessions are simultaneously active. Note that actual CPU usage was only 55 seconds during the entire period—just 4.1% of the total DB time. The rest of the time was spent waiting for overhead activities such as reading data from disk (db file sequential read operations) and writing redo information to the online redo logs (log file sync operations). Each db file sequential read operation retrieves one database block from disk. The average time for each operation was 4 milliseconds. The average time for each log file sync operation was 33 milliseconds—too large for comfort—but such operations accounted for only 18.2% of DB time.

Listing 16-3 shows the first page of the AWR report for the same period. The format is very similar to that of the Statspack report, but there is some variance in the numbers because the methods used by AWR are different than those of Statspack.

The best way to tune this database would seem to be to get faster disks; another alternative would be to increase the size of the buffer cache so that less data has to be read from disk. The ADDM report automatically analyzes the components of DB time and makes appropriate suggestions, some of which are shown in Listing 16-4. The ADDM script is addmrpt.sql, and it too can be found in ORACLE_HOME dbmsadmin.

Understanding the Oracle Wait Interface

Statspack gets its data from the Oracle wait interface—the collection of dynamic performance views that track all aspects of database activity. Performance-monitoring tools can query these tables by using SQL. Some of the most important views in the wait interface are described in the following sections.

V$SYSSTAT and V$SESSTAT

V$SESSTAT and V$SYSSTAT contain almost 400 counters describing every aspect of the workload handled by the database. V$SESSTAT contains session-level counters—almost 400 of them for each session—and V$SYSSTAT contains system-level counters, each one being the total of all the session-level counters of the same type. In other words, each row in V$SESSTAT gives you the value of a counter for one session, and each row in V$SYSSTAT gives you the total value for all sessions. Four of these counters (logons current, opened cursors current, session cursor cache current, and workarea memory allocated) represent current levels; the rest are cumulative counters that keep incrementing from the time the database was started.

Because the majority of these counters are cumulative, the way to monitor the rate of activity is to observe the values of the counters at fixed intervals and calculate the difference. This is the approach used by the Statspack report. Some of the most important counters are described here:

  • CPU used by this session is the amount of CPU used by one session in the case of V$SESSTAT. It is the amount of CPU used by all sessions put together in the case of V$SYSSTAT.
  • consistent gets and db block gets are the number of blocks read in consistent and current mode, respectively; current mode is used to find the most up-to-date version of a block, and consistent mode is used to find the state of a block at a prior point in time. The sum of consistent gets and db block gets is the number of logical reads performed by the database.
  • physical reads is the number of blocks that are not found in the memory cache and have to be physically retrieved from the disk.
  • physical writes is the number of modified blocks that are written to disk.
  • user commits and user rollbacks track transaction rates.

A complete list of the workload metrics tracked in the V$SYSSTAT and V$SESSTAT views is available in Oracle Database 12c Reference.

V$SESSION_EVENT and V$SYSTEM_EVENT

Whereas V$SESSTAT and V$SYSSTAT track the workload handled by the database, V$SESSION_EVENT and V$SYSTEM_EVENT track the amount of time spent waiting in over 100 different categories—for example, waiting for I/O or waiting for a lock. All the counters are cumulative; V$SESSION_EVENT has session-level counters, and V$SYSTEM_EVENT has system-level counters. You saw some examples of these counters in the Top 5 Timed Events example (Listing 16-2) earlier in this chapter; a complete list and explanation of each wait event is provided in Oracle Database 11g Reference. Here are some more examples:

  • db file sequential read tracks the amount of time spent reading single blocks from tables.
  • db file scattered read tracks the amount of time spent performing full table scans.
  • enq: TX - contention is the amount of time spent waiting for row locks.
  • enq: TM - contention is the amount of time spent waiting for table-level locks.

Mining the Statspack Repository

Statspack gets its data by performing snapshots of the dynamic performance tables such as V$SYSSTAT and V$SYSTEM_EVENT. This data is stored in tables that mimic the structure of the dynamic performance tables; for example, the STATS$SYSSTAT table corresponds to the V$SYSSTAT view, and the STATS$SYSTEM_EVENT table corresponds to the V$SYSTEM_EVENT view.

The Statspack repository can be mined for historical data that can be graphed using tools such as Microsoft Excel. The graphs show any historical trends and repeating patterns.

Image Tip  A stable database exhibits distinct repeating patterns; for example, the workload may peak during the day and subside at night. These patterns are the signature of the database. Deviations from established patterns can be a sign that something is wrong with the database. The Statspackreports for the affected periods can be analyzed and compared with the reports for baseline periods.

In Chapter 10, you saw examples of SQL queries to mine the Statspack repository. Listing 10-8 showed an SQL query to track logical reads per second and physical reads per second over a period of time. That query is a good way to monitor trends and changes in the database workload; you can modify it to include other workload metrics. The corresponding graph is shown in Figure 10-2. Listing 10-9 showed an SQL query to extract performance information on specific SQL queries of your choice.

In Listing 16-5, you see another valuable SQL query; it provides the distribution of DB time for each available time period. It is a fairly long query, but you can download the text from the Source Code section of the Apress web site (www.apress.com). You have to specify a range of snapshots; the appropriate range can be obtained from the STATS$SNAPSHOT table. As explained earlier in this chapter, the distribution of DB time is precisely the information you need to tune the database. CPU time is obtained from the STATS$SYSSTAT table, and the wait time for each type of event is obtained from the STATS$SYSTEM_EVENT table. The classifications in the V$EVENT_NAME view are used to group events into categories such as Application, Concurrency, Commit, User IO, and System IO. The PIVOT operator is used to convert rows into columns; this makes it easy to perform further analysis on the data. The LAG function is used to compute the amount of increase in each row of data. Finally, you divide CPU time and wait time by the length of the observed interval to compute average active sessions for each category of time. The AWR analog of this query is available in the Google group for this book (https://groups.google.com/forum/#!forum/beginning-oracle-database-12c-administration).

Sample output of this query is shown in Figures 16-1 and 16-2. Each graph shows data for the same database, one week at a time. Even if you knew very little about performance tuning, you could instinctively tell which week was a good week for the database and which week was a bad week. The graph in Figure 16-1 exhibits very regular patterns, indicating database stability. The graph in Figure 16-2 is for another week in the same database, but the regular patterns are missing. I hope these examples convince you of the importance of becoming familiar with the workload and performance patterns in your database.

9781484201947_Fig16-01.jpg
Figure 16-1. A good week for the database
9781484201947_Fig16-02.jpg
Figure 16-2. A bad week for the database

It is also useful to see at a glance the top five timed events for a range of time periods. The query shown in Listing 16-6 lets you do exactly that. You can download the text of this query and the AWR analog from the Google group for this book (https://groups.google.com/forum/#!forum/beginning-oracle-database-12c-administration). Listing 16-7 shows some sample output; it is for the first six periods in the graph shown in Figure 16-1.

Using the Statspack Report

A Statspack report is generated from two snapshots of the database. These snapshots are a copy of the contents of the various dynamic performance tables that are continuously accumulating information about activity in the database. AWR and ADDM were offered as replacements for Statspack beginning with Oracle Database 10g and are no longer documented in the Oracle manuals. However, you can use AWR and ADDM only with Enterprise Edition and only if you have a license for Diagnostics Pack—an extra-cost option available only with Enterprise Edition. In my experience, very few organizations have the requisite licenses. I therefore emphasize the use of Statspack and use it a lot in my own work. Note that Oracle did make several incremental enhancements to Statspack in Oracle Database 10g and 11g, thus further increasing its usefulness and relevance. However, you will have to refer to Chapter 21 of Oracle Database 9i Performance Tuning Guide and Reference for documentation on Statspack because it is not documented in the Oracle Database 10g and 11g manuals. A useful introduction to Statspack as well as installation instructions can be found in spdoc.txt in the ORACLE_HOME dbmsadmin folder.

The Statspack report contains a wealth of other information that can help you diagnose and solve your performance problem. In Listing 16-2, you saw the first page of a sample Statspack report, of which the most interesting section was the Top 5 Timed Events section. Here is a list of other interesting sections of the Statspack report:

  • SQL Ordered by CPU lists the SQL queries that consume the most CPU.
  • SQL Ordered by Elapsed Time lists the SQL queries that take the longest.
  • SQL Ordered by Gets lists the SQL queries that perform the most logical reads.
  • SQL Ordered by Reads lists the SQL queries that perform the most physical reads.
  • Instance Activity Stats lists workload data from the V$SYSSTAT dynamic performance view.
  • File Read Histogram Stats lists workload and performance metrics for each file in the database.
  • Segments by Logical Reads and Segments by Physical Reads show which tables and indexes have the most logical reads and physical reads. They are only available in level-7 snapshots. Refer to Chapter 21 of Oracle Database 9i Performance Tuning Guide and Reference for information on the different snapshot levels and how to change them.

Summary

Here is a short summary of the concepts touched on in this chapter:

  • When a performance problem is reported, the database might not be the cause. A systematic method should be used to identify the root cause.
  • The five steps for addressing any performance problem are define, investigate, analyze, solve, and implement.
  • The best way to analyze database performance is to find out where the database is spending its time. This is summarized by the Top 5 Timed Events section of the Statspack report.
  • Statspack gets its data from the Oracle wait interface—the collection of dynamic performance views that track all aspects of database activity. Important views include V$SYSSTAT, V$SESSTAT, V$SESSION_EVENT, and V$SYSTEM_EVENT.
  • The Statspack repository can be mined for historical data that can be graphed using tools such as Microsoft Excel. This is a good way to monitor trends and changes in the database workload and database performance. You can also track performance of individual queries over time. It is also useful to see at a glance the top five timed events for a range of time periods.
  • A stable database exhibits distinct repeating patterns; for example, the workload may peak during the day and subside at night. These patterns are the signature of the database. Deviations from established patterns can be a sign that something is wrong with the database; the Statspack reports for the affected periods can be analyzed and compared with the reports for baseline periods.
  • The Statspack report contains a wealth of information that can help you diagnose and solve a performance problem. Examples include SQL Ordered by CPU, SQL Ordered by Elapsed Time, SQL Ordered by Gets, SQL Ordered by Reads, Instance Activity Stats, File Read Histogram Stats, Segments by Logical Reads, and Segments by Physical Reads.

Exercises

  1. In this exercise, you will duplicate the locking problem described in this chapter. You will need to use the HR sample schema as well as Statspack. If you have not already installed Statspack in your test database, run the scripts spcreate.sql and spauto.sql in the $ORACLE_HOME/rdbms/admin folder. The instructions are in spdoc.txt in the same folder.
    1. Query the DBA_CONSTRAINTS and DBA_CONS_COLUMNS views, and confirm that the manager_id column of the employees table in the HR schema is a foreign key that links to the employee_id column of the same table. Use the following commands:
      SELECT constraint_name
        FROM dba_constraints
       WHERE owner = ’HR’
         AND table_name = ’EMPLOYEES’
         AND constraint_type = ’P’;

      SELECT constraint_name,
             owner,
             table_name
        FROM dba_constraints
       WHERE constraint_type = ’R’
         AND r_owner = ’HR’
         AND r_constraint_name = ’EMP_EMP_ID_PK’;

      SELECT column_name,
             position
        FROM dba_cons_columns
       WHERE owner = ’HR’
         AND constraint_name = ’EMP_EMP_ID_PK’;

      SELECT column_name,
             position
        FROM dba_cons_columns
       WHERE owner = ’HR’
         AND constraint_name = ’EMP_MANAGER_FK’;
    2. Drop the emp_manager_ix index on the manager_id column of the employees table. Use the command drop index hr.emp_manager_ix.
    3. Find the name of each employee’s manager. Use the following command:
        SELECT e2.first_name AS manager_first_name,
               e2.last_name AS manager_last_name,
               e1.first_name AS employee_first_name,
               e1.last_name AS employee_last_name
          FROM hr.employees e1,
               hr.employees e2
         WHERE e2.employee_id = e1.manager_id
      ORDER BY 1, 2, 3, 4;
    4. Connect to the database as the HR user, and update one of the records in the employees table. For example, issue the command update employees set phone_number=phone_number where first_name=&x2019;Alexander&x2019; and last_name=&x2019;Hunold&x2019;. Do not commit this transaction.
    5. Connect to the database again and attempt to delete the manager of the employee whose record you just updated. For example, issue the command delete from employees where first_name=&x2019;Lex&x2019; and last_name=&x2019;De Haan&x2019;. Oracle will block this command.
    6. Connect to the database as the system user. Query the v$session_wait table, and find the session ID of the blocked session. Because you know it is waiting for an enq: TM - contention event to complete, you can use the command select sid as session_id, wait_time_micro from v$session_wait where event = &x2019;enq: TM - contention&x2019;.
    7. Query the V$SESSION view, and obtain the serial number of the blocked process. Use the command select serial# as serial_num from v$session where sid=&session_id.
    8. Begin tracing the blocked session by using the DBMS_MONITOR.SESSION_TRACE_ENABLE session. Issue the command execute dbms_monitor.session_trace:enable(&session_id, &serial_num, TRUE). Refer to Oracle Database 11g PL/SQL Packages and Types Reference for a description of the session_trace:enable procedure.
    9. Generate a Statspack snapshot by using the statspack.snap procedure. Using a special version of the procedure, capture performance information for the blocked session in addition to system-wide information; issue the command execute statspack.snap(i_session_id => &session_id).
    10. Resume the first session, and issue the rollback command. This will release the blocked session, and Oracle will display the error ORA-02292: integrity constraint (HR.EMP_MANAGER_FK) violated - child record found.
    11. Identify the trace file for the second session by using the command select tracefile from v$process where addr = (select paddr from v$session where sid = &session_id). Review the trace file, and locate the lines similar to the following line:
      WAIT #4: nam=’enq: TM - contention’ ela= 855599225
      name|mode=1414332421 object #=71472
      table/partition=0 obj#=-1 tim=198871373294
    12. Confirm that the object number listed in the trace file corresponds to the employees table. Use the command select owner, object_type, object_name from dba_objects where object_id=&object_id.
    13. Perform the experiment once more, and confirm that additional wait messages appear in the trace file.
    14. Navigate to the folder where the trace file is located, and invoke the tkprof utility from the command line. You have to specify the name of the trace file and choose a name for the output file. Review the output file created by tkprof.
    15. Generate another Statspack snapshot by using the statspack.snap procedure with the same parameter used in step 9. Use the command execute statspack.snap(i_session_id => &session_id).
    16. Connect to the database as the system or perfstat user, and generate a Statspack report from the snapshots obtained in step i and step o. Run the spreport.sql script by using the command @?/rdbms/admin/spreport.sql.
    17. Review the Statspack report. In particular, review the Top 5 Timed Events section. Also pay attention to the sections of the report that provide information on the blocked session you were tracking.
    18. Connect to the database as the HR user, and re-create the index you dropped at the start at the exercise. Use the command create index hr.emp_manager_ix on hr.employees (manager_id).
  2. Download the Swingbench tool, and run the Order Entry benchmark for an hour. Review the AWR, ADDM, and Statspack reports for the run. Tune the database, and attempt to improve performance.

Further Reading

  • The free Oracle Database 12c Performance Tuning Guide offers a detailed and comprehensive treatment of performance-tuning methods, and the price is right.
  • Chapter 11 of Oracle Database 9i Performance Tuning Guide and Reference explains how to install and use Statspack. Note that Statspack is not documented in the reference guides for Oracle Database 10g, 11g, and 12c, even though it has been upgraded for all these versions. Refer to the upgrade notes in $ORACLE_HOME/rdbms/admin/spdoc.txt in your virtual machine.

Footnotes

1In 2007, the president of a well-known consulting company reported that only 2 of its 70 clients had licenses for Diagnostics Pack.

2This was presented in Chapter 11 as a general method of solving any performance problems; refer to the flowchart shown in Figure 11-2.

3The Swingbench tool can be downloaded from www.dominicgiles.com.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.117.70.88