Understanding SQL

Let’s take a look at the process by which Oracle parses and optimizes SQL. Take the following simple SQL statement:

update
        t1
set         
        small_no = small_no + 0.1
where
        id = 3
and     small_no = 1
;

How much work does Oracle have to do to understand what this statement means, and to work out the best way to run it? Strictly speaking, the first part of the question is about parsing, and the second is about optimizing; unfortunately many people tend to combine the two operations under the single title of “parsing.”

Parsing

The first step, of course, is for Oracle to decide if the text looks like a valid SQL statement—this step is known as syntactic analysis; if it does look like a valid statement, Oracle then has to answer all sorts of questions about the statement, such as the following:

  • What type of object is t1? A table, a view, a synonym, etc.
  • Are id and small_no columns in object t1 (or might id be a function returning a number)?
  • Are there any constraints on column small_no that need checking on update?
  • Are there any triggers on the object that might have to fire on update?
  • Are there any indexes that have to be maintained as the object is updated?
  • Does this user have the privileges to update the object?

This stage of the examination is the semantic analysis; the combination of syntactic and semantic analysis is what we mean by parsing.
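
Most of these questions are answered by querying the data dictionary, and you can investigate the same details yourself through the standard user_xxx views. The following queries are a minimal sketch (run as the owner of t1; the views and columns are the standard ones, but the detail available varies by version):

rem     what sort of object is t1?
select object_type from user_objects where object_name = 'T1';

rem     do columns id and small_no exist, and what are their types?
select column_name, data_type from user_tab_columns where table_name = 'T1';

rem     constraints, triggers, and indexes that an update may have to consider
select constraint_name, constraint_type from user_constraints where table_name = 'T1';
select trigger_name, status from user_triggers where table_name = 'T1';
select index_name from user_indexes where table_name = 'T1';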

Optimizing

After deciding that the statement is legal and that the user is allowed to run it, Oracle then has to start optimizing it, an activity that requires answers to more questions, such as the following:

  • Are there any indexes on columns id or small_no (or both) that might help?
  • What statistics are available for the columns, tables, and indexes?

After this the optimizer has to collect a few more numbers about the system (various parameter values—some dynamically variable—and system statistics) and start running a lot of complicated code to decide on an execution path. But we’ll worry about that stage of the activity later on in the chapter. At present I want to stick with basic parsing.
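
If you want a glimpse of the extra inputs the optimizer picks up at this point, the following queries give a rough idea (a sketch only; you need access to the sys-owned table and the v$ view, and the exact contents vary by version):

rem     system statistics gathered (or defaulted) for the machine
select  pname, pval1
from    sys.aux_stats$
where   sname = 'SYSSTATS_MAIN'
;

rem     the optimizer-related parameter environment for the instance
select  name, value, isdefault
from    v$sys_optimizer_env
;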

To demonstrate the amount of work involved in parsing even a simple statement in 11g (11.2.0.2 in this case), I flushed the shared pool (for reasons that I will explain shortly) with a call to alter system flush shared_pool, enabled sql_trace, executed the statement above, and ran tkprof against the resulting trace file. The closing section of the tkprof output file held the following summary information:

       1  session in tracefile.
       2  user  SQL statements in trace file.
      15  internal SQL statements in trace file.
      17  SQL statements in trace file.
      17  unique SQL statements in trace file.
     737  lines in trace file.
       0  elapsed seconds in trace file.

There were two “user” SQL statements in the trace file: the first was the alter session set sql_trace true that I had used to enable sql_trace, and the second was my update statement. However, there were also 15 internal SQL statements, run by Oracle to gather all the information it needed to understand (parse) and optimize my test statement; statements of this type are generally called sys-recursive statements. If you’re thinking that 15 extra statements is a lot, bear in mind that you haven’t seen the whole picture yet; the following are the activity statistics for those 15 statements:

OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS
 call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse       15      0.01       0.00          0          0          0           0
Execute     99      0.06       0.05          0          0          0           0
Fetch      189      0.03       0.01          0        388          0         633
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      303      0.10       0.07          0        388          0         633

Misses in library cache during parse: 15
Misses in library cache during execute: 15

In the Parse line you can see the number of parse calls made to pass the 15 internal statements to the Oracle engine—but then the total number of execute calls across all those statements is 99, fetching a total of 633 rows of information in 189 fetch calls. The volume of work is surprising.

Actually this demonstration isn’t a completely fair example of the workload involved, and I can highlight the bias I’ve introduced by taking advantage of another table I have in the same schema. The second table is called t1a and it is an exact copy of t1, right down to the definition of the primary key. If I now run the same update statement, substituting t1a as the table name, I get the following results in the tkprof output file:

OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS

 call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse       15      0.00       0.00          0          0          0           0
Execute     43      0.00       0.00          0          0          0           0
Fetch       51      0.00       0.00          0        135          0         526
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      109      0.00       0.00          0        135          0         526

Misses in library cache during parse: 0

Although the number of parse calls is the same, there’s an interesting difference in the number of execute and fetch calls. There’s also a significant difference in the number of Misses in library cache. This last detail, by itself, tells us a lot about the way Oracle tries to be efficient when handling SQL statements, and gives us some clues about the caching that goes on for the code we run—if we know how to interpret the output correctly.

Interpreting the tkprof Summaries

Before I ran the first statement, I flushed the shared pool. The reason for doing this was to ensure that any pre-existing versions of the statements were removed from the library cache (along with privilege and dependency information for the objects) and that all the object definitions were removed from the dictionary cache.
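
For reference, the whole test was driven by a script along the following lines (a sketch only; the name and location of the trace file vary with version and parameter settings):

alter system flush shared_pool;
alter session set sql_trace true;

update
        t1
set
        small_no = small_no + 0.1
where
        id = 3
and     small_no = 1
;

alter session set sql_trace false;

rem     then, from the operating system prompt, something like:
rem     tkprof {your trace file} parse_test.prf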

When I tried to execute my first test statement Oracle had to query the database (or, to be specific, the set of tables known as the data dictionary) to find the definitions of all the objects referenced in the query—so in the trace file we end up with statements like the following:

select
        obj#, type#, ctime, mtime, stime, status,
        dataobj#, flags, oid$, spare1, spare2
from
        obj$
where
        owner#=:1
and     name=:2
and     namespace=:3
and     remoteowner is null
and     linkname is null
and     subname is null
;

This particular statement tries to check the existence of an object in a given namespace, with a certain name, belonging to a certain schema in the local database; other queries look at tables like tab$ for table information, col$ for column information, ind$ for index information, icol$ for index column information and so on. The reason I’ve picked this particular query as an example, though, is that it ran five times for the first test run, but only twice on the second test run.

In both tests the query ran twice to find information about the table (t1 / t1a) and its primary key index (t1_pk / t1a_pk); in the first test, though, it ran three more times to find information about a pair of indexes called i_objauth2 and i_objauth1 and a table called trigger$. In other words, Oracle first had to learn more about its own internal tables and indexes so that it could work out how to run some of the recursive SQL it was using to find out about my table and index. This explains why the first test did more work than the second: the first test included a lot of “bootstrap” activity, while the second test took advantage of information cached during the first.

When I say “cached,” though, I don’t mean data caching. In this case we’re caching metadata (information about the data) and code, and that’s what the dictionary cache and library cache, respectively, are for. The dictionary cache (also known as the row cache) holds information about objects such as tables, indexes, columns, histograms, and triggers; the library cache holds information about statements (SQL and PL/SQL), their cursors, and lots of information about object dependencies and privileges.
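
Both caches expose summary statistics through dynamic performance views, so you can watch them being used. The following queries are a simple sketch (you need access to the v$ views; the column lists here are a small selection of what is available):

rem     dictionary (row) cache activity
select
        parameter, gets, getmisses
from
        v$rowcache
order by
        gets desc
;

rem     library cache activity
select
        namespace, gets, pins, reloads
from
        v$librarycache
;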

CURSORS

The change in the number of executions in the second test run was a demonstration of Oracle taking advantage of the dictionary cache. There was no need to run any SQL to find information about the index i_objauth1, for example, because that information had already been read from the data dictionary and copied into the dictionary cache during the first test run.

Similarly, the change in the number of library cache misses in the second test run was a demonstration of Oracle taking advantage of the library cache. Take a look at the closing lines of the two summaries I have dumped. The first one reports as follows:

Misses in library cache during parse: 15
Misses in library cache during execute: 15

The second one reports as follows:

Misses in library cache during parse: 0

A miss in the library cache tells us that Oracle went to the library cache to check for an existing (legal and valid) child cursor, failed to find one, and had to optimize the statement before it could use it. On the first test all 15 of the recursive SQL statements had to be optimized on their first parse call; on the second test run none of the 15 statements had to be optimized, because the appropriate cursors were already available in the library cache. The first test, though, shows a miss on every parse and every execute. Missing on execution is possible, but in this case I think 11g is showing a bug (see the following note), as a 10053 trace showed that each statement was optimized only once in the course of the test.

Note In the tkprof output, each statement reported should be followed by (up to) two lines stating the number of misses in the library cache during parse calls and execute calls. Between them these lines should tell you the total number of times the statement had to be optimized during the lifetime of the trace file. Unfortunately 11g seems to report a miss during execute immediately after the miss during parse, and I think (but obviously cannot prove) that this is a bug. In earlier versions of Oracle a quick check of the tkprof statistic Misses in library cache during execute was a useful way of spotting that optimized statements for held cursors were being lost because of demand for memory; in 11g I think you have to subtract the parse misses from the execute misses to get a true figure for the misses during execute.
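
As a cross-check on what tkprof tells you, you can also see how many times a statement currently in the library cache has been parsed, executed, and loaded (i.e., optimized) by querying v$sql. A minimal sketch, assuming the statement can be identified by a fragment of its text:

select
        sql_id, parse_calls, executions, loads, invalidations
from
        v$sql
where   sql_text like '%small_no + 0.1%'
and     sql_text not like '%v$sql%'     -- exclude this query itself
;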

We’ll carry on using this example to take a closer look at both the library cache and the dictionary cache, in terms of their structure and the way that Oracle uses them. We’ll start with the dictionary cache.
