CHAPTER 14
Parallel Execution

Parallel execution, a feature of Oracle Enterprise Edition (it is not available in the Standard Edition), was first introduced in Oracle version 7.1.6 in 1994. It is the ability to physically break a large serial task (any DML, or DDL in general) into many smaller bits that may all be processed simultaneously. Parallel executions in Oracle mimic the real-life processes we see all of the time. Rarely would you expect to see a single individual build a house; it is far more common for many teams of people to work concurrently to rapidly assemble the house. In that way, certain operations can be divided into smaller tasks and performed concurrently. For example, the plumbing and electrical wiring can take place at the same time to reduce the total amount of time required for the job as a whole.

Parallel execution in Oracle follows much the same logic. It is often possible for Oracle to divide a certain large "job" into smaller parts and to perform each part concurrently. For example, if a full table scan of a large table is required, there is no reason why Oracle cannot have four parallel sessions, P001-P004, perform the full scan together, with each session reading a different portion of the table. If the data scanned by P001-P004 needs to be sorted, this could be carried out by four more parallel sessions, P005-P008, which could ultimately send the results to an overall coordinating session for the query.

Parallel execution is a tool that, when wielded properly, may improve the response time of certain operations by orders of magnitude. When it's wielded as a "fast = true" switch, the results are typically quite the opposite. In this chapter, the goal is not to explain precisely how parallel query is implemented in Oracle, the myriad combinations of plans that can result from parallel operations, and the like. I feel that much of that material is covered quite well in the Oracle Administrator's Guide, the Oracle Concepts Guide and, in particular, the Oracle Data Warehousing Guide. This chapter's goal is to give you an understanding of what class of problems parallel execution is and isn't appropriate for. Specifically, after looking at when to use parallel execution, we will cover the following:

  • Parallel query: The ability to perform a single query using many operating system processes or threads. Oracle will find operations it can perform in parallel, such as full table scans or large sorts, and create a query plan to do so.
  • Parallel DML (PDML): This is very similar in nature to parallel query, but it is used in reference to performing modifications (INSERT, UPDATE, DELETE, and MERGE) using parallel processing. In this chapter, we'll look at PDML and discuss some of the inherent limitations associated with it.
  • Parallel DDL: Parallel DDL is the ability of Oracle to perform large DDL operations in parallel. For example, an index rebuild, creation of a new index, loading of data, and reorganization of large tables may all use parallel processing. This, I believe, is the "sweet spot" for parallelism in the database, so we will focus most of the discussion on this topic.
  • Parallel recovery: This is the ability of the database to perform instance or even media recovery in parallel in an attempt to reduce the time it takes to recover from failures.
  • Procedural parallelism: This is the ability to run developed code in parallel. In this chapter, I'll discuss two approaches to this. In the first approach, Oracle runs our developed PL/SQL code in parallel in a fashion transparent to developers (developers are not developing parallel code; rather, Oracle is parallelizing their code for them transparently). The other approach is something I term "do-it-yourself parallelism," whereby the developed code is designed to be executed in parallel.

When to Use Parallel Execution

Parallel execution can be fantastic. It can allow you to take a process that executes over many hours or days and complete it in minutes. Breaking down a huge problem into small components may, in some cases, dramatically reduce the processing time. However, one underlying concept that will be useful to keep in mind while considering parallel execution is summarized by this very short quote from Practical Oracle8i: Building Efficient Databases (Addison-Wesley, 2001) by Jonathan Lewis:

PARALLEL QUERY option is essentially nonscalable.

Parallel execution is essentially a nonscalable solution. It was designed to allow an individual user or a particular SQL statement to consume all resources of a database. If you have a feature that allows an individual to make use of everything that is available, and then you allow two individuals to use that feature, you'll have obvious contention issues. As the number of concurrent users on your system begins to overwhelm the number of resources you have (memory, CPU, and I/O), the ability to deploy parallel operations becomes questionable. If you have a four-CPU machine, for example, and on average you have 32 users executing queries simultaneously, then the odds are that you do not want to parallelize their operations. If you allowed each user to perform just a "parallel 2" query, then you would now have 64 concurrent operations taking place on a machine with only four CPUs. If the machine were not overwhelmed before parallel execution, it almost certainly would be now.

In short, parallel execution can also be a terrible idea. In many cases, the application of parallel processing will only lead to increased resource consumption, as parallel execution attempts to use all available resources. In a system where resources must be shared by many concurrent transactions, such as an OLTP system, you would likely observe increased response times due to this. In a parallel plan, Oracle gives up certain execution techniques that it can use efficiently in a serial plan and adopts execution paths, such as full scans, in the hope that by performing many pieces of the larger, bulk operation in parallel, the overall operation will complete faster than its serial counterpart. Parallel execution, when applied inappropriately, may be the cause of your performance problem, not the solution for it.

So, before applying parallel execution, you need the following two things to be true:

  • You must have a very large task, such as the full scan of 50GB of data.
  • You must have sufficient available resources. Before parallel full scanning 50GB of data, you would want to make sure that there is sufficient free CPU (to accommodate the parallel processes) as well as sufficient I/O. The 50GB should be spread over more than one physical disk to allow for many concurrent read requests to happen simultaneously, there should be sufficient I/O channels from the disk to the computer to retrieve the data from disk in parallel, and so on.

If you have a small task, as generally typified by the queries carried out in an OLTP system, or you have insufficient available resources, again as is typical in an OLTP system where CPU and I/O resources are often already used to their maximum, then parallel execution is not something you'll want to consider.

A Parallel Processing Analogy

I often use an analogy to describe parallel processing and why you need both a large task and sufficient free resources in the database. It goes like this: suppose you have two tasks to complete. The first is to write a one-page summary of a new product. The other is to write a ten-chapter comprehensive report, with each chapter being very much independent of the others. For example, consider this book. This chapter, "Parallel Execution," is very much separate and distinct from the chapter titled "Redo and Undo"—they did not have to be written sequentially.

How do you approach each task? Which one do you think would benefit from parallel processing?

One-Page Summary

In this analogy, the one-page summary you have been assigned is not a large task. You would either do it yourself or assign it to a single individual. Why? Because the amount of work required to "parallelize" this process would exceed the work needed just to write the paper yourself. You would have to sit down, figure out that there should be 12 paragraphs, determine that each paragraph is not dependent on the other paragraphs, hold a team meeting, pick 12 individuals, explain to them the problem and assign each person a paragraph, act as the coordinator and collect all of their paragraphs, sequence them into the right order, verify they are correct, and then print the report. This is all likely to take longer than it would to just write the paper yourself, serially. The overhead of managing a large group of people on a project of this scale will far outweigh any gains to be had from having the 12 paragraphs written in parallel.

The exact same principle applies to parallel execution in the database. If you have a job that takes seconds or less to complete serially, then the introduction of parallel execution and its associated managerial overhead will likely make the entire thing take longer.

Ten-Chapter Report

Now let's examine the second task. If you want that ten-chapter report fast—as fast as possible—the slowest way to accomplish it would be to assign all of the work to a single individual (trust me, I know—look at this book! Some days I wished there were 15 of me working on it). Assuming you have ten willing and able writers and the chapters really are independent, you could instead assign one chapter to each of them. Here you would hold the meeting, review the process, assign the work, act as the coordinator, collect the results, bind up the finished report, and deliver it. It would not have been done in one-tenth the time, but perhaps one-eighth or so. Again, I say this with the proviso that you have sufficient free resources. If you have a large staff that is currently not actually doing anything, then splitting up the work makes complete sense.

However, consider that as the manager, your staff is multitasking and they have a lot on their plates. In that case, you have to be careful with that big project. You need to be sure not to overwhelm your staff; you don't want to work them beyond the point of exhaustion. You can't delegate out more work than your resources (your people) can cope with, otherwise they'll quit. If your staff is already fully utilized, adding more work will cause all schedules to slip and all projects to be delayed.

Parallel execution in Oracle is very much the same. If you have a task that takes many minutes, hours, or days, then the introduction of parallel execution may be the thing that makes it run eight times faster. But then again, if you are already seriously low on resources (the overworked team of people), then the introduction of parallel execution would be something to avoid, as the system will become even more bogged down. While the Oracle server processes won't "quit" in protest, they could start running out of RAM and failing, or just suffer from such long waits for I/O or CPU as to make it appear as if they were doing no work whatsoever.

If you keep that in mind, remembering never to take an analogy to illogical extremes, you'll have the commonsense guiding rule to see if parallelism can be of some use. If you have a job that takes seconds, it is doubtful that parallel execution can be used to make it go faster—the converse would be more likely. If you are low on resources already (i.e., your resources are fully utilized), adding parallel execution would likely make things worse, not better. Parallel execution is excellent for when you have a really big job and plenty of excess capacity. In this chapter, we'll take a look at some of the ways we can exploit those resources.

Parallel Query

Parallel query allows a single SQL SELECT statement to be divided into many smaller queries, with each component query being run concurrently, and then the results from each combined to provide the final answer. For example, consider the following query:

big_table@ORA10G> select count(status) from big_table;

Using parallel query, this query could use some number of parallel sessions; break the BIG_TABLE into small, nonoverlapping slices; and ask each parallel session to read the table and count its section of rows. The parallel query coordinator for this session would then receive each of the aggregated counts from the individual parallel sessions and further aggregate them, returning the final answer to the client application. Graphically, it might look like Figure 14-1.

The P000, P001, P002, and P003 processes are known as parallel execution servers, sometimes also referred to as parallel query (PQ) slaves. Each of these parallel execution servers is a separate session connected as if it were a dedicated server process. Each one is responsible for scanning a nonoverlapping region of BIG_TABLE, aggregating its result subset, and sending back its output to the coordinating server—the original session's server process—which will aggregate the subresults into the final answer.

image

Figure 14-1. Parallel select count (status) depiction

We can see this in an explain plan. Using a BIG_TABLE with 10 million rows in it, we'll walk through enabling a parallel query for that table and discover how we can "see" parallel query in action. This example was performed on a four-CPU machine with default values for all parallel parameters; that is, this is an out-of-the-box installation where only necessary parameters were set, including SGA_TARGET (set to 1GB), CONTROL_FILES, DB_BLOCK_SIZE (set to 8KB), and PGA_AGGREGATE_TARGET (set to 512MB). Initially, we would expect to see the following plan:

big_table@ORA10GR1> explain plan for
  2  select count(status) from big_table;
Explained.

big_table@ORA10GR1> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------
Plan hash value: 1287793122
----------------------------------------------------------------------------
| Id  | Operation          | Name      | Rows|Bytes  | Cost (%CPU)|Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |   1 |    17 | 32390   (2)|00:06:29 |
|   1 |  SORT AGGREGATE    |           |   1 |    17 |            |         |
|   2 |   TABLE ACCESS FULL| BIG_TABLE |  10M|   162M| 32390   (2)|00:06:29 |
----------------------------------------------------------------------------

That is a typical serial plan. No parallelism is involved because we did not request parallel query to be enabled, and by default it will not be.

We may enable parallel query in a variety of ways, including use of a hint directly in the query or by altering the table to enable parallel execution paths to be considered (which is the option we use here).
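
For reference, the hint-based approach would look something like the following. This is just a minimal sketch: the degree of 4 is arbitrary, and the hint affects only the one statement it appears in.

big_table@ORA10GR1> select /*+ parallel(big_table, 4) */ count(status)
  2  from big_table;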

We can specifically dictate the degree of parallelism to be considered in execution paths against this table. We can tell Oracle, "We would like you to use parallel degree 4 when creating execution plans against this table," for example:

big_table@ORA10GR1> alter table big_table parallel 4;
Table altered.

I prefer to just tell Oracle, "Please consider parallel execution, but you figure out the appropriate degree of parallelism based on the current system workload and the query itself." That is, let the degree of parallelism vary over time as the workload on the system increases and decreases. If we have plenty of free resources, the degree of parallelism will go up; in times of limited available resources, the degree of parallelism will go down. Rather than overload the machine with a fixed degree of parallelism, this approach allows Oracle to dynamically increase or decrease the amount of concurrent resources required by the query.

Therefore, we simply enable parallel query against this table via the following ALTER TABLE command:

big_table@ORA10GR1> alter table big_table parallel;
Table altered.
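
As a quick sanity check, we can confirm the setting in the data dictionary. A minimal sketch: the DEGREE column of USER_TABLES should now report DEFAULT rather than a fixed number:

big_table@ORA10GR1> select table_name, degree
  2  from user_tables
  3  where table_name = 'BIG_TABLE';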

That is all there is to it—parallel query will now be considered for operations against this table. When we rerun the explain plan, this time we see the following:

big_table@ORA10GR1> explain plan for
  2  select count(status) from big_table;
Explained.

big_table@ORA10GR1> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-----------------------------------------
Plan hash value: 1651916128
----------------------------------------------------------------------------
|Id | Operation              | Name     |Cost(%CPU)|  TQ |IN-OUT|PQ Distrib |
----------------------------------------------------------------------------
|  0| SELECT STATEMENT       |          | 4465  (1)|     |      |           |
|  1|  SORT AGGREGATE        |          |          |     |      |           |
|  2|   PX COORDINATOR       |          |          |     |      |           |
|  3|    PX SEND QC (RANDOM) | :TQ10000 |          |Q1,00| P->S |QC(RAND)   |
|  4|     SORT AGGREGATE     |          |          |Q1,00| PCWP |           |
|  5|      PX BLOCK ITERATOR |          | 4465  (1)|Q1,00| PCWC |           |
|  6|       TABLE ACCESS FULL| BIG_TABLE| 4465  (1)|Q1,00| PCWP |           |
----------------------------------------------------------------------------

Note The ROWS, BYTES, and TIME columns were removed from this plan output to allow it to fit on the page. The estimated time for the parallel query, however, was 00:00:54 as opposed to the previous estimate of 00:06:29 for the serial plan. Remember, these are estimates, not promises! Also, this is the plan output from Oracle 10g—the plan output from Oracle9i would have less detail (four steps instead of seven), but the net effect is identical.


If you read this plan from the bottom up, starting at ID=6, it shows the steps described in Figure 14-1. The full table scan would be split up into many "smaller scans" (step 5). Each of those would aggregate their COUNT(STATUS) values (step 4). These subresults would be transmitted to the parallel query coordinator (steps 2 and 3), which would aggregate these results further (step 1) and output the answer.

If you were to actually execute this query on a freshly started system (where no other parallel execution had been done), you might observe the following. Here we are using the Linux ps command to find processes that represent parallel query processes (we expect to find none), running the query with parallel execution enabled, and then looking for those processes again:

big_table@ORA10GR1> host ps -auxww | grep '^ora10gr1.*ora_p00._ora10g'

big_table@ORA10GR1> select count(status) from big_table;

COUNT(STATUS)
-------------
     10000000

big_table@ORA10GR1> host  ps -auxww | grep '^ora10gr1.*ora_p00._ora10g'
ora10gr1  3411 35.5  0.5 1129068 12200 ?  S    13:27   0:02 ora_p000_ora10gr1
ora10gr1  3413 28.0  0.5 1129064 12196 ?  S    13:27   0:01 ora_p001_ora10gr1
ora10gr1  3415 26.0  0.5 1129064 12196 ?  S    13:27   0:01 ora_p002_ora10gr1
ora10gr1  3417 23.3  0.5 1129044 12212 ?  S    13:27   0:01 ora_p003_ora10gr1
ora10gr1  3419 19.5  0.5 1129040 12228 ?  S    13:27   0:01 ora_p004_ora10gr1
ora10gr1  3421 19.1  0.5 1129056 12188 ?  S    13:27   0:01 ora_p005_ora10gr1
ora10gr1  3423 19.0  0.5 1129056 12164 ?  S    13:27   0:01 ora_p006_ora10gr1
ora10gr1  3425 21.6  0.5 1129048 12204 ?  S    13:27   0:01 ora_p007_ora10gr1

As we can see, there are now eight parallel execution servers that have been started by Oracle. If we are curious enough to want to "watch" parallel query, we can easily do so using two sessions. In the session we will run the parallel query in, we'll start by determining our SID:

big_table@ORA10GR1> select sid from v$mystat where rownum = 1;

       SID
----------
       162

Now, in another session, we get this query ready to run:

ops$tkyte@ORA10GR1> select sid, qcsid, server#, degree
  2  from v$px_session
  3  where qcsid = 162

Shortly after starting the parallel query in the session with SID=162, we come back to this second session and run the query:

 4 /

       SID      QCSID    SERVER#     DEGREE
---------- ---------- ---------- ----------
       145        162          1          8
       150        162          2          8
       147        162          3          8
       151        162          4          8
       146        162          5          8
       152        162          6          8
       143        162          7          8
       144        162          8          8
       162        162
9 rows selected.

We see here that our parallel query session (SID=162) is the query coordinator SID (QCSID) for nine rows in this dynamic performance view. Our session is "coordinating" or controlling these parallel query resources now. We can see each has its own SID; in fact, each is a separate Oracle session and shows up as such in V$SESSION during the execution of our parallel query:

ops$tkyte@ORA10GR1> select sid, username, program
  2  from v$session
  3  where sid in ( select sid
  4                   from v$px_session
  5                  where qcsid = 162 )
  6  /
       SID USERNAME                       PROGRAM
---------- ------------------------------ -------------------------------
       143 BIG_TABLE                      oracle@dellpe (P005)
       144 BIG_TABLE                      oracle@dellpe (P002)
       145 BIG_TABLE                      oracle@dellpe (P006)
       146 BIG_TABLE                      oracle@dellpe (P004)
       147 BIG_TABLE                      oracle@dellpe (P003)
       150 BIG_TABLE                      oracle@dellpe (P001)
       151 BIG_TABLE                      oracle@dellpe (P000)
       153 BIG_TABLE                      oracle@dellpe (P007)
       162 BIG_TABLE                      sqlplus@dellpe (TNS V1-V3)
9 rows selected.

Note If a parallel execution is not occurring in your system, do not expect to see the parallel execution servers in V$SESSION. They will be in V$PROCESS, but will not have a session established unless they are being used. The parallel execution servers will be connected to the database, but will not have a session established. See Chapter 5 for details on the difference between a session and a connection.


In a nutshell, that is how parallel query—and, in fact, parallel execution in general—works. It entails a series of parallel execution servers working in tandem to produce subresults that are fed either to other parallel execution servers for further processing or to the coordinator for the parallel query.
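
If you are curious how evenly that work was spread across the parallel execution servers, one approach is to query V$PQ_TQSTAT from the coordinating session right after the parallel statement completes. This is only a sketch; to the best of my knowledge, V$PQ_TQSTAT is populated just for your own session's most recent parallel statement, so run it immediately:

big_table@ORA10GR1> select dfo_number, tq_id, server_type, process, num_rows, bytes
  2  from v$pq_tqstat
  3  order by dfo_number, tq_id, server_type, process;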

In this particular example, as depicted, we had BIG_TABLE spread across four separate devices, in a single tablespace (a tablespace with four data files). When implementing parallel execution, it is generally "optimal" to have your data spread over as many physical devices as possible. You can achieve this in a number of ways:

  • Using RAID striping across disks
  • Using ASM, with its built-in striping
  • Using partitioning to physically segregate BIG_TABLE over many disks
  • Using multiple data files in a single tablespace, thus allowing Oracle to allocate extents for the BIG_TABLE segment in many files (a sketch of this approach follows the list)
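
As a minimal sketch of that last approach (the file names, sizes, and paths here are purely illustrative), a single tablespace with four data files on four mount points might be created as follows:

ops$tkyte@ORA10GR1> create tablespace big_table_ts
  2  datafile '/u01/oradata/ora10gr1/big_table_01.dbf' size 2048m,
  3           '/u02/oradata/ora10gr1/big_table_02.dbf' size 2048m,
  4           '/u03/oradata/ora10gr1/big_table_03.dbf' size 2048m,
  5           '/u04/oradata/ora10gr1/big_table_04.dbf' size 2048m
  6  extent management local
  7  autoallocate;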

In general, parallel execution works best when given access to as many resources (CPU, memory, and I/O) as possible. That is not to say that nothing can be gained from parallel query if the entire set of data is on a single disk; you would simply not gain as much as you would with multiple disks. The reason you would likely see some improvement in response time, even when using a single disk, is that when a given parallel execution server is counting rows, it is not reading them, and vice versa. So, two parallel execution servers may well be able to complete the count of all rows in less time than a serial plan would.

Likewise, you can benefit from parallel query even on a single CPU machine. It is doubtful that a serial SELECT COUNT(*) would use 100 percent of the CPU on a single CPU machine—it would be spending part of its time performing (and waiting for) physical I/O to disk. Parallel query would allow you to fully utilize the resources (the CPU and I/O, in this case) on the machine, whatever those resources may be.

That final point brings us back to the earlier quote from Practical Oracle8i: Building Efficient Databases: parallel query is essentially nonscalable. If you allowed four sessions to simultaneously perform queries with two parallel execution servers on that single CPU machine, you would probably find their response times to be longer than if they just processed serially. The more processes clamoring for a scarce resource, the longer it will take to satisfy all requests.

And remember, parallel query requires two things to be true. First, you need to have a large task to perform—for example, a long-running query, the runtime of which is measured in minutes, hours, or days, not in seconds or subseconds. This implies that parallel query is not a solution to be applied in a typical OLTP system, where you are not performing long-running tasks. Enabling parallel execution on these systems is often disastrous. Second, you need ample free resources such as CPU, I/O, and memory. If you are lacking in any of these, then parallel query may well push your utilization of that resource over the edge, negatively impacting overall performance and runtime.

In the past, parallel query was considered mandatory for many data warehouses simply because in the past (say, in 1995) data warehouses were rare and typically had a very small, focused user base. Today in 2005, data warehouses are literally everywhere and support user communities that are as large as those found for many transactional systems. This means that you may well not have sufficient free resources at any given point in time to enable parallel query on these systems. That doesn't mean parallel execution in general is not useful in this case—it just might be more of a DBA tool, as we'll see in the section "Parallel DDL," rather than a parallel query tool.

Parallel DML

In the context of parallel DML (PDML), the Oracle documentation limits the scope of the term DML to include only INSERT, UPDATE, DELETE, and MERGE (it does not include SELECT, as normal DML does). During PDML, Oracle may use many parallel execution servers to perform your INSERT, UPDATE, DELETE, or MERGE instead of a single serial process. On a multi-CPU machine with plenty of I/O bandwidth, the potential increase in speed may be large for mass DML operations.

However, you should not look to PDML as a feature to speed up your OLTP-based applications. As stated previously, parallel operations are designed to fully and totally maximize the utilization of a machine. They are designed so that a single user can completely use all of the disks, CPU, and memory on the machine. In certain data warehouses (with lots of data and few users), this is something you may want to achieve. In an OLTP system (with a lot of users all doing short, fast transactions), you do not want to give a user the ability to fully take over the machine resources.

This sounds contradictory: we use parallel query to scale up, so how could it not be scalable? When applied to an OLTP system, the statement is quite accurate. Parallel query is not something that scales up as the number of concurrent users increases. Parallel query was designed to allow a single session to generate as much work as 100 concurrent sessions would. In our OLTP system, we really do not want a single user to generate the work of 100 users.

PDML is useful in a large data warehousing environment to facilitate bulk updates to massive amounts of data. The PDML operation is executed in much the same way as a distributed query would be executed by Oracle, with each parallel execution server acting like a process in a separate database instance. Each slice of the table is modified by a separate thread with its own independent transaction (and hence its own undo segment, hopefully). After they are all done, the equivalent of a fast two-phase commit (2PC) is performed to commit the separate, independent transactions. Figure 14-2 depicts a parallel update using four parallel execution servers. Each of the parallel execution servers has its own independent transaction; either all of them commit along with the PDML coordinating session, or none of them do.

We can actually observe the fact that there are separate independent transactions created for the parallel execution servers. We'll use two sessions again, as before. In the session with SID=162, we explicitly enable parallel DML. PDML differs from parallel query in that regard; unless you explicitly ask for it, you will not get it.

big_table@ORA10GR1> alter session enable parallel dml;
Session altered.
image

Figure 14-2. Parallel update (PDML) depiction

The fact that the table is "parallel" is not sufficient, as it was for parallel query. The reasoning behind the need to explicitly enable PDML in your session is the fact that PDML has certain limitations associated with it, which I list after this example.

In the same session, we do a bulk UPDATE that, because the table is "parallel enabled," will in fact be done in parallel:

big_table@ORA10GR1> update big_table set status = 'done';

In the other session, we'll join V$SESSION to V$TRANSACTION to show the active sessions for our PDML operation, as well as their independent transaction information:

ops$tkyte@ORA10GR1> select a.sid, a.program, b.start_time, b.used_ublk,
  2         b.xidusn ||'.'|| b.xidslot || '.' || b.xidsqn trans_id
  3    from v$session a, v$transaction b
  4   where a.taddr = b.addr
  5     and a.sid in ( select sid
  6                      from v$px_session
  7                     where qcsid = 162 )
  8   order by sid
  9  /
 SID PROGRAM                    START_TIME            USED_UBLK TRANS_ID
---- -------------------------- -------------------- ---------- ------------
 136 oracle@dellpe (P009)       08/03/05 14:28:17          6256 18.9.37
 137 oracle@dellpe (P013)       08/03/05 14:28:17          6369 21.39.225
 138 oracle@dellpe (P015)       08/03/05 14:28:17          6799 24.16.1175
 139 oracle@dellpe (P008)       08/03/05 14:28:17          6729 15.41.68
 140 oracle@dellpe (P014)       08/03/05 14:28:17          6462 22.41.444
 141 oracle@dellpe (P012)       08/03/05 14:28:17          6436 20.46.46
 142 oracle@dellpe (P010)       08/03/05 14:28:17          6607 19.44.37
 143 oracle@dellpe (P007)       08/03/05 14:28:17             1 17.12.46
 144 oracle@dellpe (P006)       08/03/05 14:28:17             1 13.25.302
 145 oracle@dellpe (P003)       08/03/05 14:28:17             1 1.21.1249
 146 oracle@dellpe (P002)       08/03/05 14:28:17             1 14.42.49
 147 oracle@dellpe (P005)       08/03/05 14:28:17             1 12.18.323
 150 oracle@dellpe (P011)       08/03/05 14:28:17          6699 23.2.565
 151 oracle@dellpe (P004)       08/03/05 14:28:17             1 16.26.46
 152 oracle@dellpe (P001)       08/03/05 14:28:17             1 11.13.336
 153 oracle@dellpe (P000)       08/03/05 14:28:17             1 2.29.1103
 162 sqlplus@dellpe (TNS V1-V3) 08/03/05 14:25:46             2 3.13.2697

 17 rows selected.

As we can see, there is more happening here than when we simply queried the table in parallel. We have 17 processes working on this operation, not just 9 as before. This is because the plan that was developed includes a step to update the table and independent steps to update the index entries. Looking at an edited explain plan output from DBMS_XPLAN (trailing columns were removed to permit the output to fit on the page), we see the following:

----------------------------------------------
| Id  | Operation                | Name      |
----------------------------------------------
|   0 | UPDATE STATEMENT         |           |
|   1 |  PX COORDINATOR          |           |
|   2 |   PX SEND QC (RANDOM)    | :TQ10001  |
|   3 |    INDEX MAINTENANCE     | BIG_TABLE |
|   4 |     PX RECEIVE           |           |
|   5 |      PX SEND RANGE       | :TQ10000  |
|   6 |       UPDATE             | BIG_TABLE |
|   7 |        PX BLOCK ITERATOR |           |
|   8 |         TABLE ACCESS FULL| BIG_TABLE |
-----------------------------------------------

As a result of the pseudo-distributed implementation of PDML, certain limitations are associated with it:

  • Triggers are not supported during a PDML operation. This is a reasonable limitation in my opinion, since triggers would tend to add a large amount of overhead to the update, and you are using PDML to go fast—the two features don't go together.
  • There are certain declarative referential-integrity constraints that are not supported during PDML, since each slice of the table is modified as a separate transaction in a separate session. Self-referential integrity is not supported, for example. Consider the deadlocks and other locking issues that would occur if it were supported.
  • You cannot access the table you've modified with PDML until you commit or roll back.
  • Advanced replication is not supported with PDML (because the implementation of advanced replication is trigger based).
  • Deferred constraints (i.e., constraints that are in the deferred mode) are not supported.
  • PDML may be performed on tables that have bitmap indexes or LOB columns only if the table is partitioned, and then the degree of parallelism is capped at the number of partitions. You cannot parallelize an operation within partitions in this case, as each partition would get a single parallel execution server to operate on it.
  • Distributed transactions are not supported when performing PDML.
  • Clustered tables are not supported with PDML.

If you violate any of those restrictions, one of two things will happen: either the statement will be performed serially (no parallelism will be involved) or an error will be raised. For example, if you have already performed PDML against table T and then attempt to query table T before ending your transaction, you will receive an error.
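
To illustrate that last point, a short sketch of the sequence (output trimmed) would look like this:

big_table@ORA10GR1> alter session enable parallel dml;
Session altered.

big_table@ORA10GR1> update big_table set status = 'done';

big_table@ORA10GR1> select count(*) from big_table;

At this point, I would expect the SELECT to fail with ORA-12838 ("cannot read/modify an object after modifying it in parallel") until the transaction is committed or rolled back.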

Parallel DDL

I believe that parallel DDL is the real "sweet spot" of Oracle's parallel technology. As we've discussed, parallel execution is generally not appropriate for OLTP systems. In fact, for many data warehouses, parallel query is becoming less and less of an option. It used to be that a data warehouse was built for a very small, focused user community—sometimes consisting of just one or two analysts. However, over the last decade I've watched them grow from small user communities to user communities of hundreds or thousands. Consider a data warehouse front-ended by a web-based application: it could be accessible to literally thousands or more users instantly.

But a DBA performing the large batch operations, perhaps during a maintenance window, is a different story. The DBA is still a single individual and he might have a huge machine with tons of computing resources available. The DBA has only "one thing to do": load this data, reorganize that table, rebuild that index. Without parallel execution, the DBA would be hard-pressed to really use the full capabilities of the hardware. With parallel execution, he can. The following SQL DDL commands permit "parallelization":

  • CREATE INDEX: Multiple parallel execution servers can scan the table, sort the data, and write the sorted segments out to the index structure.
  • CREATE TABLE AS SELECT: The query that executes the SELECT may be executed using parallel query, and the table load itself may be done in parallel.
  • ALTER INDEX REBUILD: The index structure may be rebuilt in parallel.
  • ALTER TABLE MOVE: A table may be moved in parallel.
  • ALTER TABLE SPLIT|COALESCE PARTITION: The individual table partitions may be split or coalesced in parallel.
  • ALTER INDEX SPLIT PARTITION: An index partition may be split in parallel.

The first four of these commands work for individual table/index partitions as well—that is, you may MOVE an individual partition of a table in parallel.
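
As a small sketch of the sort of command a DBA might issue during a maintenance window (the index name and the fixed degree of 8 are invented for illustration; the final ALTER resets the index so subsequent queries against it are not silently considered for parallel execution):

big_table@ORA10GR1> create index big_table_owner_idx
  2  on big_table(owner)
  3  nologging
  4  parallel 8;
Index created.

big_table@ORA10GR1> alter index big_table_owner_idx noparallel;
Index altered.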

To me, parallel DDL is where the parallel execution in Oracle is of greatest measurable benefit. Sure, it can be used with parallel query to speed up certain long-running operations, but from a maintenance standpoint, and from an administration standpoint, parallel DDL is where the parallel operations affect us, DBAs and developers, the most. If you think of parallel query as being designed for the end user for the most part, then parallel DDL is designed for the DBA/developer.

Parallel DDL and Data Loading Using External Tables

One of my favorite new features in Oracle 9i is external tables, which are especially useful in the area of data loading. We'll cover data loading and external tables in some detail in the next chapter but, as a quick introduction, we'll take a brief look at these topics here to study the effects of parallel DDL on extent sizing and extent trimming.

External tables allow us to easily perform parallel direct path loads without thinking too hard about it. Oracle 7.1 gave us the ability to perform parallel direct path loads, whereby multiple sessions could write directly to the Oracle data files, bypassing the buffer cache entirely, bypassing undo for the table data, and perhaps even bypassing redo generation. This was accomplished via SQL*Loader. The DBA would have to script multiple SQL*Loader sessions, split the input data files to be loaded manually, determine the degree of parallelism, and coordinate all of the SQL*Loader processes. In short, it could be done, but it was hard.
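
For a sense of what that old approach involved, the manual scripting might have looked something like the following sketch (the control and data file names are invented; each invocation loads its own pre-split slice of the input):

$ sqlldr big_table/big_table control=load1.ctl data=big_table_part1.dat direct=true parallel=true &
$ sqlldr big_table/big_table control=load2.ctl data=big_table_part2.dat direct=true parallel=true &
$ sqlldr big_table/big_table control=load3.ctl data=big_table_part3.dat direct=true parallel=true &
$ sqlldr big_table/big_table control=load4.ctl data=big_table_part4.dat direct=true parallel=true &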

With parallel DDL plus external tables, we have a parallel direct path load that is implemented via a simple CREATE TABLE AS SELECT or INSERT /*+ APPEND */. No more scripting, no more splitting of files, and no more coordinating the N number of scripts that would be running. In short, this combination provides pure ease of use, without a loss of performance.

Let's take a look at a simple example of this in action. We'll see shortly how to create an external table. We'll look at data loading with external tables in much more detail in the next chapter. For now, we'll use a "real" table to load another table from, much like many people do with staging tables in their data warehouse. The technique in short is as follows:

  1. Use some extract, transform, load (ETL) tool to create input files.
  2. Load these input files into staging tables.
  3. Load a new table using queries against these staging tables.

We'll use the same BIG_TABLE from earlier, which is parallel-enabled and contains 10 million records. We're going to join this table to a second table, USER_INFO, which contains OWNER-related information from the ALL_USERS dictionary view. The goal is to denormalize this information into a flat structure.

We'll start by creating the USER_INFO table, enabling it for parallel operations, and then gathering statistics on it:

big_table@ORA10GR1> create table user_info as select * from all_users;
Table created.

big_table@ORA10GR1> alter table user_info parallel;
Table altered.

big_table@ORA10GR1> exec dbms_stats.gather_table_stats( user, 'USER_INFO' );
PL/SQL procedure successfully completed.

Now, we would like to parallel direct path load a new table with this information. The query we'll use is simply

create table new_table parallel
as
select a.*, b.user_id, b.created user_created
  from big_table a, user_info b
 where a.owner = b.username

The plan for that particular CREATE TABLE AS SELECT looked like this in Oracle 10g:

---------------------------------------------------------------------------
| Id  | Operation                | Name      |    TQ  |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------
|   0 | CREATE TABLE STATEMENT   |           |        |      |            |
|   1 |  PX COORDINATOR          |           |        |      |            |
|   2 |   PX SEND QC (RANDOM)    | :TQ10001  |  Q1,01 | P->S | QC (RAND)  |
|   3 |    LOAD AS SELECT        |           |  Q1,01 | PCWP |            |
|*  4 |     HASH JOIN            |           |  Q1,01 | PCWP |            |
|   5 |      PX RECEIVE          |           |  Q1,01 | PCWP |            |
|   6 |       PX SEND BROADCAST  | :TQ10000  |  Q1,00 | P->P | BROADCAST  |
|   7 |        PX BLOCK ITERATOR |           |  Q1,00 | PCWC |            |
|   8 |         TABLE ACCESS FULL| USER_INFO |  Q1,00 | PCWP |            |
|   9 |      PX BLOCK ITERATOR   |           |  Q1,01 | PCWC |            |
|  10 |       TABLE ACCESS FULL  | BIG_TABLE |  Q1,01 | PCWP |            |
---------------------------------------------------------------------------

If you look at the steps from 4 on down, that is the query (SELECT) component. The scan of BIG_TABLE and hash join to USER_INFO was performed in parallel, and each of the subresults was loaded into a portion of the table (step 3, the LOAD AS SELECT). After each of the parallel execution servers finishes its part of the join and load, it sends its results up to the query coordinator. In this case, the results simply indicated "success" or "failure," as the work had already been performed.

And that is all there is to it—parallel direct path loads made easy. The most important thing to consider with these operations is how space is used (or not used). Of particular importance is a side effect called extent trimming. I'd like to spend some time investigating that now.

Parallel DDL and Extent Trimming

Parallel DDL relies on direct path operations. That is, the data is not passed to the buffer cache to be written later; rather, an operation such as a CREATE TABLE AS SELECT will create new extents and write directly to them, and the data goes straight from the query to disk, in those newly allocated extents. Each parallel execution server performing its part of the CREATE TABLE AS SELECT will write to its own extent. The INSERT /*+ APPEND */ (a direct path insert) writes "above" a segment's HWM, and each parallel execution server will again write to its own set of extents, never sharing them with other parallel execution servers. Therefore, if you do a parallel CREATE TABLE AS SELECT and use four parallel execution servers to create the table, then you will have at least four extents—maybe more. But each of the parallel execution servers will allocate its own extent, write to it and, when it fills up, allocate another new extent. The parallel execution servers will never use an extent allocated by some other parallel execution server.

Figure 14-3 depicts this process. We have a CREATE TABLE NEW_TABLE AS SELECT being executed by four parallel execution servers. In the figure, each parallel execution server is represented by a different color (white, light gray, dark gray, or black). The boxes in the "disk drum" represent the extents that were created in some data file by this CREATE TABLE statement. Each extent is presented in one of the aforementioned four colors, for the simple reason that all of the data in any given extent was loaded by only one of the four parallel execution servers—P003 is depicted as having created and then loaded four of these extents. P000, on the other hand, is depicted as having five extents, and so on.

image

Figure 14-3. Parallel DDL extent allocation depiction

This sounds all right at first, but in a data warehouse environment, this can lead to "wastage" after a large load. Let's say you want to load 1,010MB of data (about 1GB), and you are using a tablespace with 100MB extents. You decide to use ten parallel execution servers to load this data. Each would start by allocating its own 100MB extent (there will be ten of them in all) and filling it up. Since each has 101MB of data to load, it would fill up its first extent and then proceed to allocate another 100MB extent, of which it would use 1MB. You now have 20 extents, 10 of which are full and 10 of which have 1MB each, and 990MB is "allocated but not used." This space could be used the next time you load, but right now you have 990MB of dead space. This is where extent trimming comes in. Oracle will attempt to take the last extent of each parallel execution server and "trim" it back to the smallest size possible.

Extent Trimming and Dictionary-Managed Tablespaces

If you are using legacy dictionary-managed tablespaces, then Oracle will be able to convert each of the 100MB extents that contain just 1MB of data into 1MB extents. Unfortunately, that would (in dictionary-managed tablespaces) tend to leave ten noncontiguous 99MB extents free, and since your allocation scheme was for 100MB extents, this 990MB of space would not be very useful! The next allocation of 100MB would likely not be able to use the existing space, since it would be 99MB of free space, followed by 1MB of allocated space, followed by 99MB of free space, and so on. We will not review the dictionary-managed approach further in this book.

Extent Trimming and Locally-Managed Tablespaces

Enter locally-managed tablespaces. Here we have two types: UNIFORM SIZE, whereby every extent in the tablespace is always precisely the same size, and AUTOALLOCATE, whereby Oracle decides how big each extent should be using an internal algorithm. Both of these approaches solve nicely the problem of the 99MB of free space, followed by 1MB of used space, followed by 99MB of free space, and so on, resulting in lots of free space that cannot be used. However, they each solve it very differently.

The UNIFORM SIZE approach takes extent trimming out of consideration altogether. When you use UNIFORM SIZE extents, Oracle cannot perform extent trimming. All extents are of that single size—none can be smaller (or larger) than that single size.

AUTOALLOCATE extents, on the other hand, do support extent trimming, but in an intelligent fashion. They use a few specific sizes of extents and have the ability to use space of different sizes—that is, the algorithm permits the use of all free space over time in the tablespace. Unlike the dictionary-managed tablespace, where if you request a 100MB extent, Oracle will fail the request if it can find only 99MB free extents (so close, yet so far), a locally-managed tablespace with AUTOALLOCATE extents can be more flexible. It may reduce the size of the request it was making in order to attempt to use all of the free space.

Let's now look at the differences between the two locally-managed tablespace approaches. To do that, we need a real-life example to work with. We'll set up an external table capable of being used in a parallel direct path load situation, which is something that we do frequently. Even if you are still using SQL*Loader to parallel direct path load data, this section applies entirely—you just have manual scripting to do to actually load the data. So, in order to investigate extent trimming, we need to set up our example load and then perform the loads under varying conditions and examine the results.

Setting Up

To get started, we need an external table. I've found time and time again that I have a legacy control file from SQL*Loader that I used to use to load data. One that looks like this, for example:

LOAD DATA
INFILE '/tmp/big_table.dat'
INTO TABLE big_table
REPLACE
FIELDS TERMINATED BY '|'
(
id, owner, object_name, subobject_name, object_id
,data_object_id, object_type, created, last_ddl_time
,timestamp, status, temporary, generated, secondary
)

We can convert this easily into an external table definition using SQL*Loader itself:

$ sqlldr big_table/big_table big_table.ctl external_table=generate_only
SQL*Loader: Release 10.1.0.3.0 - Production on Mon Jul 11 14:16:20 2005
Copyright (c) 1982, 2004, Oracle. All rights reserved.

Notice the parameter EXTERNAL_TABLE passed to SQL*Loader. It causes SQL*Loader in this case to not load data, but rather to generate a CREATE TABLE statement for us in the log file. This CREATE TABLE statement looked as follows (this is an abridged form; I've edited out repetitive elements to make the example smaller):

CREATE TABLE "SYS_SQLLDR_X_EXT_BIG_TABLE"
(
  "ID" NUMBER,
 ...
 "SECONDARY" VARCHAR2(1)
)
ORGANIZATION external
(
  TYPE oracle_loader
  DEFAULT DIRECTORY SYS_SQLLDR_XT_TMPDIR_00000
  ACCESS PARAMETERS
(
    RECORDS DELIMITED BY NEWLINE CHARACTERSET WE8ISO8859P1
    BADFILE 'SYS_SQLLDR_XT_TMPDIR_00000':'big_table.bad'
    LOGFILE 'big_table.log_xt'
    READSIZE 1048576
    FIELDS TERMINATED BY "|" LDRTRIM
    REJECT ROWS WITH ALL NULL FIELDS
   (
      "ID" CHAR(255)
        TERMINATED BY "|",
        ....
      "SECONDARY" CHAR(255)
TERMINATED BY "|"
    )
  )
  location
  (
    'big_table.dat'
  )
)REJECT LIMIT UNLIMITED

All we need to do is edit that a bit to name the external table the way we want; change the directories, perhaps; and so on:

ops$tkyte@ORA10GR1> create or replace directory my_dir as '/tmp/'
  2  /
Directory created.

And after that, all we need to do is actually create the table:

ops$tkyte@ORA10GR1> CREATE TABLE "BIG_TABLE_ET"
  2  (
  3    "ID" NUMBER,
...
 16    "SECONDARY" VARCHAR2(1)
 17  )
 18  ORGANIZATION external
 19  (
 20    TYPE oracle_loader
 21    DEFAULT DIRECTORY MY_DIR
 22    ACCESS PARAMETERS
 23    (
 24      RECORDS DELIMITED BY NEWLINE CHARACTERSET WE8ISO8859P1
 25      READSIZE 1048576
 26      FIELDS TERMINATED BY "|" LDRTRIM
 27      REJECT ROWS WITH ALL NULL FIELDS
 28    )
 29    location
 30    (
 31      'big_table.dat'
 32    )
 33  )REJECT LIMIT UNLIMITED
 34  /
Table created.

Then we make this table parallel enabled. This is the magic step—this is what will facilitate an easy parallel direct path load:

ops$tkyte@ORA10GR1> alter table big_table_et PARALLEL;
Table altered.

Note The PARALLEL clause may also be used on the CREATE TABLE statement itself. Right after the REJECT LIMIT UNLIMITED, the keyword PARALLEL could have been added. I used the ALTER statement just to draw attention to the fact that the external table is, in fact, parallel enabled.


Extent Trimming with UNIFORM vs. AUTOALLOCATE Locally-Managed Tablespaces

That's all we need to do to set up the load component. Now, we would like to investigate how space is managed in a locally-managed tablespace (LMT) that uses UNIFORM extent sizes, compared to how space is managed in an LMT that AUTOALLOCATEs extents. In this case, we'll use 100MB extents. First we create LMT_UNIFORM, which uses uniform extent sizes:

ops$tkyte@ORA10GR1> create tablespace lmt_uniform
  2  datafile '/u03/ora10gr1/lmt_uniform.dbf' size 1048640K reuse
  3  autoextend on next 100m
  4  extent management local
  5  uniform size 100m;
Tablespace created.

Next, we create LMT_AUTO, which uses AUTOALLOCATE to determine extent sizes:

ops$tkyte@ORA10GR1> create tablespace lmt_auto
  2  datafile '/u03/ora10gr1/lmt_auto.dbf' size 1048640K reuse
  3  autoextend on next 100m
  4  extent management local
  5  autoallocate;
Tablespace created.

Each tablespace started with a 1GB data file (plus 64KB used by LMTs to manage the storage; it would be 128KB extra instead of 64KB if we were to use a 32KB blocksize). We permit these data files to autoextend 100MB at a time. We are going to load this file:

$ ls -lag big_table.dat
-rw-rw-r--    1 tkyte    1067107251 Jul 11 13:46 big_table.dat

which is a 10,000,000-record file. It was created using the big_table.sql script found in the "Setting Up" section at the beginning of this book and then unloaded using the flat.sql script available at http://asktom.oracle.com/~tkyte/flat/index.html. Next, we do a parallel direct path load of this file into each tablespace:

ops$tkyte@ORA10GR1> create table uniform_test
  2  parallel
  3  tablespace lmt_uniform
  4  as
  5  select * from big_table_et;
Table created.

ops$tkyte@ORA10GR1> create table autoallocate_test
  2  parallel
  3  tablespace lmt_auto
  4  as
  5  select * from big_table_et;
Table created.

On my system, which has four CPUs, these CREATE TABLE statements executed with eight parallel execution servers and one coordinator. I verified that was the case by querying one of the dynamic performance views related to parallel execution, V$PX_SESSION, while these statements were running:

sys@ORA10GR1> select sid, serial#, qcsid, qcserial#, degree
  2  from v$px_session;

       SID    SERIAL#      QCSID  QCSERIAL#     DEGREE
---------- ---------- ---------- ---------- ----------
       137         17        154        998          8
       139         13        154        998          8
       141         17        154        998          8
       150        945        154        998          8
       161        836        154        998          8
       138          8        154        998          8
       147         15        154        998          8
       143         41        154        998          8
       154        998        154

9 rows selected.

Note In creating the UNIFORM_TEST and AUTOALLOCATE_TEST tables, we simply specified "parallel" on each table, with Oracle choosing the degree of parallelism. In this case, I was the sole user of the machine (all resources available) and Oracle defaulted it to 8 based on the number of CPUs (four) and the PARALLEL_THREADS_PER_CPU parameter setting, which defaults to 2.
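
If you want to see the inputs behind that default on your own system, a quick look from SQL*Plus would be something like this (just a sketch; we are only displaying the relevant parameters, not changing them):

sys@ORA10GR1> show parameter cpu_count
sys@ORA10GR1> show parameter parallel_threads_per_cpu
sys@ORA10GR1> show parameter parallel_max_servers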


The SID,SERIAL# are the identifiers of the parallel execution sessions, and the QCSID,QCSERIAL# is the identifier of the query coordinator of the parallel execution. So, with eight parallel execution sessions running, we would like to see how the space was used. A quick query against USER_SEGMENTS gives us a good idea:

ops$tkyte@ORA10GR1> select segment_name, blocks, extents
  2  from user_segments
  3 where segment_name in ( 'UNIFORM_TEST', 'AUTOALLOCATE_TEST' );

SEGMENT_NAME        BLOCKS    EXTENTS
--------------- ---------- ----------
UNIFORM_TEST        204800         16
AUTOALLOCATE_TEST   145592        714

Since we were using an 8KB blocksize, that works out to a difference of about 462MB; or, looking at it from a ratio perspective, AUTOALLOCATE_TEST is about 70 percent the size of UNIFORM_TEST as far as allocated space goes. If we look at the actual used space,

ops$tkyte@ORA10GR1> exec show_space('UNIFORM_TEST' );
Free Blocks.............................          59,224
Total Blocks............................         204,800
Total Bytes.............................   1,677,721,600
Total MBytes............................           1,600
Unused Blocks...........................               0
Unused Bytes............................               0
Last Used Ext FileId....................               6
Last Used Ext BlockId...................               9
Last Used Block.........................          12,800
PL/SQL procedure successfully completed.

ops$tkyte@ORA10GR1> exec show_space('AUTOALLOCATE_TEST' );
Free Blocks.............................              16
Total Blocks............................         145,592
Total Bytes.............................   1,192,689,664
Total MBytes............................           1,137
Unused Blocks...........................               0
Unused Bytes............................               0
Last Used Ext FileId....................               8
Last Used Ext BlockId...................              41
Last Used Block.........................               8
PL/SQL procedure successfully completed.

Note The SHOW_SPACE procedure is described in the "Setting Up" section at the beginning of this book.


we can see that if we take away the blocks on the freelist for UNIFORM_TEST—59,224 of them—the tables consume about the same amount of actual space, but the amount of space needed by the UNIFORM tablespace is considerably more. This is all due to the extent trimming that did not take place. If we look at UNIFORM_TEST, we see this clearly:

ops$tkyte@ORA10GR1> select segment_name, extent_id, blocks
  2  from user_extents where segment_name = 'UNIFORM_TEST';

SEGMENT_NAME     EXTENT_ID     BLOCKS
--------------- ---------- ----------
UNIFORM_TEST             0      12800
UNIFORM_TEST             1      12800
UNIFORM_TEST             2      12800
UNIFORM_TEST             3      12800
UNIFORM_TEST             4      12800
UNIFORM_TEST             5      12800
UNIFORM_TEST             6      12800
UNIFORM_TEST             7      12800
UNIFORM_TEST             8      12800
UNIFORM_TEST             9      12800
UNIFORM_TEST            10      12800
UNIFORM_TEST            11      12800
UNIFORM_TEST            12      12800
UNIFORM_TEST            13      12800
UNIFORM_TEST            14      12800
UNIFORM_TEST            15      12800

16 rows selected.

Each extent is 100MB in size. Now, it would be a waste of paper to list all 714 extents, so let's look at them in aggregate:

ops$tkyte@ORA10GR1> select segment_name, blocks, count(*)
  2  from user_extents
  3  where segment_name = 'AUTOALLOCATE_TEST'
  4  group by segment_name, blocks
  5  /

SEGMENT_NAME          BLOCKS   COUNT(*)
----------------- ---------- ----------
AUTOALLOCATE_TEST          8        128
AUTOALLOCATE_TEST        128        504
AUTOALLOCATE_TEST        240          1
AUTOALLOCATE_TEST        392          1
AUTOALLOCATE_TEST        512          1
AUTOALLOCATE_TEST        656          1
AUTOALLOCATE_TEST        752          5
AUTOALLOCATE_TEST        768          1
AUTOALLOCATE_TEST       1024         72

9 rows selected.

This generally fits in with how LMTs with AUTOALLOCATE are observed to allocate space. The 8, 128, and 1,024 block extents are "normal"; we will observe them all of the time with AUTOALLOCATE. The rest, however, are not "normal"; we do not usually observe them. They are due to the extent trimming that takes place. Some of the parallel execution servers finished their part of the load—they took their last 8MB (1,024 blocks) extent and trimmed it, resulting in a spare bit left over. One of the other parallel execution sessions, as it needed space, could use this spare bit. In turn, as these other parallel execution sessions finished processing their own loads, they would trim their last extent and leave spare bits of space.

Which approach should you use? If your goal is to direct path load in parallel as often as possible, I suggest AUTOALLOCATE as your extent management policy. Parallel direct path operations like this will not use space under the object's HWM—the space on the freelist. Unless you also do some conventional path inserts into these tables, UNIFORM allocation will permanently have additional free space in it that it will never use. Unless you can size the extents for the UNIFORM LMT to be much smaller, you will see what I term excessive wastage over time—and remember that this space is associated with the segment and will be included in a full scan of the table.

To demonstrate this, let's do another parallel direct path load into these existing tables, using the same inputs:

ops$tkyte@ORA10GR1> alter session enable parallel dml;
Session altered.

ops$tkyte@ORA10GR1> insert /*+ append */ into UNIFORM_TEST
  2 select * from big_table_et;
10000000 rows created.

ops$tkyte@ORA10GR1> insert /*+ append */ into AUTOALLOCATE_TEST
  2 select * from big_table_et;
10000000 rows created.

ops$tkyte@ORA10GR1> commit;
Commit complete.

If we compare the space utilization of the two tables after that operation as follows:

ops$tkyte@ORA10GR1> exec show_space( 'UNIFORM_TEST' );
Free Blocks.............................         118,463
Total Blocks............................         409,600
Total Bytes.............................   3,355,443,200
Total MBytes............................           3,200
Unused Blocks...........................               0
Unused Bytes............................               0
Last Used Ext FileId....................               6
Last Used Ext BlockId...................         281,609
Last Used Block.........................          12,800

PL/SQL procedure successfully completed.

ops$tkyte@ORA10GR1> exec show_space( 'AUTOALLOCATE_TEST' );
Free Blocks.............................              48
Total Blocks............................         291,184
Total Bytes.............................   2,385,379,328
Total MBytes............................           2,274
Unused Blocks...........................               0
Unused Bytes............................               0
Last Used Ext FileId....................               8
Last Used Ext BlockId...................         140,025
Last Used Block.........................               8

PL/SQL procedure successfully completed.

we can see that as we load more and more data into the table UNIFORM_TEST using parallel direct path operations, the space utilization gets worse over time. We would want to use a significantly smaller uniform extent size or use AUTOALLOCATE instead. AUTOALLOCATE may well generate more extents over time, but the space utilization is superior due to the extent trimming that takes place.
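
For reference, the extent management policy is chosen when the tablespace is created, not on the table itself. The following is a minimal sketch of the two policies compared above; the tablespace names, file names, and sizes are illustrative assumptions, not the settings used for these tests:

-- every extent will be exactly 100MB; there is nothing to trim
create tablespace uniform_ts
  datafile '/u01/oradata/uniform_ts_01.dbf' size 2g
  extent management local uniform size 100m;

-- Oracle chooses the extent sizes; direct path loads can trim the last extent
create tablespace autoallocate_ts
  datafile '/u01/oradata/autoallocate_ts_01.dbf' size 2g
  extent management local autoallocate;

With UNIFORM, every extent is exactly the stated size, which is what rules out trimming; with AUTOALLOCATE, Oracle is free to allocate smaller extents, which is what makes the extent trimming described above possible.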

Parallel Recovery

Another form of parallel execution in Oracle is the ability to perform parallel recovery. Parallel recovery may be performed at the instance level, perhaps by increasing the speed of a recovery that needs to be performed after a software, operating system, or general system failure. Parallel recovery may also be applied during media recovery (e.g., restoration from backups). It is not my goal to cover recovery-related topics in this book, so I'll just mention the existence of parallel recovery in passing. I recommend the following Oracle manuals for further reading on the topic:

  • Oracle Backup and Recovery Basics for information regarding parallel media recovery
  • Oracle Performance Tuning Guide for information regarding parallel instance recovery

Procedural Parallelism

I would like to discuss two types of procedural parallelism:

  • Parallel pipelined functions, which are a feature of Oracle.
  • "Do-it-yourself (DIY) parallelism," which is the application to your own applications of the same techniques that Oracle applies to parallel full table scans. DIY parallelism is more of a development technique than anything built into Oracle directly.

Many times you'll find that applications—typically batch processes—designed to execute serially will look something like the following procedure:

create or replace procedure process_data
as
begin
    for x in ( select * from some_table )
    loop
        -- perform complex process on X
        -- update some other table, or insert the record somewhere else
        null;
    end loop;
end;

In this case, Oracle's parallel query or PDML won't help a bit (in fact, parallel execution of the SQL by Oracle here would likely only cause the database to consume more resources and take longer). If Oracle were to execute the simple SELECT * FROM SOME_TABLE in parallel, it would provide this algorithm no apparent increase in speed whatsoever. If Oracle were to perform in parallel the UPDATE or INSERT after the complex process, it would have no positive effect (it is a single-row UPDATE/INSERT, after all).

There is one obvious thing you could do here: use array processing for the UPDATE/INSERT after the complex process. However, that isn't going to give you a 50 percent reduction or more in runtime, and often that is what you are looking for. Don't get me wrong, you definitely want to implement array processing for the modifications here, but it won't make this process run two, three, four, or more times faster.
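
As an aside, here is a minimal sketch of what that array processing might look like, assuming the hypothetical SOME_TABLE and SOME_OTHER_TABLE from the pseudocode above (and assuming the target table has the same shape as the source), with a batch size of 100:

create or replace procedure process_data
as
    type some_table_array is table of some_table%rowtype;
    l_data some_table_array;
    cursor c is select * from some_table;
begin
    open c;
    loop
        -- fetch up to 100 rows at a time instead of one by one
        fetch c bulk collect into l_data limit 100;
        for i in 1 .. l_data.count
        loop
            null; -- complex process on l_data(i) here
        end loop;
        -- array insert the processed batch in a single call
        forall i in 1 .. l_data.count
            insert into some_other_table values l_data(i);
        exit when c%notfound;
    end loop;
    close c;
end;
/

The fetch and the insert are now done in batches, but the complex per-row processing itself is still serial; that is why array processing alone rarely buys more than an incremental improvement here.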

Now, suppose this process runs at night on a machine with four CPUs, and it is the only activity taking place. You have observed that only one CPU is partially used on this system, and the disk system is not being used very much at all. Further, this process is taking hours, and every day it takes a little longer as more data is added. You need to reduce the runtime by many times—it needs to run four or eight times faster—so incremental percentage increases will not be sufficient. What can you do?

There are two approaches you can take. One approach is to implement a parallel pipelined function, whereby Oracle will decide on appropriate degrees of parallelism (assuming you have opted for that, which is recommended). Oracle will create the sessions, coordinate them, and run them, very much like the previous example with parallel DDL where, by using CREATE TABLE AS SELECT or INSERT /*+ APPEND */, Oracle fully automated parallel direct path loads for us. The other approach is DIY parallelism. We'll take a look at both approaches in the sections that follow.

Parallel Pipelined Functions

We'd like to take that very serial process PROCESS_DATA from earlier and have Oracle execute it in parallel for us. To accomplish this, we need to turn the routine "inside out." Instead of selecting rows from some table, processing them, and inserting them into another table, we will insert into another table the results of fetching some rows and processing them. We will remove the INSERT at the bottom of that loop and replace it in the code with a PIPE ROW clause. The PIPE ROW clause allows our PL/SQL routine to generate table data as its output, so we'll be able to SELECT from our PL/SQL process. The PL/SQL routine that used to procedurally process the data becomes a table, in effect, and the rows we fetch and process are the outputs. We've seen this many times throughout this book every time we've issued the following:

Select * from table(dbms_xplan.display);

That is a PL/SQL routine that reads the PLAN_TABLE; restructures the output, even to the extent of adding rows; and then outputs this data using PIPE ROW to send it back to the client. We're going to do the same thing here in effect, but we'll allow for it to be processed in parallel.

We're going to use two tables in this example: T1 and T2. T1 is the table we were reading previously, and T2 is the table we need to move this information into. Assume this is some sort of ETL process we run to take the transactional data from the day and convert it into reporting information for tomorrow. The two tables we'll use are as follows:

ops$tkyte-ORA10G> create table t1
  2  as
  3  select object_id id, object_name text
  4    from all_objects;
Table created.

ops$tkyte-ORA10G> begin
  2      dbms_stats.set_table_stats
  3      ( user, 'T1', numrows=>10000000, numblks=>100000 );
  4  end;
  5  /
PL/SQL procedure successfully completed.

ops$tkyte-ORA10G> create table t2
  2  as
  3  select t1.*, 0 session_id
  4    from t1
  5   where 1=0;
Table created.

We used DBMS_STATS to "trick" the optimizer into thinking that there are 10,000,000 rows in that input table and that it consumes 100,000 database blocks. We want to simulate a big table here. The second table, T2, is simply a copy of the first table's structure with the addition of a SESSION_ID column. That column will be useful to actually "see" the parallelism that takes place.

Next, we need to set up object types for our pipelined function to return. The object type is simply a structural definition of the "output" of the procedure we are converting. In this case, it looks just like T2:

ops$tkyte-ORA10G> CREATE OR REPLACE TYPE t2_type
  2  AS OBJECT (
  3   id         number,
  4   text       varchar2(30),
  5   session_id number
  6  )
  7  /
Type created.

ops$tkyte-ORA10G> create or replace type t2_tab_type
  2  as table of t2_type
  3  /
Type created.

And now for the pipelined function, which is simply the original PROCESS_DATA procedure rewritten. The procedure is now a function that produces rows. It accepts as an input the data to process in a ref cursor. The function returns a T2_TAB_TYPE, the type we just created. It is a pipelined function that is PARALLEL_ENABLED. The partition clause we are using says to Oracle, "Partition, or slice up, the data by any means that work best. We don't need to make any assumptions about the order of the data."

You may also use hash or range partitioning on a specific column in the ref cursor. That would involve using a strongly typed ref cursor, so the compiler knows what columns are available. Hash partitioning would just send equal numbers of rows to each parallel execution server to process, based on a hash of the column supplied. Range partitioning would send nonoverlapping ranges of data to each parallel execution server, based on the partitioning key. For example, if you range partitioned on ID, each parallel execution server might get ranges 1...10000, 10001...20000, 20001...30000, and so on (ID values in those ranges).
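
For completeness, here is a sketch of what a hash-partitioned variant of the specification might look like. It assumes a hypothetical package REFCUR_PKG defining a strongly typed ref cursor over T1; it is not the function we'll actually use below, and the SESSION_ID column is simply filled with zero to keep the sketch short:

create or replace package refcur_pkg
as
    -- strongly typed ref cursor so the compiler knows the ID column exists
    type t1_cur is ref cursor return t1%rowtype;
end;
/

create or replace function parallel_pipelined_hashed
( l_cursor in refcur_pkg.t1_cur )
return t2_tab_type
pipelined
parallel_enable ( partition l_cursor by hash(id) )
is
    l_rec t1%rowtype;
begin
    loop
        fetch l_cursor into l_rec;
        exit when l_cursor%notfound;
        -- complex process here, just as in the BY ANY version below
        pipe row( t2_type( l_rec.id, l_rec.text, 0 ) );
    end loop;
    close l_cursor;
    return;
end;
/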

Here, we just want the data split up. How the data is split up is not relevant to our processing, so our definition looks like this:

ops$tkyte-ORA10G> create or replace
  2  function parallel_pipelined( l_cursor in sys_refcursor )
  3  return t2_tab_type
  4  pipelined
  5  parallel_enable ( partition l_cursor by any )

We'd like to be able to see what rows were processed by which parallel execution servers, so we'll declare a local variable L_SESSION_ID and initialize it from V$MYSTAT:

  6
  7  is
  8      l_session_id number;
  9      l_rec        t1%rowtype;
 10  begin
 11      select sid into l_session_id
 12        from v$mystat
 13       where rownum =1;

Now we are ready to process the data. We simply fetch out a row (or rows, as we could certainly use BULK COLLECT here to array process the ref cursor), perform our complex process on it, and pipe it out. When the ref cursor is exhausted of data, we close the cursor and return:

 14      loop
 15          fetch l_cursor into l_rec;
 16          exit when l_cursor%notfound;
 17          -- complex process here
 18          pipe row(t2_type(l_rec.id,l_rec.text,l_session_id));
 19      end loop;
 20      close l_cursor;
 21      return;
 22  end;
 23  /
Function created.
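
As noted, the fetch loop could just as easily be array based. A sketch of that variant of the loop body follows, assuming the L_REC declaration were replaced with a collection (say, TYPE T1_ARRAY IS TABLE OF T1%ROWTYPE and L_RECS T1_ARRAY) and a LIMIT of 100; everything else in the function would stay the same:

    loop
        -- fetch up to 100 rows at a time from the ref cursor
        fetch l_cursor bulk collect into l_recs limit 100;
        for i in 1 .. l_recs.count
        loop
            -- complex process here
            pipe row( t2_type( l_recs(i).id, l_recs(i).text, l_session_id ) );
        end loop;
        exit when l_cursor%notfound;
    end loop;

For the rest of this example, we'll stick with the simple row-at-a-time version we just created.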

And that's it. We're ready to process the data in parallel, letting Oracle figure out based on the resources available what the most appropriate degree of parallelism is:

ops$tkyte-ORA10G> alter session enable parallel dml;
Session altered.

ops$tkyte-ORA10G> insert /*+ append */
  2  into t2(id,text,session_id)
  3  select *
  4  from table(parallel_pipelined
  5            (CURSOR(select /*+ parallel(t1) */ *
  6                      from t1 )
  7             ))
  8  /
48250 rows created.

ops$tkyte-ORA10G> commit;
Commit complete.

Just to see what happened here, we can query the newly inserted data out and group by SESSION_ID to see first how many parallel execution servers were used, and second how many rows each processed:

ops$tkyte-ORA10G> select session_id, count(*)
  2    from t2
  3   group by session_id;

SESSION_ID    COUNT(*)
---------- ----------
       241       8040
       246       8045
       253       8042
       254       8042
       258       8040
       260       8041
6 rows selected.

Apparently, we used six parallel execution servers for the SELECT component of this parallel operation, and each one processed about 8,040 records.

As you can see, Oracle parallelized our process, but we underwent a fairly radical rewrite of our process. This is a long way from the original implementation. So, while Oracle can process our routine in parallel, we may well not have any routines that are coded to be parallelized. If a rather large rewrite of your procedure is not feasible, you may be interested in the next implementation: DIY parallelism.

Do-It-Yourself Parallelism

Say we have that same process as in the preceding section: the serial, simple procedure. We cannot afford a rather extensive rewrite of the implementation, but we would like to execute it in parallel. What can we do? My approach many times has been to use rowid ranges to break up the table into some number of ranges that don't overlap (yet completely cover the table).

This is very similar to how Oracle performs a parallel query conceptually. If you think of a full table scan, Oracle processes that by coming up with some method to break the table into many "small" tables, each of which is processed by a parallel execution server. We are going to do the same thing using rowid ranges. In early releases, Oracle's parallel implementation actually used rowid ranges itself.

We'll use a BIG_TABLE of 1,000,000 rows, as this technique works best on big tables with lots of extents, and the method I use for creating rowid ranges depends on extent boundaries. The more extents used, the better the data distribution. So, after creating the BIG_TABLE with 1,000,000 rows, we'll create T2 like this:

big_table-ORA10G> create table t2
  2  as
  3  select object_id id, object_name text, 0 session_id
  4    from big_table
  5   where 1=0;
Table created.

We are going to use the job queues built into the database to parallel process our procedure. We will schedule some number of jobs. Each job is our procedure slightly modified to just process the rows in a given rowid range.


Note In Oracle 10g, you could use the scheduler for something this simple, but in order to make the example 9i compatible, we'll use the job queues here.


To efficiently support the job queues, we'll use a parameter table to pass inputs to our jobs:

big_table-ORA10G> create table job_parms
  2  ( job        number primary key,
  3    lo_rid rowid,
  4    hi_rid rowid
  5  )
  6  /
Table created.

This will allow us to just pass the job ID into our procedure, so it can query this table to get the rowid range it is to process. Now for our procedure. The additions to the original are the lookup of our inputs from JOB_PARMS and the rowid-range predicate:

big_table-ORA10G> create or replace
  2  procedure serial( p_job in number )
  3  is
  4      l_rec        job_parms%rowtype;
  5  begin
  6      select * into l_rec
  7        from job_parms
  8      where job = p_job;
  9
 10     for x in ( select object_id id, object_name text
 11                  from big_table
 12                 where rowid between l_rec.lo_rid
 13                                 and l_rec.hi_rid )
 14     loop
 15         -- complex process here
 16         insert into t2 (id, text, session_id )
 17         values ( x.id, x.text, p_job );
 18     end loop;
 19
 20     delete from job_parms where job = p_job;
 21     commit;
 22  end;
 23  /
Procedure created.

As you can see, it is not a significant change. Most of the added code was simply to get our inputs and the rowid range to process. The only change to our logic was the addition of the predicate on lines 12 and 13.

Now let's schedule our jobs. We'll use a rather complex query using analytics to divide the table. The innermost query on lines 19 through 26 breaks the data into eight groups in this case. The first sum on line 22 is computing a running total of the sum of blocks; the second sum on line 23 is the total number of blocks. If we integer divide the running total by the desired "chunk size" (the total size divided by 8 in this case), we can create groups of files/blocks that cover about the same amount of data. The query on lines 8 through 28 finds the high and low file numbers and block numbers by GRP, and returns the distinct entries. It builds the inputs we can then send to DBMS_ROWID to create the rowids Oracle wants. We take that output and, using DBMS_JOB, submit a job to process the rowid range:

big_table-ORA10G> declare
  2          l_job number;
  3  begin
  4  for x in (
  5  select dbms_rowid.rowid_create
            ( 1, data_object_id, lo_fno, lo_block, 0 ) min_rid,
  6         dbms_rowid.rowid_create
            ( 1, data_object_id, hi_fno, hi_block, 10000 ) max_rid
  7    from (
  8  select distinct grp,
  9         first_value(relative_fno)
              over (partition by grp order by relative_fno, block_id
 10           rows between unbounded preceding and unbounded following) lo_fno,
 11         first_value(block_id  )
              over (partition by grp order by relative_fno, block_id
 12           rows between unbounded preceding and unbounded following) lo_block,
 13         last_value(relative_fno)
              over (partition by grp order by relative_fno, block_id
 14           rows between unbounded preceding and unbounded following) hi_fno,
 15         last_value(block_id+blocks-1)
              over (partition by grp order by relative_fno, block_id
 16           rows between unbounded preceding and unbounded following) hi_block,
 17         sum(blocks) over (partition by grp) sum_blocks
 18    from (
 19  select relative_fno,
 20         block_id,
 21         blocks,
 22         trunc( (sum(blocks) over (order by relative_fno, block_id)-0.01) /
 23                (sum(blocks) over ()/8) ) grp
 24    from dba_extents
 25   where segment_name = upper('BIG_TABLE')
 26     and owner = user order by block_id
 27         )
 28         ),
 29         (select data_object_id
               from user_objects where object_name = upper('BIG_TABLE') )
 30  )
 31  loop
 32          dbms_job.submit( l_job, 'serial(JOB);' );
 33          insert into job_parms(job, lo_rid, hi_rid)
 34          values ( l_job, x.min_rid, x.max_rid );
 35  end loop;
 36  end;
 37  /
PL/SQL procedure successfully completed.

That PL/SQL block would have scheduled up to eight jobs for us (fewer if the table could not be broken into eight pieces due to insufficient extents or size). We can see how many jobs were scheduled and what their inputs are as follows:

big_table-ORA10G> select * from job_parms;

       JOB LO_RID             HI_RID
---------- ------------------ ------------------
       172 AAAT7tAAEAAAAkpAAA AAAT7tAAEAAABQICcQ
       173 AAAT7tAAEAAABQJAAA AAAT7tAAEAAABwICcQ
       174 AAAT7tAAEAAABwJAAA AAAT7tAAEAAACUICcQ
       175 AAAT7tAAEAAACUJAAA AAAT7tAAEAAAC0ICcQ
       176 AAAT7tAAEAAAC0JAAA AAAT7tAAEAAADMICcQ
       177 AAAT7tAAEAAADaJAAA AAAT7tAAEAAAD6ICcQ
       178 AAAT7tAAEAAAD6JAAA AAAT7tAAEAAAEaICcQ
       179 AAAT7tAAEAAAEaJAAA AAAT7tAAEAAAF4ICcQ
8 rows selected.

big_table-ORA10G> commit;
Commit complete.

That commit released our jobs for processing. We have JOB_QUEUE_PROCESSES set to a nonzero value in the parameter file (high enough to run all eight jobs at once), so all eight started running and shortly finished.
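
If JOB_QUEUE_PROCESSES were left at zero, the jobs would never run. A minimal example of setting it dynamically (the value 16 here is simply an illustration, not the value used for this test):

alter system set job_queue_processes = 16;

Any value at least as large as the number of jobs you want running concurrently will do; a smaller nonzero value simply causes the jobs to run fewer at a time. The results of our run are as follows: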

big_table-ORA10G> select session_id, count(*)
  2    from t2
  3   group by session_id;

SESSION_ID   COUNT(*)
---------- ----------
       172    130055
       173    130978
       174    130925
       175    129863
       176    106154
       177    140772
       178    140778
       179    90475
8 rows selected.

It's not as evenly distributed as the Oracle built-in parallelism in this case, but it's pretty good. If you recall, earlier you saw how many rows were processed by each parallel execution server and, using the built-in parallelism, the row counts were very close to each other (they were off only by one or two). Here we had a job that processed as few as 90,475 rows and one that processed as many as 140,778. Most of them processed about 130,000 rows in this case.

Suppose, however, that you do not want to use the rowid processing—perhaps the query is not as simple as SELECT * FROM T and involves joins and other constructs that make using the rowid impractical. You can use the primary key of some table instead. For example, say you want to break that same BIG_TABLE into ten pieces to be processed concurrently by primary key. You can do that easily using the NTILE built-in analytic function. The process is rather straightforward:

big_table-ORA10G> select nt, min(id), max(id), count(*)
  2    from (
  3  select id, ntile(10) over (order by id) nt
  4    from big_table
  5         )
  6   group by nt;

        NT    MIN(ID)    MAX(ID)   COUNT(*)
---------- ---------- ---------- ----------
         1          1     100000     100000
         2     100001     200000     100000
         3     200001     300000     100000
         4     300001     400000     100000
         5     400001     500000     100000
         6     500001     600000     100000
         7     600001     700000     100000
         8     700001     800000     100000
         9     800001     900000     100000
        10     900001    1000000     100000

10 rows selected.

Now you have ten nonoverlapping primary key ranges—all of nice, equal size—that you can use to implement the same DBMS_JOB technique as shown earlier to parallelize your process.
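
To illustrate, here is a minimal sketch of how those primary key ranges could feed the same job-queue pattern. The JOB_PARMS_PK table and SERIAL_PK procedure are hypothetical names; SERIAL_PK would be the same procedure as before, with a predicate of ID BETWEEN L_REC.LO_ID AND L_REC.HI_ID replacing the rowid predicate:

create table job_parms_pk
( job   number primary key,
  lo_id number,
  hi_id number
);

declare
    l_job number;
begin
    for x in ( select nt, min(id) lo_id, max(id) hi_id
                 from ( select id, ntile(10) over (order by id) nt
                          from big_table )
                group by nt )
    loop
        dbms_job.submit( l_job, 'serial_pk(JOB);' );
        insert into job_parms_pk ( job, lo_id, hi_id )
        values ( l_job, x.lo_id, x.hi_id );
    end loop;
    commit;
end;
/

As before, the COMMIT releases the jobs for processing.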

Summary

In this chapter, we explored the concept of parallel execution in Oracle. I started by presenting an analogy to help frame where and when parallel execution is applicable, namely when we have long-running statements or procedures and plenty of available resources.

Then we looked at how Oracle can employ parallelism. We started with parallel query and how Oracle can break large serial operations, such as a full scan, into smaller pieces that can run concurrently. We moved on to parallel DML (PDML) and covered the rather extensive list of restrictions that accompany it.

Then we looked at the sweet spot for parallel operations: parallel DDL. Parallel DDL is a tool for the DBA and developer alike to quickly perform those large maintenance operations typically done during off-peak times when resources are available. We briefly touched on the fact that Oracle provides parallel recovery before we moved on to discuss procedural parallelism. Here we saw two techniques for parallelizing our procedures: one where Oracle does it and the other where we do it ourselves.

If we're designing a process from scratch, we might well consider designing it to allow Oracle to parallelize it for us, as the future addition or reduction of resources would easily permit the degree of parallelism to vary. However, if we have existing code that needs to quickly be "fixed" to be parallel, we may opt for do-it-yourself (DIY) parallelism, which we covered by examining two techniques, rowid ranges and primary key ranges, both of which use DBMS_JOB to carry out the job in parallel in the background for us.
