Parallel execution, a feature of Oracle Enterprise Edition (it is not available in the Standard Edition), was first introduced in Oracle version 7.1.6 in 1994. It is the ability to physically break a large serial task (any DML or DDL, in general) into many smaller bits that may all be processed simultaneously. Parallel execution in Oracle mimics the real-life processes we see all of the time. Rarely would you expect to see a single individual build a house; it is far more common for many teams of people to work concurrently to rapidly assemble the house. In that way, certain operations can be divided into smaller tasks and performed concurrently. For example, the plumbing and electrical wiring can take place at the same time to reduce the total amount of time required for the job as a whole.
Parallel execution in Oracle follows much the same logic. It is often possible for Oracle to divide a certain large "job" into smaller parts and to perform each part concurrently. For example, if a full table scan of a large table is required, there is no reason why Oracle cannot have four parallel sessions, P001-P004, perform the full scan together, with each session reading a different portion of the table. If the data scanned by P001-P004 needs to be sorted, this could be carried out by four more parallel sessions, P005-P008, which could ultimately send the results to an overall coordinating session for the query.
Parallel execution is a tool that, when wielded properly, may reduce the response time of certain operations by orders of magnitude. When it's wielded as a "fast = true" switch, the results are typically quite the opposite. In this chapter, the goal is not to explain precisely how parallel query is implemented in Oracle, the myriad combinations of plans that can result from parallel operations, and the like. I feel that much of that material is covered quite well in the Oracle Administrator's Guide, the Oracle Concepts Guide, and, in particular, the Oracle Data Warehousing Guide. This chapter's goal is to give you an understanding of what class of problems parallel execution is and is not appropriate for. Specifically, after looking at when to use parallel execution, we will cover the following:
Parallel DML (PDML) allows you to perform modifications (INSERT, UPDATE, DELETE, and MERGE) using parallel processing. In this chapter, we'll look at PDML and discuss some of the inherent limitations associated with it.

Parallel execution can be fantastic. It can allow you to take a process that executes over many hours or days and complete it in minutes. Breaking down a huge problem into small components may, in some cases, dramatically reduce the processing time. However, one underlying concept that will be useful to keep in mind while considering parallel execution is summarized by this very short quote from Practical Oracle8i: Building Efficient Databases (Addison-Wesley, 2001) by Jonathan Lewis:
PARALLEL QUERY option is essentially nonscalable.
Parallel execution is essentially a nonscalable solution. It was designed to allow an individual user or a particular SQL statement to consume all resources of a database. If you have a feature that allows an individual to make use of everything that is available, and then you allow two individuals to use that feature, you'll have obvious contention issues. As the number of concurrent users on your system begins to overwhelm the number of resources you have (memory, CPU, and I/O), the ability to deploy parallel operations becomes questionable. If you have a four-CPU machine, for example, and on average you have 32 users executing queries simultaneously, then the odds are that you do not want to parallelize their operations. If you allowed each user to perform just a "parallel 2" query, then you would now have 64 concurrent operations taking place on a machine with only four CPUs. If the machine were not overwhelmed before parallel execution, it almost certainly would be now.
In short, parallel execution can also be a terrible idea. In many cases, the application of parallel processing will only lead to increased resource consumption, as parallel execution attempts to use all available resources. In a system where resources must be shared by many concurrent transactions, such as an OLTP system, you would likely observe increased response times as a result. Oracle forgoes certain execution techniques that it can use efficiently in a serial execution plan and adopts execution paths such as full scans, in the hope that performing many pieces of the larger, bulk operation in parallel will be faster than the serial plan. Parallel execution, when applied inappropriately, may be the cause of your performance problem, not the solution for it.
So, before applying parallel execution, you need the following two things to be true:
You must have a very large task to perform, such as a long-running query or bulk operation whose runtime is measured in minutes or hours, not seconds.
You must have sufficient free resources (CPU, I/O, and memory) to dedicate to the parallel processes.
If you have a small task, as generally typified by the queries carried out in an OLTP system, or you have insufficient available resources, again as is typical in an OLTP system where CPU and I/O resources are often already used to their maximum, then parallel execution is not something you'll want to consider.
I often use an analogy to describe parallel processing and why you need both a large task and sufficient free resources in the database. It goes like this: suppose you have two tasks to complete. The first is to write a one-page summary of a new product. The other is to write a ten-chapter comprehensive report, with each chapter being very much independent of the others. For example, consider this book. This chapter, "Parallel Execution," is very much separate and distinct from the chapter titled "Redo and Undo"—they did not have to be written sequentially.
How do you approach each task? Which one do you think would benefit from parallel processing?
One-Page Summary
In this analogy, the one-page summary you have been assigned is not a large task. You would either do it yourself or assign it to a single individual. Why? Because the amount of work required to "parallelize" this process would exceed the work needed just to write the paper yourself. You would have to sit down, figure out that there should be 12 paragraphs, determine that each paragraph is not dependent on the other paragraphs, hold a team meeting, pick 12 individuals, explain to them the problem and assign each person a paragraph, act as the coordinator and collect all of their paragraphs, sequence them into the right order, verify they are correct, and then print the report. This is all likely to take longer than it would to just write the paper yourself, serially. The overhead of managing a large group of people on a project of this scale will far outweigh any gains to be had from having the 12 paragraphs written in parallel.
The exact same principle applies to parallel execution in the database. If you have a job that takes seconds or less to complete serially, then the introduction of parallel execution and its associated managerial overhead will likely make the entire thing take longer.
Ten-Chapter Report
Now let's examine the second task. If you want that ten-chapter report fast—as fast as possible—the slowest way to accomplish it would be to assign all of the work to a single individual (trust me, I know—look at this book! Some days I wished there were 15 of me working on it). Instead, you would split the work among ten individuals: you would hold the meeting, review the process, assign the work, act as the coordinator, collect the results, bind up the finished report, and deliver it. The work would not have been done in one-tenth the time, but perhaps one-eighth or so. Again, I say this with the proviso that you have sufficient free resources. If you have a large staff that is currently not actually doing anything, then splitting up the work makes complete sense.
However, consider that as the manager, your staff is multitasking and they have a lot on their plates. In that case, you have to be careful with that big project. You need to be sure not to overwhelm your staff; you don't want to work them beyond the point of exhaustion. You can't delegate out more work than your resources (your people) can cope with, otherwise they'll quit. If your staff is already fully utilized, adding more work will cause all schedules to slip and all projects to be delayed.
Parallel execution in Oracle is very much the same. If you have a task that takes many minutes, hours, or days, then the introduction of parallel execution may be the thing that makes it run eight times faster. But then again, if you are already seriously low on resources (the overworked team of people), then the introduction of parallel execution would be something to avoid, as the system will become even more bogged down. While the Oracle server processes won't "quit" in protest, they could start running out of RAM and failing, or just suffer from such long waits for I/O or CPU as to make it appear as if they were doing no work whatsoever.
If you keep that in mind, remembering never to take an analogy to illogical extremes, you'll have the commonsense guiding rule to see if parallelism can be of some use. If you have a job that takes seconds, it is doubtful that parallel execution can be used to make it go faster—the converse would be more likely. If you are low on resources already (i.e., your resources are fully utilized), adding parallel execution would likely make things worse, not better. Parallel execution is excellent for when you have a really big job and plenty of excess capacity. In this chapter, we'll take a look at some of the ways we can exploit those resources.
Parallel query allows a single SQL SELECT statement to be divided into many smaller queries, with each component query being run concurrently, and then the results from each combined to provide the final answer. For example, consider the following query:
big_table@ORA10G> select count(status) from big_table;
Using parallel query, this query could use some number of parallel sessions; break the BIG_TABLE into small, nonoverlapping slices; and ask each parallel session to read the table and count its section of rows. The parallel query coordinator for this session would then receive each of the aggregated counts from the individual parallel sessions and further aggregate them, returning the final answer to the client application. Graphically, it might look like Figure 14-1.
The P000, P001, P002, and P003 processes are known as parallel execution servers, sometimes also referred to as parallel query (PQ) slaves. Each of these parallel execution servers is a separate session connected as if it were a dedicated server process. Each one is responsible for scanning a nonoverlapping region of BIG_TABLE, aggregating its results subset, and sending back its output to the coordinating server—the original session's server process—which will aggregate the subresults into the final answer.
Figure 14-1. Parallel select count (status) depiction
We can see this in an explain plan. Using a BIG_TABLE with 10 million rows in it, we'll walk through enabling a parallel query for that table and discover how we can "see" parallel query in action. This example was performed on a four-CPU machine with default values for all parallel parameters; that is, this is an out-of-the-box installation where only necessary parameters were set, including SGA_TARGET (set to 1GB), CONTROL_FILES, DB_BLOCK_SIZE (set to 8KB), and PGA_AGGREGATE_TARGET (set to 512MB). Initially, we would expect to see the following plan:
big_table@ORA10GR1> explain plan for
2 select count(status) from big_table;
Explained.
big_table@ORA10GR1> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------
Plan hash value: 1287793122
----------------------------------------------------------------------------
| Id | Operation | Name | Rows|Bytes | Cost (%CPU)|Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 17 | 32390 (2)|00:06:29 |
| 1 | SORT AGGREGATE | | 1 | 17 | | |
| 2 | TABLE ACCESS FULL| BIG_TABLE | 10M| 162M| 32390 (2)|00:06:29 |
----------------------------------------------------------------------------
That is a typical serial plan. No parallelism is involved because we did not request parallel query to be enabled, and by default it will not be.
We may enable parallel query in a variety of ways, including the use of a hint directly in the query or altering the table to enable parallel execution paths to be considered (the latter is the option we use here).
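For example, the hint-based approach (a sketch; the degree of 4 is purely illustrative) would look like this:

```sql
select /*+ parallel(big_table, 4) */ count(status)
  from big_table;
```

The hint requests that a parallel plan be considered for this query alone, leaving the table's attributes untouched.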
We can specifically dictate the degree of parallelism to be considered in execution paths against this table. We can tell Oracle, "We would like you to use parallel degree 4 when creating execution plans against this table," for example:
big_table@ORA10GR1> alter table big_table parallel 4;
Table altered.
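We can confirm the attribute in the data dictionary (note that USER_TABLES stores DEGREE as a string, so a value such as '4' or 'DEFAULT' is what we expect to see):

```sql
select table_name, degree
  from user_tables
 where table_name = 'BIG_TABLE';
```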
I prefer to just tell Oracle, "Please consider parallel execution, but you figure out the appropriate degree of parallelism based on the current system workload and the query itself." That is, let the degree of parallelism vary over time as the workload on the system increases and decreases. If we have plenty of free resources, the degree of parallelism will go up; in times of limited available resources, the degree of parallelism will go down. Rather than overload the machine with a fixed degree of parallelism, this approach allows Oracle to dynamically increase or decrease the amount of concurrent resources required by the query.
Therefore, we simply enable parallel query against this table via the following ALTER TABLE command:
big_table@ORA10GR1> alter table big_table parallel;
Table altered.
That is all there is to it—parallel query will now be considered for operations against this table. When we rerun the explain plan, this time we see the following:
big_table@ORA10GR1> explain plan for
2 select count(status) from big_table;
Explained.
big_table@ORA10GR1> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
-----------------------------------------
Plan hash value: 1651916128
----------------------------------------------------------------------------
|Id | Operation | Name |Cost(%CPU)| TQ |IN-OUT|PQ Distrib |
----------------------------------------------------------------------------
| 0| SELECT STATEMENT | | 4465 (1)| | | |
| 1| SORT AGGREGATE | | | | | |
| 2| PX COORDINATOR | | | | | |
| 3| PX SEND QC (RANDOM) | :TQ10000 | |Q1,00| P->S |QC(RAND) |
| 4| SORT AGGREGATE | | |Q1,00| PCWP | |
| 5| PX BLOCK ITERATOR | | 4465 (1)|Q1,00| PCWC | |
| 6| TABLE ACCESS FULL| BIG_TABLE| 4465 (1)|Q1,00| PCWP | |
----------------------------------------------------------------------------
Note The ROWS, BYTES, and TIME columns were removed from this plan output to allow it to fit on the page. The estimated time for the query, however, was 00:00:54 as opposed to the previous estimate of 00:06:29 for the serial plan. Remember, these are estimates, not promises! Also, this is the plan output from Oracle 10g—the plan output from Oracle9i would have less detail (four steps instead of seven), but the net effect is identical.
If you read this plan from the bottom up, starting at ID=6, it shows the steps described in Figure 14-1. The full table scan would be split up into many "smaller scans" (step 5). Each of those would aggregate their COUNT(STATUS) values (step 4). These subresults would be transmitted to the parallel query coordinator (steps 2 and 3), which would aggregate the results further (step 1) and output the answer.
If you were to actually execute this query on a freshly started system (where no other parallel execution had been done), you might observe the following. Here we are using the Linux ps command to find processes that represent parallel query processes (we expect to find none), running the query with parallel execution enabled, and then looking for those processes again:
big_table@ORA10GR1> host ps -auxww | grep '^ora10gr1.*ora_p00._ora10g'
big_table@ORA10GR1> select count(status) from big_table;
COUNT(STATUS)
-------------
10000000
big_table@ORA10GR1> host ps -auxww | grep '^ora10gr1.*ora_p00._ora10g'
ora10gr1 3411 35.5 0.5 1129068 12200 ? S 13:27 0:02 ora_p000_ora10gr1
ora10gr1 3413 28.0 0.5 1129064 12196 ? S 13:27 0:01 ora_p001_ora10gr1
ora10gr1 3415 26.0 0.5 1129064 12196 ? S 13:27 0:01 ora_p002_ora10gr1
ora10gr1 3417 23.3 0.5 1129044 12212 ? S 13:27 0:01 ora_p003_ora10gr1
ora10gr1 3419 19.5 0.5 1129040 12228 ? S 13:27 0:01 ora_p004_ora10gr1
ora10gr1 3421 19.1 0.5 1129056 12188 ? S 13:27 0:01 ora_p005_ora10gr1
ora10gr1 3423 19.0 0.5 1129056 12164 ? S 13:27 0:01 ora_p006_ora10gr1
ora10gr1 3425 21.6 0.5 1129048 12204 ? S 13:27 0:01 ora_p007_ora10gr1
As we can see, there are now eight parallel execution servers that have been started by Oracle. If we are curious enough to want to "watch" parallel query, we can easily do so using two sessions. In the session we will run the parallel query in, we'll start by determining our SID:
big_table@ORA10GR1> select sid from v$mystat where rownum = 1;
SID
----------
162
Now, in another session, we get this query ready to run:
ops$tkyte@ORA10GR1> select sid, qcsid, server#, degree
2 from v$px_session
3 where qcsid = 162
Shortly after starting the parallel query in the session with SID=162, we come back to this second session and run the query:
4 /
SID QCSID SERVER# DEGREE
---------- ---------- ---------- ----------
145 162 1 8
150 162 2 8
147 162 3 8
151 162 4 8
146 162 5 8
152 162 6 8
143 162 7 8
144 162 8 8
162 162
9 rows selected.
We see here that our parallel query session (SID=162) is the query coordinator SID (QCSID) for nine rows in this dynamic performance view. Our session is "coordinating" or controlling these parallel query resources now. We can see each has its own SID; in fact, each is a separate Oracle session and shows up as such in V$SESSION during the execution of our parallel query:
ops$tkyte@ORA10GR1> select sid, username, program
2 from v$session
3 where sid in ( select sid
4 from v$px_session
5 where qcsid = 162 )
6 /
SID USERNAME PROGRAM
---------- ------------------------------ -------------------------------
143 BIG_TABLE oracle@dellpe (P005)
144 BIG_TABLE oracle@dellpe (P002)
145 BIG_TABLE oracle@dellpe (P006)
146 BIG_TABLE oracle@dellpe (P004)
147 BIG_TABLE oracle@dellpe (P003)
150 BIG_TABLE oracle@dellpe (P001)
151 BIG_TABLE oracle@dellpe (P000)
153 BIG_TABLE oracle@dellpe (P007)
162 BIG_TABLE sqlplus@dellpe (TNS V1-V3)
9 rows selected.
Note If a parallel execution is not occurring in your system, do not expect to see the parallel execution servers in V$SESSION. They will be in V$PROCESS, but will not have a session established unless they are being used; they will be connected to the database, but will not have a session established. See Chapter 5 for details on the difference between a session and a connection.
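While they will not appear in V$SESSION when idle, the pool of parallel execution servers can be examined directly via the V$PX_PROCESS dynamic performance view (a sketch; the rows returned depend on your system, and STATUS shows whether each server is currently in use or available):

```sql
select server_name, status, pid, spid
  from v$px_process;
```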
In a nutshell, that is how parallel query—and, in fact, parallel execution in general—works. It entails a series of parallel execution servers working in tandem to produce subresults that are fed either to other parallel execution servers for further processing or to the coordinator for the parallel query.
In this particular example, as depicted, we had BIG_TABLE spread across four separate devices, in a single tablespace (a tablespace with four data files). When implementing parallel execution, it is generally "optimal" to have your data spread over as many physical devices as possible. You can achieve this in a number of ways, for example, by physically spreading the BIG_TABLE data over many disks, or by using multiple data files in the tablespace so that Oracle can allocate extents for the BIG_TABLE segment in many files.

In general, parallel execution works best when given access to as many resources (CPU, memory, and I/O) as possible. However, that is not to say that nothing can be gained from parallel query if the entire set of data is on a single disk; you would perhaps just not gain as much as you would using multiple disks. The reason you would likely gain some speed in response time, even when using a single disk, is that when a given parallel execution server is counting rows, it is not reading them, and vice versa. So, two parallel execution servers may well be able to complete the counting of all rows in less time than a serial plan would.
Likewise, you can benefit from parallel query even on a single CPU machine. It is doubtful that a serial SELECT COUNT(*) would use 100 percent of the CPU on a single CPU machine—it would be spending part of its time performing (and waiting for) physical I/O to disk. Parallel query would allow you to fully utilize the resources (the CPU and I/O, in this case) on the machine, whatever those resources may be.
That final point brings us back to the earlier quote from Practical Oracle8i: Building Efficient Databases: parallel query is essentially nonscalable. If you allowed four sessions to simultaneously perform queries with two parallel execution servers on that single CPU machine, you would probably find their response times to be longer than if they just processed serially. The more processes clamoring for a scarce resource, the longer it will take to satisfy all requests.
And remember, parallel query requires two things to be true. First, you need to have a large task to perform—for example, a long-running query, the runtime of which is measured in minutes, hours, or days, not in seconds or subseconds. This implies that parallel query is not a solution to be applied in a typical OLTP system, where you are not performing long-running tasks. Enabling parallel execution on these systems is often disastrous. Second, you need ample free resources such as CPU, I/O, and memory. If you are lacking in any of these, then parallel query may well push your utilization of that resource over the edge, negatively impacting overall performance and runtime.
In the past, parallel query was considered mandatory for many data warehouses simply because in the past (say, in 1995) data warehouses were rare and typically had a very small, focused user base. Today in 2005, data warehouses are literally everywhere and support user communities that are as large as those found for many transactional systems. This means that you may well not have sufficient free resources at any given point in time to enable parallel query on these systems. That doesn't mean parallel execution in general is not useful in this case—it just might be more of a DBA tool, as we'll see in the section "Parallel DDL," rather than a parallel query tool.
The Oracle documentation limits the scope of parallel DML (PDML) to include only INSERT, UPDATE, DELETE, and MERGE (it does not include SELECT, as normal DML does). During PDML, Oracle may use many parallel execution servers to perform your INSERT, UPDATE, DELETE, or MERGE instead of a single serial process. On a multi-CPU machine with plenty of I/O bandwidth, the potential increase in speed may be large for mass DML operations.
However, you should not look to PDML as a feature to speed up your OLTP-based applications. As stated previously, parallel operations are designed to fully and totally maximize the utilization of a machine. They are designed so that a single user can completely use all of the disks, CPU, and memory on the machine. In certain data warehouses (with lots of data and few users), this is something you may want to achieve. In an OLTP system (with a lot of users all doing short, fast transactions), you do not want to give a user the ability to fully take over the machine resources.
This sounds contradictory: we use parallel query to scale up, so how could it not be scalable? When applied to an OLTP system, the statement is quite accurate. Parallel query is not something that scales up as the number of concurrent users increases. Parallel query was designed to allow a single session to generate as much work as 100 concurrent sessions would. In our OLTP system, we really do not want a single user to generate the work of 100 users.
PDML is useful in a large data warehousing environment to facilitate bulk updates to massive amounts of data. The PDML operation is executed in much the same way as a distributed query would be executed by Oracle, with each parallel execution server acting like a process in a separate database instance. Each slice of the table is modified by a separate thread with its own independent transaction (and hence its own undo segment, hopefully). After they are all done, the equivalent of a fast 2PC is performed to commit the separate, independent transactions. Figure 14-2 depicts a parallel update using four parallel execution servers. Each of the parallel execution servers has its own independent transaction; either all of them commit along with the PDML coordinating session, or none of them do.
We can actually observe the fact that there are separate independent transactions created for the parallel execution servers. We'll use two sessions again, as before. In the session with SID=162, we explicitly enable parallel DML. PDML differs from parallel query in that regard; unless you explicitly ask for it, you will not get it.
big_table@ORA10GR1> alter session enable parallel dml;
Session altered.
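The setting is per session, and it has a symmetric counterpart to turn it back off. Note that Oracle will not let you alter the session's parallel DML state while a transaction is active, so commit or roll back first (this sketch assumes no transaction is in progress):

```sql
alter session disable parallel dml;
```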
Figure 14-2. Parallel update (PDML) depiction
The fact that the table is "parallel" is not sufficient, as it was for parallel query. The reasoning behind the need to explicitly enable PDML in your session is the fact that PDML has certain limitations associated with it, which I list after this example.
In the same session, we do a bulk UPDATE that, because the table is "parallel enabled," will in fact be done in parallel:
big_table@ORA10GR1> update big_table set status = 'done';
In the other session, we'll join V$SESSION to V$TRANSACTION to show the active sessions for our PDML operation, as well as their independent transaction information:
ops$tkyte@ORA10GR1> select a.sid, a.program, b.start_time, b.used_ublk,
2 b.xidusn ||'.'|| b.xidslot || '.' || b.xidsqn trans_id
3 from v$session a, v$transaction b
4 where a.taddr = b.addr
5 and a.sid in ( select sid
6 from v$px_session
7 where qcsid = 162 )
8 order by sid
9 /
SID PROGRAM START_TIME USED_UBLK TRANS_ID
---- -------------------------- -------------------- ---------- ------------
136 oracle@dellpe (P009) 08/03/05 14:28:17 6256 18.9.37
137 oracle@dellpe (P013) 08/03/05 14:28:17 6369 21.39.225
138 oracle@dellpe (P015) 08/03/05 14:28:17 6799 24.16.1175
139 oracle@dellpe (P008) 08/03/05 14:28:17 6729 15.41.68
140 oracle@dellpe (P014) 08/03/05 14:28:17 6462 22.41.444
141 oracle@dellpe (P012) 08/03/05 14:28:17 6436 20.46.46
142 oracle@dellpe (P010) 08/03/05 14:28:17 6607 19.44.37
143 oracle@dellpe (P007) 08/03/05 14:28:17 1 17.12.46
144 oracle@dellpe (P006) 08/03/05 14:28:17 1 13.25.302
145 oracle@dellpe (P003) 08/03/05 14:28:17 1 1.21.1249
146 oracle@dellpe (P002) 08/03/05 14:28:17 1 14.42.49
147 oracle@dellpe (P005) 08/03/05 14:28:17 1 12.18.323
150 oracle@dellpe (P011) 08/03/05 14:28:17 6699 23.2.565
151 oracle@dellpe (P004) 08/03/05 14:28:17 1 16.26.46
152 oracle@dellpe (P001) 08/03/05 14:28:17 1 11.13.336
153 oracle@dellpe (P000) 08/03/05 14:28:17 1 2.29.1103
162 sqlplus@dellpe (TNS V1-V3) 08/03/05 14:25:46 2 3.13.2697
17 rows selected.
As we can see, there is more happening here than when we simply queried the table in parallel. We have 17 processes working on this operation, not just 9 as before. This is because the plan that was developed includes a step to update the table and independent steps to update the index entries. Looking at an edited explain plan output from DBMS_XPLAN (trailing columns were removed to permit the output to fit on the page), we see the following:
----------------------------------------------
| Id | Operation | Name |
----------------------------------------------
| 0 | UPDATE STATEMENT | |
| 1 | PX COORDINATOR | |
| 2 | PX SEND QC (RANDOM) | :TQ10001 |
| 3 | INDEX MAINTENANCE | BIG_TABLE |
| 4 | PX RECEIVE | |
| 5 | PX SEND RANGE | :TQ10000 |
| 6 | UPDATE | BIG_TABLE |
| 7 | PX BLOCK ITERATOR | |
| 8 | TABLE ACCESS FULL| BIG_TABLE |
-----------------------------------------------
As a result of the pseudo-distributed implementation of PDML, certain limitations are associated with it. For example, triggers are not supported during a PDML operation, PDML is not performed over database links, and after you modify a table with PDML, your transaction cannot read or further modify that table until you commit or roll back.
If you violate any of those restrictions, one of two things will happen: either the statement will be performed serially (no parallelism will be involved) or an error will be raised. For example, if you performed PDML against table T and then attempted to query table T before ending your transaction, you would receive an error.
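A minimal sketch of that last restriction follows, using the session and table from this example (ORA-12838 is the error Oracle raises in this situation):

```sql
big_table@ORA10GR1> update big_table set status = 'done';

big_table@ORA10GR1> select count(*) from big_table;
select count(*) from big_table
       *
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel
```

A commit or rollback ends the transaction, after which the table becomes readable again.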
I believe that parallel DDL is the real "sweet spot" of Oracle's parallel technology. As we've discussed, parallel execution is generally not appropriate for OLTP systems. In fact, for many data warehouses, parallel query is becoming less and less of an option. It used to be that a data warehouse was built for a very small, focused user community—sometimes consisting of just one or two analysts. However, over the last decade I've watched them grow from small user communities to user communities of hundreds or thousands. Consider a data warehouse front-ended by a web-based application: it could be instantly accessible to literally thousands of users.
But a DBA performing the large batch operations, perhaps during a maintenance window, is a different story. The DBA is still a single individual and he might have a huge machine with tons of computing resources available. The DBA has only "one thing to do": load this data, reorganize that table, rebuild that index. Without parallel execution, the DBA would be hard-pressed to really use the full capabilities of the hardware. With parallel execution, he can. The following SQL DDL commands permit "parallelization":
CREATE INDEX: Multiple parallel execution servers can scan the table, sort the data, and write the sorted segments out to the index structure.
CREATE TABLE AS SELECT: The query that executes the SELECT may be executed using parallel query, and the table load itself may be done in parallel.
ALTER INDEX REBUILD: The index structure may be rebuilt in parallel.
ALTER TABLE MOVE: A table may be moved in parallel.
ALTER TABLE SPLIT|COALESCE PARTITION: The individual table partitions may be split or coalesced in parallel.
ALTER INDEX SPLIT PARTITION: An index partition may be split in parallel.
The first four of these commands work for individual table/index partitions as well—that is, you may MOVE an individual partition of a table in parallel.
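For instance, a parallel index build might be issued as follows (the index name, column, and degree of 4 are illustrative):

```sql
create index big_table_status_idx
    on big_table(status)
       parallel 4;
```

Once built, many DBAs reset the attribute with ALTER INDEX big_table_status_idx NOPARALLEL, so that subsequent query plans against the index are not inadvertently parallelized.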
To me, parallel DDL is where the parallel execution in Oracle is of greatest measurable benefit. Sure, it can be used with parallel query to speed up certain long-running operations, but from a maintenance standpoint, and from an administration standpoint, parallel DDL is where the parallel operations affect us, DBAs and developers, the most. If you think of parallel query as being designed for the end user for the most part, then parallel DDL is designed for the DBA/developer.
One of my favorite new features in Oracle 9i is external tables, which are especially useful in the area of data loading. We'll cover data loading and external tables in some detail in the next chapter but, as a quick introduction, we'll take a brief look at these topics here to study the effects of parallel DDL on extent sizing and extent trimming.
External tables allow us to easily perform parallel direct path loads without thinking too hard about it. Oracle 7.1 gave us the ability to perform parallel direct path loads, whereby multiple sessions could write directly to the Oracle data files, bypassing the buffer cache entirely, bypassing undo for the table data, and perhaps even bypassing redo generation. This was accomplished via SQL*Loader. The DBA would have to script multiple SQL*Loader sessions, split the input data files to be loaded manually, determine the degree of parallelism, and coordinate all of the SQL*Loader processes. In short, it could be done, but it was hard.
With parallel DDL plus external tables, we have a parallel direct path load that is implemented via a simple CREATE TABLE AS SELECT or INSERT /*+ APPEND */. No more scripting, no more splitting of files, and no more coordinating the N number of scripts that would be running. In short, this combination provides pure ease of use, without a loss of performance.
Let's take a look at a simple example of this in action. We'll see shortly how to create an external table. We'll look at data loading with external tables in much more detail in the next chapter. For now, we'll use a "real" table to load another table from, much like many people do with staging tables in their data warehouse. The technique in short is as follows:
We'll use the same BIG_TABLE from earlier, which is parallel-enabled and contains 10 million records. We're going to join this table to a second table, USER_INFO, which contains OWNER-related information from the ALL_USERS dictionary view. The goal is to denormalize this information into a flat structure. We'll start by creating the USER_INFO table, enabling it for parallel operations, and then gathering statistics on it:
big_table@ORA10GR1> create table user_info as select * from all_users;
Table created.
big_table@ORA10GR1> alter table user_info parallel;
Table altered.
big_table@ORA10GR1> exec dbms_stats.gather_table_stats( user, 'USER_INFO' );
PL/SQL procedure successfully completed.
Now, we would like to parallel direct path load a new table with this information. The query we'll use is simply
create table new_table parallel
as
select a.*, b.user_id, b.created user_created
from big_table a, user_info b
where a.owner = b.username
The plan for that particular CREATE TABLE AS SELECT looked like this in Oracle 10g:
---------------------------------------------------------------------------
| Id | Operation | Name | TQ |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------
| 0 | CREATE TABLE STATEMENT | | | | |
| 1 | PX COORDINATOR | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10001 | Q1,01 | P->S | QC (RAND) |
| 3 | LOAD AS SELECT | | Q1,01 | PCWP | |
|* 4 | HASH JOIN | | Q1,01 | PCWP | |
| 5 | PX RECEIVE | | Q1,01 | PCWP | |
| 6 | PX SEND BROADCAST | :TQ10000 | Q1,00 | P->P | BROADCAST |
| 7 | PX BLOCK ITERATOR | | Q1,00 | PCWC | |
| 8 | TABLE ACCESS FULL| USER_INFO | Q1,00 | PCWP | |
| 9 | PX BLOCK ITERATOR | | Q1,01 | PCWC | |
| 10 | TABLE ACCESS FULL | BIG_TABLE | Q1,01 | PCWP | |
---------------------------------------------------------------------------
If you look at the steps from 4 on down, that is the query (SELECT) component. The scan of BIG_TABLE and the hash join to USER_INFO were performed in parallel, and each of the subresults was loaded into a portion of the table (step 3, the LOAD AS SELECT). After each of the parallel execution servers finishes its part of the join and load, it sends its results up to the query coordinator. In this case, the results simply indicated "success" or "failure," as the work had already been performed.
And that is all there is to it—parallel direct path loads made easy. The most important thing to consider with these operations is how space is used (or not used). Of particular importance is a side effect called extent trimming. I'd like to spend some time investigating that now.
Parallel DDL relies on direct path operations. That is, the data is not passed to the buffer cache to be written later; rather, an operation such as a CREATE TABLE AS SELECT will create new extents and write directly to them, and the data goes straight from the query to disk, in those newly allocated extents. Each parallel execution server performing its part of the CREATE TABLE AS SELECT will write to its own extent. The INSERT /*+ APPEND */ (a direct path insert) writes "above" a segment's HWM, and each parallel execution server will again write to its own set of extents, never sharing them with other parallel execution servers. Therefore, if you do a parallel CREATE TABLE AS SELECT and use four parallel execution servers to create the table, then you will have at least four extents—maybe more. Each of the parallel execution servers will allocate its own extent, write to it and, when it fills up, allocate another new extent; a parallel execution server will never use an extent allocated by some other parallel execution server.
Figure 14-3 depicts this process. We have a CREATE TABLE NEW_TABLE AS SELECT being executed by four parallel execution servers. In the figure, each parallel execution server is represented by a different color (white, light gray, dark gray, or black). The boxes in the "disk drum" represent the extents that were created in some data file by this CREATE TABLE statement. Each extent is presented in one of the aforementioned four colors, for the simple reason that all of the data in any given extent was loaded by only one of the four parallel execution servers—P003 is depicted as having created and then loaded four of these extents. P000, on the other hand, is depicted as having five extents, and so on.
Figure 14-3. Parallel DDL extent allocation depiction
This sounds all right at first, but in a data warehouse environment, this can lead to "wastage" after a large load. Let's say you want to load 1,010MB of data (about 1GB), and you are using a tablespace with 100MB extents. You decide to use ten parallel execution servers to load this data. Each would start by allocating its own 100MB extent (there will be ten of them in all) and filling it up. Since each has 101MB of data to load, it would fill up its first extent and then proceed to allocate another 100MB extent, of which it would use 1MB. You now have 20 extents, 10 of which are full and 10 of which have 1MB each, and 990MB is "allocated but not used." This space could be used the next time you load, but right now you have 990MB of dead space. This is where extent trimming comes in. Oracle will attempt to take the last extent of each parallel execution server and "trim" it back to the smallest size possible.
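The arithmetic behind that dead space is easy to check. Here is a minimal sketch, using the figures from the example above (a 1,010MB load, 100MB uniform extents, ten parallel execution servers):

```python
import math

def load_wastage(total_mb, extent_mb, servers):
    """Each parallel execution server loads its share into its own
    extents; the last extent of each server is typically only partly full."""
    per_server = total_mb / servers                      # 101MB apiece
    extents_each = math.ceil(per_server / extent_mb)     # 2 extents apiece
    allocated = extents_each * extent_mb * servers       # 2,000MB in total
    used_in_last = per_server - (extents_each - 1) * extent_mb   # 1MB
    dead = (extent_mb - used_in_last) * servers          # allocated but not used
    return extents_each * servers, allocated, dead

load_wastage(1010, 100, 10)   # (20, 2000, 990.0): 20 extents, 990MB of dead space
```

Extent trimming exists precisely to give back that 990MB: each server's last, mostly empty extent is shrunk down to the size actually used.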
Extent Trimming and Dictionary-Managed Tablespaces
If you are using legacy dictionary-managed tablespaces, then Oracle will be able to convert each of the 100MB extents that contain just 1MB of data into 1MB extents. Unfortunately, that would (in dictionary-managed tablespaces) tend to leave ten noncontiguous 99MB extents free, and since your allocation scheme was for 100MB extents, this 990MB of space would not be very useful! The next allocation of 100MB would likely not be able to use the existing space, since it would be 99MB of free space, followed by 1MB of allocated space, followed by 99MB of free space, and so on. We will not review the dictionary-managed approach further in this book.
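A tiny sketch makes the problem concrete (the 99MB/1MB layout is the one just described; the key constraint is that a dictionary-managed extent must come from a single contiguous chunk of free space):

```python
# After trimming: ten noncontiguous 99MB free chunks, each separated
# from the next by a 1MB used extent.
free_chunks = [99] * 10

def can_allocate(request_mb, free_chunks):
    # An extent must fit entirely within one contiguous free chunk.
    return any(chunk >= request_mb for chunk in free_chunks)

can_allocate(100, free_chunks)   # False: 990MB free in total, yet the request fails
can_allocate(99, free_chunks)    # True
```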
Extent Trimming and Locally-Managed Tablespaces
Enter locally-managed tablespaces. Here we have two types: UNIFORM SIZE, whereby every extent in the tablespace is always precisely the same size, and AUTOALLOCATE, whereby Oracle decides how big each extent should be using an internal algorithm. Both of these approaches solve nicely the problem of the 99MB of free space, followed by 1MB of used space, followed by 99MB of free space, and so on, resulting in lots of free space that cannot be used. However, they each solve it very differently.
The UNIFORM SIZE approach removes extent trimming from consideration altogether. When you use UNIFORM SIZE, Oracle cannot perform extent trimming: all extents are of that single size—none can be smaller (or larger) than it.
AUTOALLOCATE extents, on the other hand, do support extent trimming, but in an intelligent fashion. They use a few specific sizes of extents and have the ability to use space of different sizes—that is, the algorithm permits the use of all free space over time in the tablespace. Unlike the dictionary-managed tablespace, where if you request a 100MB extent, Oracle will fail the request if it can find only 99MB free extents (so close, yet so far), a locally-managed tablespace with AUTOALLOCATE extents can be more flexible. It may reduce the size of the request it was making in order to attempt to use all of the free space.
Let's now look at the differences between the two locally-managed tablespace approaches. To do that, we need a real-life example to work with. We'll set up an external table capable of being used in a parallel direct path load situation, which is something that we do frequently. Even if you are still using SQL*Loader to parallel direct path load data, this section applies entirely—you just have manual scripting to do to actually load the data. So, in order to investigate extent trimming, we need to set up our example load and then perform the loads under varying conditions and examine the results.
To get started, we need an external table. Time and time again, I've found that I have a legacy SQL*Loader control file that I once used to load data—one that looks like this, for example:
LOAD DATA
INFILE '/tmp/big_table.dat'
INTO TABLE big_table
REPLACE
FIELDS TERMINATED BY '|'
(
id, owner, object_name, subobject_name, object_id
,data_object_id, object_type, created, last_ddl_time
,timestamp, status, temporary, generated, secondary
)
We can convert this easily into an external table definition using SQL*Loader itself:
$ sqlldr big_table/big_table big_table.ctl external_table=generate_only
SQL*Loader: Release 10.1.0.3.0 - Production on Mon Jul 11 14:16:20 2005
Copyright (c) 1982, 2004, Oracle. All rights reserved.
Notice the parameter EXTERNAL_TABLE passed to SQL*Loader. In this case, it causes SQL*Loader not to load data, but rather to generate a CREATE TABLE statement for us in the log file. This CREATE TABLE statement looked as follows (this is an abridged form; I've edited out repetitive elements to make the example smaller):
CREATE TABLE "SYS_SQLLDR_X_EXT_BIG_TABLE"
(
"ID" NUMBER,
...
"SECONDARY" VARCHAR2(1)
)
ORGANIZATION external
(
TYPE oracle_loader
DEFAULT DIRECTORY SYS_SQLLDR_XT_TMPDIR_00000
ACCESS PARAMETERS
(
RECORDS DELIMITED BY NEWLINE CHARACTERSET WE8ISO8859P1
BADFILE 'SYS_SQLLDR_XT_TMPDIR_00000':'big_table.bad'
LOGFILE 'big_table.log_xt'
READSIZE 1048576
FIELDS TERMINATED BY "|" LDRTRIM
REJECT ROWS WITH ALL NULL FIELDS
(
"ID" CHAR(255)
TERMINATED BY "|",
....
"SECONDARY" CHAR(255)
TERMINATED BY "|"
)
)
location
(
'big_table.dat'
)
)REJECT LIMIT UNLIMITED
All we need to do is edit that a bit to name the external table the way we want; change the directories, perhaps; and so on:
ops$tkyte@ORA10GR1> create or replace directory my_dir as '/tmp/'
2 /
Directory created.
And after that, all we need to do is actually create the table:
ops$tkyte@ORA10GR1> CREATE TABLE "BIG_TABLE_ET"
2 (
3 "ID" NUMBER,
...
16 "SECONDARY" VARCHAR2(1)
17 )
18 ORGANIZATION external
19 (
20 TYPE oracle_loader
21 DEFAULT DIRECTORY MY_DIR
22 ACCESS PARAMETERS
23 (
24 RECORDS DELIMITED BY NEWLINE CHARACTERSET WE8ISO8859P1
25 READSIZE 1048576
26 FIELDS TERMINATED BY "|" LDRTRIM
27 REJECT ROWS WITH ALL NULL FIELDS
28 )
29 location
30 (
31 'big_table.dat'
32 )
33 )REJECT LIMIT UNLIMITED
34 /
Table created.
Then we make this table parallel enabled. This is the magic step—this is what will facilitate an easy parallel direct path load:
ops$tkyte@ORA10GR1> alter table big_table_et PARALLEL;
Table altered.
Note The PARALLEL clause may also be used on the CREATE TABLE statement itself: right after the REJECT LIMIT UNLIMITED, the keyword PARALLEL could have been added. I used the ALTER statement just to draw attention to the fact that the external table is, in fact, parallel enabled.
Extent Trimming with UNIFORM vs. AUTOALLOCATE Locally-Managed Tablespaces
That's all we need to do to set up the load component. Now, we would like to investigate how space is managed in a locally-managed tablespace (LMT) that uses UNIFORM extent sizes, compared to how space is managed in an LMT that AUTOALLOCATEs extents. In this case, we'll use 100MB extents. First we create LMT_UNIFORM, which uses uniform extent sizes:
ops$tkyte@ORA10GR1> create tablespace lmt_uniform
2 datafile '/u03/ora10gr1/lmt_uniform.dbf' size 1048640K reuse
3 autoextend on next 100m
4 extent management local
5 uniform size 100m;
Tablespace created.
Next, we create LMT_AUTO, which uses AUTOALLOCATE to determine extent sizes:
ops$tkyte@ORA10GR1> create tablespace lmt_auto
2 datafile '/u03/ora10gr1/lmt_auto.dbf' size 1048640K reuse
3 autoextend on next 100m
4 extent management local
5 autoallocate;
Tablespace created.
Each tablespace started with a 1GB data file (plus 64KB used by LMTs to manage the storage; it would be 128KB extra instead of 64KB if we were to use a 32KB blocksize). We permit these data files to autoextend 100MB at a time. We are going to load this file:
$ ls -lag big_table.dat
-rw-rw-r-- 1 tkyte 1067107251 Jul 11 13:46 big_table.dat
which is a 10,000,000-record file. It was created using the big_table.sql script found in the "Setting Up" section at the beginning of this book and then unloaded using the flat.sql script available at http://asktom.oracle.com/~tkyte/flat/index.html. Next, we do a parallel direct path load of this file into each tablespace:
ops$tkyte@ORA10GR1> create table uniform_test
2 parallel
3 tablespace lmt_uniform
4 as
5 select * from big_table_et;
Table created.
ops$tkyte@ORA10GR1> create table autoallocate_test
2 parallel
3 tablespace lmt_auto
4 as
5 select * from big_table_et;
Table created.
On my system, which has four CPUs, these CREATE TABLE statements executed with eight parallel execution servers and one coordinator. I verified that was the case by querying one of the dynamic performance views related to parallel execution, V$PX_SESSION, while these statements were running:
sys@ORA10GR1> select sid, serial#, qcsid, qcserial#, degree
2 from v$px_session;
SID SERIAL# QCSID QCSERIAL# DEGREE
---------- ---------- ---------- ---------- ----------
137 17 154 998 8
139 13 154 998 8
141 17 154 998 8
150 945 154 998 8
161 836 154 998 8
138 8 154 998 8
147 15 154 998 8
143 41 154 998 8
154 998 154
9 rows selected.
Note In creating the UNIFORM_TEST and AUTOALLOCATE_TEST tables, we simply specified "parallel" on each table, with Oracle choosing the degree of parallelism. In this case, I was the sole user of the machine (all resources available) and Oracle defaulted the degree to 8 based on the number of CPUs (four) and the PARALLEL_THREADS_PER_CPU parameter setting, which defaults to 2.
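That default is simple multiplication; a one-line sketch of the rule the Note describes:

```python
def default_dop(cpu_count, parallel_threads_per_cpu=2):
    # Default single-instance degree of parallelism: number of CPUs
    # times PARALLEL_THREADS_PER_CPU (which defaults to 2).
    return cpu_count * parallel_threads_per_cpu

default_dop(4)   # 8, matching the DEGREE column shown above
```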
The SID and SERIAL# columns are the identifiers of the parallel execution sessions, and QCSID and QCSERIAL# identify the query coordinator of the parallel execution. So, with eight parallel execution sessions running, we would like to see how the space was used. A quick query against USER_SEGMENTS gives us a good idea:
ops$tkyte@ORA10GR1> select segment_name, blocks, extents
2 from user_segments
3 where segment_name in ( 'UNIFORM_TEST', 'AUTOALLOCATE_TEST' );
SEGMENT_NAME BLOCKS EXTENTS
--------------- ---------- ----------
UNIFORM_TEST 204800 16
AUTOALLOCATE_TEST 145592 714
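The size difference those two rows imply can be computed directly, assuming the 8KB blocksize used throughout this example:

```python
def blocks_to_mb(blocks, block_kb=8):
    # Convert database blocks to megabytes for an 8KB blocksize.
    return blocks * block_kb / 1024

uniform_mb = blocks_to_mb(204800)   # 1,600MB allocated to UNIFORM_TEST
auto_mb = blocks_to_mb(145592)      # about 1,137MB allocated to AUTOALLOCATE_TEST
uniform_mb - auto_mb                # about 462MB difference
145592 / 204800                     # about 0.71: roughly 70 percent the size
```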
Since we were using an 8KB blocksize, that shows a difference of about 462MB; looking at it from a ratio perspective, AUTOALLOCATE_TEST is about 70 percent the size of UNIFORM_TEST as far as allocated space goes. If we look at the actual used space,
ops$tkyte@ORA10GR1> exec show_space('UNIFORM_TEST' );
Free Blocks............................. 59,224
Total Blocks............................ 204,800
Total Bytes............................. 1,677,721,600
Total MBytes............................ 1,600
Unused Blocks........................... 0
Unused Bytes............................ 0
Last Used Ext FileId.................... 6
Last Used Ext BlockId................... 9
Last Used Block......................... 12,800
PL/SQL procedure successfully completed.
ops$tkyte@ORA10GR1> exec show_space('AUTOALLOCATE_TEST' );
Free Blocks............................. 16
Total Blocks............................ 145,592
Total Bytes............................. 1,192,689,664
Total MBytes............................ 1,137
Unused Blocks........................... 0
Unused Bytes............................ 0
Last Used Ext FileId.................... 8
Last Used Ext BlockId................... 41
Last Used Block......................... 8
PL/SQL procedure successfully completed.
Note The SHOW_SPACE procedure is described in the "Setting Up" section at the beginning of this book.
we can see that if we take away the blocks on the freelist for UNIFORM_TEST—59,224 of them—the tables consume about the same amount of actual space, but the amount of space needed by the UNIFORM tablespace is considerably more. This is all due to the extent trimming that did not take place. If we look at UNIFORM_TEST, we see this clearly:
ops$tkyte@ORA10GR1> select segment_name, extent_id, blocks
2 from user_extents where segment_name = 'UNIFORM_TEST';
SEGMENT_NAME EXTENT_ID BLOCKS
--------------- ---------- ----------
UNIFORM_TEST 0 12800
UNIFORM_TEST 1 12800
UNIFORM_TEST 2 12800
UNIFORM_TEST 3 12800
UNIFORM_TEST 4 12800
UNIFORM_TEST 5 12800
UNIFORM_TEST 6 12800
UNIFORM_TEST 7 12800
UNIFORM_TEST 8 12800
UNIFORM_TEST 9 12800
UNIFORM_TEST 10 12800
UNIFORM_TEST 11 12800
UNIFORM_TEST 12 12800
UNIFORM_TEST 13 12800
UNIFORM_TEST 14 12800
UNIFORM_TEST 15 12800
16 rows selected.
Each extent is 100MB in size. Now, it would be a waste of paper to list all 714 extents, so let's look at them in aggregate:
ops$tkyte@ORA10GR1> select segment_name, blocks, count(*)
2 from user_extents
3 where segment_name = 'AUTOALLOCATE_TEST'
4 group by segment_name, blocks
5 /
SEGMENT_NAME BLOCKS COUNT(*)
----------------- ---------- ----------
AUTOALLOCATE_TEST 8 128
AUTOALLOCATE_TEST 128 504
AUTOALLOCATE_TEST 240 1
AUTOALLOCATE_TEST 392 1
AUTOALLOCATE_TEST 512 1
AUTOALLOCATE_TEST 656 1
AUTOALLOCATE_TEST 752 5
AUTOALLOCATE_TEST 768 1
AUTOALLOCATE_TEST 1024 72
9 rows selected.
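As a sanity check, the extent-size histogram above accounts exactly for the 145,592 blocks and 714 extents that USER_SEGMENTS reported for AUTOALLOCATE_TEST:

```python
# (blocks per extent, number of extents) pairs from the query above
extent_histogram = [
    (8, 128), (128, 504), (240, 1), (392, 1), (512, 1),
    (656, 1), (752, 5), (768, 1), (1024, 72),
]

total_blocks = sum(size * count for size, count in extent_histogram)
total_extents = sum(count for _, count in extent_histogram)
(total_blocks, total_extents)   # (145592, 714), matching the earlier output
```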
This generally fits in with how LMTs with AUTOALLOCATE are observed to allocate space. The 8, 128, and 1,024 block extents are "normal"; we will observe them all of the time with AUTOALLOCATE. The rest, however, are not "normal"; we do not usually observe them. They are due to the extent trimming that takes place. Some of the parallel execution servers finished their part of the load—they took their last 8MB (1,024 blocks) extent and trimmed it, resulting in a spare bit left over. One of the other parallel execution sessions, as it needed space, could use this spare bit. In turn, as these other parallel execution sessions finished processing their own loads, they would trim their last extent and leave spare bits of space.
Which approach should you use? If your goal is to direct path load in parallel as often as possible, I suggest AUTOALLOCATE as your extent management policy. Parallel direct path operations like this will not use space under the object's HWM—the space on the freelist. Unless you also do some conventional path inserts into these tables, UNIFORM allocation will permanently have additional free space that it will never use. Unless you can size the extents for the UNIFORM LMT to be much smaller, you will see what I term excessive wastage over time—and remember that this space is associated with the segment and will be included in a full scan of the table.
To demonstrate this, let's do another parallel direct path load into these existing tables, using the same inputs:
ops$tkyte@ORA10GR1> alter session enable parallel dml;
Session altered.
ops$tkyte@ORA10GR1> insert /*+ append */ into UNIFORM_TEST
2 select * from big_table_et;
10000000 rows created.
ops$tkyte@ORA10GR1> insert /*+ append */ into AUTOALLOCATE_TEST
2 select * from big_table_et;
10000000 rows created.
ops$tkyte@ORA10GR1> commit;
Commit complete.
If we compare the space utilization of the two tables after that operation as follows:
ops$tkyte@ORA10GR1> exec show_space( 'UNIFORM_TEST' );
Free Blocks............................. 118,463
Total Blocks............................ 409,600
Total Bytes............................. 3,355,443,200
Total MBytes............................ 3,200
Unused Blocks........................... 0
Unused Bytes............................ 0
Last Used Ext FileId.................... 6
Last Used Ext BlockId................... 281,609
Last Used Block......................... 12,800
PL/SQL procedure successfully completed.
ops$tkyte@ORA10GR1> exec show_space( 'AUTOALLOCATE_TEST' );
Free Blocks............................. 48
Total Blocks............................ 291,184
Total Bytes............................. 2,385,379,328
Total MBytes............................ 2,274
Unused Blocks........................... 0
Unused Bytes............................ 0
Last Used Ext FileId.................... 8
Last Used Ext BlockId................... 140,025
Last Used Block......................... 8
PL/SQL procedure successfully completed.
we can see that as we load more and more data into the table UNIFORM_TEST using parallel direct path operations, the space utilization gets worse over time. We would want to use a significantly smaller uniform extent size, or use AUTOALLOCATE. AUTOALLOCATE may well generate more extents over time, but the space utilization is superior due to the extent trimming that takes place.
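Putting numbers on "gets worse over time": from the SHOW_SPACE output above, the fraction of each segment sitting on the freelist after the second load is

```python
def pct_free(free_blocks, total_blocks):
    # Percentage of a segment's blocks that are free (on the freelist).
    return 100 * free_blocks / total_blocks

pct_free(118463, 409600)   # about 29 percent of UNIFORM_TEST is free space
pct_free(48, 291184)       # well under 0.1 percent for AUTOALLOCATE_TEST
```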
Another form of parallel execution in Oracle is the ability to perform parallel recovery. Parallel recovery may be performed at the instance level, perhaps by increasing the speed of a recovery that needs to be performed after a software, operating system, or general system failure. Parallel recovery may also be applied during media recovery (e.g., restoration from backups). It is not my goal to cover recovery-related topics in this book, so I'll just mention the existence of parallel recovery in passing. I recommend the following Oracle manuals for further reading on the topic:
I would like to discuss two types of procedural parallelism:
Many times you'll find that applications—typically batch processes—designed to execute serially will look something like the following procedure:
Create procedure process_data
As
Begin
For x in ( select * from some_table )
Loop
Perform complex process on X
Update some other table, or insert the record somewhere else
End loop
end
In this case, Oracle's parallel query or PDML won't help a bit (in fact, parallel execution of the SQL by Oracle here would likely only cause the database to consume more resources and take longer). If Oracle were to execute the simple SELECT * FROM SOME_TABLE in parallel, it would provide this algorithm no apparent increase in speed whatsoever. If Oracle were to perform the UPDATE or INSERT after the complex process in parallel, it would have no positive effect (it is a single-row UPDATE/INSERT, after all).
There is one obvious thing you could do here: use array processing for the UPDATE/INSERT after the complex process. However, that isn't going to give you a 50 percent reduction or more in runtime, and often that is what you are looking for. Don't get me wrong, you definitely want to implement array processing for the modifications here, but it won't make this process run two, three, four, or more times faster.
Now, suppose this process runs at night on a machine with four CPUs, and it is the only activity taking place. You have observed that only one CPU is partially used on this system, and the disk system is not being used very much at all. Further, this process is taking hours, and every day it takes a little longer as more data is added. You need to reduce the runtime by many times—it needs to run four or eight times faster—so incremental percentage increases will not be sufficient. What can you do?
There are two approaches you can take. One approach is to implement a parallel pipelined function, whereby Oracle will decide on appropriate degrees of parallelism (assuming you have opted for that, which is recommended). Oracle will create the sessions, coordinate them, and run them, very much like the previous example with parallel DDL, where, by using CREATE TABLE AS SELECT or INSERT /*+ APPEND */, Oracle fully automated parallel direct path loads for us. The other approach is DIY parallelism. We'll take a look at both approaches in the sections that follow.
We'd like to take that very serial process PROCESS_DATA from earlier and have Oracle execute it in parallel for us. To accomplish this, we need to turn the routine "inside out." Instead of selecting rows from some table, processing them, and inserting them into another table, we will insert into another table the results of fetching some rows and processing them. We will remove the INSERT at the bottom of that loop and replace it in the code with a PIPE ROW clause. The PIPE ROW clause allows our PL/SQL routine to generate table data as its output, so we'll be able to SELECT from our PL/SQL process. The PL/SQL routine that used to procedurally process the data becomes a table, in effect, and the rows we fetch and process are the outputs. We've seen this many times throughout this book every time we've issued the following:
Select * from table(dbms_xplan.display);
That is a PL/SQL routine that reads the PLAN_TABLE; restructures the output, even to the extent of adding rows; and then outputs this data using PIPE ROW to send it back to the client. We're going to do the same thing here, in effect, but we'll allow for it to be processed in parallel.
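If you have not met pipelined functions before, the "inside out" idea has a close analogy in languages with generators. Here is a minimal sketch in Python (the row source and the uppercase transformation are stand-ins for the real table and the "complex process," not Oracle APIs): instead of inserting each processed row inside the loop, the routine yields rows and becomes a row source its caller consumes.

```python
def process_data_pipelined(rows):
    """The loop no longer inserts anything; it yields each processed row,
    much as PIPE ROW turns a PL/SQL routine into a row source."""
    for rec in rows:
        # -- complex process here (a stand-in transformation) --
        yield (rec["id"], rec["text"].upper())

source = [{"id": 1, "text": "scott"}, {"id": 2, "text": "tkyte"}]
list(process_data_pipelined(source))   # [(1, 'SCOTT'), (2, 'TKYTE')]
```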
We're going to use two tables in this example: T1 and T2. T1 is the table we were reading previously, and T2 is the table we need to move this information into. Assume this is some sort of ETL process we run to take the transactional data from the day and convert it into reporting information for tomorrow. The two tables we'll use are as follows:
ops$tkyte-ORA10G> create table t1
2 as
3 select object_id id, object_name text
4 from all_objects;
Table created.
ops$tkyte-ORA10G> begin
2 dbms_stats.set_table_stats
3 ( user, 'T1', numrows=>10000000, numblks=>100000 );
4 end;
5 /
PL/SQL procedure successfully completed.
ops$tkyte-ORA10G> create table t2
2 as
3 select t1.*, 0 session_id
4 from t1
5 where 1=0;
Table created.
We used DBMS_STATS to "trick" the optimizer into thinking that there are 10,000,000 rows in that input table and that it consumes 100,000 database blocks. We want to simulate a big table here. The second table, T2, is simply a copy of the first table's structure with the addition of a SESSION_ID column. That column will be useful to actually "see" the parallelism that takes place.
Next, we need to set up object types for our pipelined function to return. The object type is simply a structural definition of the "output" of the procedure we are converting. In this case, it looks just like T2:
ops$tkyte-ORA10G> CREATE OR REPLACE TYPE t2_type
2 AS OBJECT (
3 id number,
4 text varchar2(30),
5 session_id number
6 )
7 /
Type created.
ops$tkyte-ORA10G> create or replace type t2_tab_type
2 as table of t2_type
3 /
Type created.
And now for the pipelined function, which is simply the original PROCESS_DATA procedure rewritten. The procedure is now a function that produces rows. It accepts as an input the data to process in a ref cursor. The function returns a T2_TAB_TYPE, the type we just created. It is a pipelined function that is PARALLEL_ENABLE. The partition clause we are using says to Oracle, "Partition, or slice up, the data by any means that work best. We don't need to make any assumptions about the order of the data."
You may also use hash or range partitioning on a specific column in the ref cursor. That would involve using a strongly typed ref cursor, so the compiler knows what columns are available. Hash partitioning would just send equal numbers of rows to each parallel execution server to process, based on a hash of the column supplied. Range partitioning would send nonoverlapping ranges of data to each parallel execution server, based on the partitioning key. For example, if you range partitioned on ID, each parallel execution server might get ranges 1...1000, 1001...20000, 20001...30000, and so on (ID values in that range).
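Conceptually, the two distribution schemes can be sketched as follows (a simplified illustration only; how Oracle actually distributes rows among parallel execution servers is internal):

```python
def route_by_hash(key, n_servers):
    # Hash partitioning: the same key value always goes to the same
    # server, and keys spread roughly evenly across servers.
    return hash(key) % n_servers

def route_by_range(key, upper_bounds):
    # Range partitioning: server i gets keys at or below upper_bounds[i];
    # anything beyond the last bound goes to the final server.
    for i, upper in enumerate(upper_bounds):
        if key <= upper:
            return i
    return len(upper_bounds)

route_by_range(15000, [1000, 20000, 30000])   # 1: the 1001...20000 range
```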
Here, we just want the data split up. How the data is split up is not relevant to our processing, so our definition looks like this:
ops$tkyte-ORA10G> create or replace
2 function parallel_pipelined( l_cursor in sys_refcursor )
3 return t2_tab_type
4 pipelined
5 parallel_enable ( partition l_cursor by any )
We'd like to be able to see what rows were processed by which parallel execution servers, so we'll declare a local variable L_SESSION_ID and initialize it from V$MYSTAT:
6
7 is
8 l_session_id number;
9 l_rec t1%rowtype;
10 begin
11 select sid into l_session_id
12 from v$mystat
13 where rownum =1;
Now we are ready to process the data. We simply fetch out a row (or rows, as we could certainly use BULK COLLECT here to array process the ref cursor), perform our complex process on it, and pipe it out. When the ref cursor is exhausted of data, we close the cursor and return:
14 loop
15 fetch l_cursor into l_rec;
16 exit when l_cursor%notfound;
17 -- complex process here
18 pipe row(t2_type(l_rec.id,l_rec.text,l_session_id));
19 end loop;
20 close l_cursor;
21 return;
22 end;
23 /
Function created.
And that's it. We're ready to process the data in parallel, letting Oracle figure out based on the resources available what the most appropriate degree of parallelism is:
ops$tkyte-ORA10G> alter session enable parallel dml;
Session altered.
ops$tkyte-ORA10G> insert /*+ append */
2 into t2(id,text,session_id)
3 select *
4 from table(parallel_pipelined
5 (CURSOR(select /*+ parallel(t1) */ *
6 from t1 )
7 ))
8 /
48250 rows created.
ops$tkyte-ORA10G> commit;
Commit complete.
Just to see what happened here, we can query the newly inserted data out and group by SESSION_ID to see first how many parallel execution servers were used, and second how many rows each processed:
ops$tkyte-ORA10G> select session_id, count(*)
2 from t2
3 group by session_id;
SESSION_ID COUNT(*)
---------- ----------
241 8040
246 8045
253 8042
254 8042
258 8040
260 8041
6 rows selected.
Apparently, we used six parallel execution servers for the SELECT component of this parallel operation, and each one processed about 8,040 records.
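Those per-server counts tally exactly with the 48,250 rows the INSERT reported:

```python
# COUNT(*) per SESSION_ID from the GROUP BY above
per_server_counts = [8040, 8045, 8042, 8042, 8040, 8041]

len(per_server_counts)    # 6 parallel execution servers did the SELECT
sum(per_server_counts)    # 48250, the total number of rows inserted
```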
As you can see, Oracle parallelized our process, but we underwent a fairly radical rewrite of our process. This is a long way from the original implementation. So, while Oracle can process our routine in parallel, we may well not have any routines that are coded to be parallelized. If a rather large rewrite of your procedure is not feasible, you may be interested in the next implementation: DIY parallelism.
Say we have that same process as in the preceding section: the serial, simple procedure. We cannot afford a rather extensive rewrite of the implementation, but we would like to execute it in parallel. What can we do? My approach many times has been to use rowid ranges to break up the table into some number of ranges that don't overlap (yet completely cover the table).
This is very similar to how Oracle performs a parallel query conceptually. If you think of a full table scan, Oracle processes that by coming up with some method to break the table into many "small" tables, each of which is processed by a parallel execution server. We are going to do the same thing using rowid ranges. In early releases, Oracle's parallel implementation actually used rowid ranges itself.
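The chunking idea can be sketched outside the database. Here is a minimal illustration in Python (the extent map here is hypothetical; the chapter's actual implementation does this with an analytic SQL query and builds real rowids with DBMS_ROWID): walk the extents in order, keep a running total of blocks, and integer divide by the target chunk size to assign each extent to a group.

```python
def chunk_extents(extents, n_chunks):
    """Assign (file_id, block_id, blocks) extents to n_chunks groups of
    roughly equal block counts: integer-divide a running block total
    by the target chunk size to pick each extent's group."""
    total = sum(blocks for _, _, blocks in extents)
    target = total / n_chunks
    groups = [[] for _ in range(n_chunks)]
    running = 0
    for ext in extents:
        grp = min(int(running // target), n_chunks - 1)
        groups[grp].append(ext)
        running += ext[2]
    return groups

# Hypothetical map: 16 extents of 128 blocks each in file 6
extents = [(6, block_id, 128) for block_id in range(9, 9 + 16 * 128, 128)]
[len(g) for g in chunk_extents(extents, 8)]   # [2, 2, 2, 2, 2, 2, 2, 2]
```

The groups are non-overlapping and together cover every extent exactly once, which is what lets each job process its rowid range independently.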
We'll use a BIG_TABLE of 1,000,000 rows, as this technique works best on big tables with lots of extents, and the method I use for creating rowid ranges depends on extent boundaries. The more extents used, the better the data distribution. So, after creating the BIG_TABLE with 1,000,000 rows, we'll create T2 like this:
big_table-ORA10G> create table t2
2 as
3 select object_id id, object_name text, 0 session_id
4 from big_table
5 where 1=0;
Table created.
We are going to use the job queues built into the database to parallel process our procedure. We will schedule some number of jobs. Each job is our procedure slightly modified to just process the rows in a given rowid range.
Note In Oracle 10g, you could use the scheduler for something this simple, but in order to make the example 9i compatible, we'll use the job queues here.
To efficiently support the job queues, we'll use a parameter table to pass inputs to our jobs:
big_table-ORA10G> create table job_parms
2 ( job number primary key,
3 lo_rid rowid,
4 hi_rid rowid
5 )
6 /
Table created.
This will allow us to just pass the job ID into our procedure, so it can query this table to get the rowid range it is to process. Now for our procedure. The new code we'll be adding is the lookup of the job parameters at the top, the rowid-range predicate, and the cleanup of the parameter row at the end:
big_table-ORA10G> create or replace
2 procedure serial( p_job in number )
3 is
4 l_rec job_parms%rowtype;
5 begin
6 select * into l_rec
7 from job_parms
8 where job = p_job;
9
10 for x in ( select object_id id, object_name text
11 from big_table
12 where rowid between l_rec.lo_rid
13 and l_rec.hi_rid )
14 loop
15 -- complex process here
16 insert into t2 (id, text, session_id )
17 values ( x.id, x.text, p_job );
18 end loop;
19
20 delete from job_parms where job = p_job;
21 commit;
22 end;
23 /
Procedure created.
As you can see, it is not a significant change. Most of the added code was simply to get our inputs and the rowid range to process. The only change to our logic was the addition of the predicate on lines 12 and 13.
Now let's schedule our jobs. We'll use a rather complex query using analytics to divide the table. The innermost query on lines 19 through 26 breaks the data into eight groups in this case. The first sum on line 22 is computing a running total of the sum of blocks; the second sum on line 23 is the total number of blocks. If we integer divide the running total by the desired "chunk size" (the total size divided by 8 in this case), we can create groups of files/blocks that cover about the same amount of data. The query on lines 8 through 28 finds the high and low file numbers and block numbers by GRP, and returns the distinct entries. It builds the inputs we can then send to DBMS_ROWID to create the rowids Oracle wants. We take that output and, using DBMS_JOB, submit a job to process each rowid range:
big_table-ORA10G> declare
2 l_job number;
3 begin
4 for x in (
5 select dbms_rowid.rowid_create
( 1, data_object_id, lo_fno, lo_block, 0 ) min_rid,
6 dbms_rowid.rowid_create
( 1, data_object_id, hi_fno, hi_block, 10000 ) max_rid
7 from (
8 select distinct grp,
9 first_value(relative_fno)
over (partition by grp order by relative_fno, block_id
10 rows between unbounded preceding and unbounded following) lo_fno,
11 first_value(block_id )
over (partition by grp order by relative_fno, block_id
12 rows between unbounded preceding and unbounded following) lo_block,
13 last_value(relative_fno)
over (partition by grp order by relative_fno, block_id
14 rows between unbounded preceding and unbounded following) hi_fno,
15 last_value(block_id+blocks-1)
over (partition by grp order by relative_fno, block_id
16 rows between unbounded preceding and unbounded following) hi_block,
17 sum(blocks) over (partition by grp) sum_blocks
18 from (
19 select relative_fno,
20 block_id,
21 blocks,
22 trunc( (sum(blocks) over (order by relative_fno, block_id)-0.01) /
23 (sum(blocks) over ()/8) ) grp
24 from dba_extents
25 where segment_name = upper('BIG_TABLE')
26 and owner = user order by block_id
27 )
28 ),
29 (select data_object_id
from user_objects where object_name = upper('BIG_TABLE') )
30 )
31 loop
32 dbms_job.submit( l_job, 'serial(JOB);' );
33 insert into job_parms(job, lo_rid, hi_rid)
34 values ( l_job, x.min_rid, x.max_rid );
35 end loop;
36 end;
37 /
PL/SQL procedure successfully completed.
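To see how the GRP expression on lines 22 and 23 assigns extents to groups, here is a small Python sketch of the same arithmetic. The extent sizes are made up for illustration; only the formula mirrors the query:

```python
import math

# The innermost query computes, per extent (in block order):
#   grp = trunc( (running_total_of_blocks - 0.01) / (total_blocks / 8) )
# The extent sizes below are hypothetical, mimicking a tablespace where
# extent sizes grow over time.
extents = [8, 8, 8, 128, 128, 128, 128, 1024, 1024, 1024]  # blocks per extent

total = sum(extents)   # SUM(blocks) OVER ()           -- total blocks
chunk = total / 8      # desired "chunk size" per group

grp, running = [], 0
for blocks in extents:
    running += blocks  # SUM(blocks) OVER (ORDER BY relative_fno, block_id)
    grp.append(math.trunc((running - 0.01) / chunk))

# grp is nondecreasing and stays in 0..7; consecutive extents whose running
# total falls in the same chunk share a group number.
```

Note that with large, uneven extents some group numbers never appear (here groups 2, 4, and 6 are skipped), which is exactly why fewer than eight jobs may be scheduled.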
That PL/SQL block would have scheduled up to eight jobs for us (fewer if the table could not be broken into eight pieces due to insufficient extents or size). We can see how many jobs were scheduled and what their inputs were as follows:
big_table-ORA10G> select * from job_parms;
JOB LO_RID HI_RID
---------- ------------------ ------------------
172 AAAT7tAAEAAAAkpAAA AAAT7tAAEAAABQICcQ
173 AAAT7tAAEAAABQJAAA AAAT7tAAEAAABwICcQ
174 AAAT7tAAEAAABwJAAA AAAT7tAAEAAACUICcQ
175 AAAT7tAAEAAACUJAAA AAAT7tAAEAAAC0ICcQ
176 AAAT7tAAEAAAC0JAAA AAAT7tAAEAAADMICcQ
177 AAAT7tAAEAAADaJAAA AAAT7tAAEAAAD6ICcQ
178 AAAT7tAAEAAAD6JAAA AAAT7tAAEAAAEaICcQ
179 AAAT7tAAEAAAEaJAAA AAAT7tAAEAAAF4ICcQ
8 rows selected.
big_table-ORA10G> commit;
Commit complete.
That commit released our jobs for processing. We have JOB_QUEUE_PROCESSES set to a nonzero value in the parameter file (it must be at least 1, or the jobs would never run), so all eight jobs started running and shortly finished. The results are as follows:
big_table-ORA10G> select session_id, count(*)
2 from t2
3 group by session_id;
SESSION_ID COUNT(*)
---------- ----------
172 130055
173 130978
174 130925
175 129863
176 106154
177 140772
178 140778
179 90475
8 rows selected.
It's not as evenly distributed as Oracle's built-in parallelism in this case, but it's pretty good. If you recall, earlier we saw how many rows were processed by each parallel execution server and, with the built-in parallelism, the row counts were very close to each other (they were off by only one or two). Here we had a job that processed as few as 90,475 rows and one that processed as many as 140,778. Most of them processed about 130,000 rows in this case.
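As a quick sanity check, we can quantify that spread from the counts in the query output above (the numbers are copied verbatim; the variable names are mine):

```python
# Row counts per job, copied from the SESSION_ID/COUNT(*) output above.
counts = [130055, 130978, 130925, 129863, 106154, 140772, 140778, 90475]

total = sum(counts)               # confirms all 1,000,000 rows were processed
ideal = total / len(counts)       # 125,000 rows per job if perfectly even
skew = max(counts) / min(counts)  # the largest job did ~1.56x the smallest
```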
Suppose, however, that you do not want to use the rowid processing—perhaps the query is not as simple as SELECT * FROM T and involves joins and other constructs that make using the rowid impractical. You can use the primary key of some table instead. For example, say you want to break that same BIG_TABLE into ten pieces to be processed concurrently by primary key. You can do that easily using the NTILE built-in analytic function. The process is rather straightforward:
big_table-ORA10G> select nt, min(id), max(id), count(*)
2 from (
3 select id, ntile(10) over (order by id) nt
4 from big_table
5 )
6 group by nt;
NT MIN(ID) MAX(ID) COUNT(*)
---------- ---------- ---------- ----------
1 1 100000 100000
2 100001 200000 100000
3 200001 300000 100000
4 300001 400000 100000
5 400001 500000 100000
6 500001 600000 100000
7 600001 700000 100000
8 700001 800000 100000
9 800001 900000 100000
10 900001 1000000 100000
10 rows selected.
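NTILE's bucketing rule is easy to mimic. The following is a hypothetical Python sketch of the same partitioning, not Oracle's implementation; the `ntile` function here is my own:

```python
# Hypothetical sketch of NTILE(n) over an ordered id list: deal the rows,
# in order, into n buckets of as-equal-as-possible size (earlier buckets
# receive the extra row when the count doesn't divide evenly).
def ntile(ids, n):
    size, extra = divmod(len(ids), n)
    out, i = [], 0
    for bucket in range(1, n + 1):
        cnt = size + (1 if bucket <= extra else 0)
        out += [(id_, bucket) for id_ in ids[i:i + cnt]]
        i += cnt
    return out

tiles = ntile(list(range(1, 1_000_001)), 10)
# Bucket 1 holds ids 1..100000, bucket 2 holds 100001..200000, and so on,
# matching the MIN(ID)/MAX(ID) ranges in the query output above.
```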
Now you have ten nonoverlapping primary key ranges—all of nice, equal size—that you can use to implement the same DBMS_JOB technique as shown earlier to parallelize your process.
In this chapter, we explored the concept of parallel execution in Oracle. I started by presenting an analogy to help frame where and when parallel execution is applicable, namely when we have long-running statements or procedures and plenty of available resources.
Then we looked at how Oracle can employ parallelism. We started with parallel query and how Oracle can break large serial operations, such as a full scan, into smaller pieces that can run concurrently. We moved on to parallel DML (PDML) and covered the rather extensive list of restrictions that accompany it.
Then we looked at the sweet spot for parallel operations: parallel DDL. Parallel DDL is a tool for the DBA and developer alike to quickly perform those large maintenance operations typically done during off-peak times when resources are available. We briefly touched on the fact that Oracle provides parallel recovery before we moved on to discuss procedural parallelism. Here we saw two techniques for parallelizing our procedures: one where Oracle does it and the other where we do it ourselves.
If we're designing a process from scratch, we might well consider designing it to allow Oracle to parallelize it for us, as the future addition or reduction of resources would easily permit the degree of parallelism to vary. However, if we have existing code that needs to quickly be "fixed" to be parallel, we may opt for do-it-yourself (DIY) parallelism, which we covered by examining two techniques, rowid ranges and primary key ranges, both of which use DBMS_JOB to carry out the work in parallel in the background for us.