
9. Redo and Undo

Darl Kuhn (Morrison, CO, USA) and Thomas Kyte (Denver, CO, USA)

This chapter introduces you to two of the most important pieces of data in an Oracle database: redo and undo. Redo is the information Oracle records in online (and archived) redo log files in order to “replay” your transaction in the event of a failure. Undo is the information Oracle records in the undo segments in order to reverse, or roll back, your transaction.

In this chapter, we will discuss topics such as how redo and undo (rollback) are generated and how they fit into transactions. We’ll also discuss the performance implications of COMMIT and ROLLBACK statements. I will present the pseudo-code for these mechanisms in Oracle and a conceptual explanation of what actually takes place. I will not cover every internal detail of what files get updated with what bytes of data. What actually takes place is a little more involved, but having a good understanding of the flow of how it works is valuable and will help you to understand the ramifications of your actions.

Time and time again, I get questions regarding the exact bits and bytes of redo and undo. People seem to want a very detailed specification of exactly, precisely, what is in there. I never answer those questions. Instead, I focus on the intent of redo and undo, the concepts behind redo and undo. I focus on the use of redo and undo—not on the bits and bytes. I myself do not “dump” redo log files or undo segments. I do use the supplied tools, such as LogMiner to read redo and flashback transaction history to read undo, but they present the information to me in a human-readable format. So, we won’t be doing internals in this chapter but rather building a strong foundation.

What Is Redo?

Redo log files are crucial to the Oracle database. These are the transaction logs for the database. Oracle maintains two types of redo log files: online and archived. They are used for recovery purposes; their main purpose in life is to be used in the event of an instance or media failure.

If the power goes off on your database machine, causing an instance failure, Oracle will use the online redo logs to restore the system to exactly the committed point it was at immediately prior to the power outage. If your disk drive fails (a media failure), Oracle will use both archived redo logs and online redo logs to recover a backup of the data that was on that drive to the correct point in time. Moreover, if you “accidentally” truncate a table or remove some critical information and commit the operation, you can restore a backup of the affected data and recover it to the point in time immediately prior to the “accident” using online and archived redo log files.

I should point out that modern versions of Oracle also have flashback technology. This allows us to perform flashback queries (query the data as of some point in time in the past), undrop a database table, put a table back the way it was some time ago, and so on. As a result, the number of occasions in which we need to perform a conventional recovery (using database backups and archived redo logs) has decreased. However, the ability to perform a recovery is the DBA’s most important job.

Note Database restore and recovery is the one thing a DBA is not allowed to get wrong.
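To give a quick taste of the flashback features just mentioned, here is a minimal sketch using a hypothetical EMP table. Flashing a table back to a timestamp requires row movement to be enabled on it, and undropping a table relies on the recycle bin (which is on by default):
SQL> select * from emp as of timestamp (systimestamp - interval '10' minute);
SQL> alter table emp enable row movement;
SQL> flashback table emp to timestamp (systimestamp - interval '10' minute);
SQL> drop table emp;
SQL> flashback table emp to before drop;
The first three statements rely on undo, and the undrop relies on the recycle bin, rather than on restoring backups and applying archived redo.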

Archived redo log files are simply copies of old, full online redo log files. As the system fills up log files, the Oracle archiver (ARCn) process makes a copy of the online redo log file in another location and optionally puts several other copies into local and remote locations as well. These archived redo log files are used to perform media recovery when a failure is caused by a disk drive going bad or some other physical fault. Oracle can take these archived redo log files and apply them to backups of the datafiles to catch them up to the rest of the database. They are the transaction history of the database.

Every Oracle database has at least two online redo log groups with at least a single member (redo log file) in each group. These online redo log groups are written to in a circular fashion by the log writer (LGWR) background process. Oracle will write to the log files in group 1, and when it gets to the end of the files in group 1, it will switch to log file group 2 and begin writing to that one. When it has filled log file group 2, it will switch back to log file group 1 (assuming you have only two redo log file groups; if you have three, Oracle would, of course, proceed to the third group).
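You can watch this circular reuse for yourself with a quick sketch against V$LOG (forcing a log switch requires the ALTER SYSTEM privilege, and your group and sequence numbers will, of course, differ): query V$LOG, switch the log, and query it again—the group that was CURRENT becomes ACTIVE or INACTIVE, and the next group becomes CURRENT.
SQL> select group#, sequence#, status from v$log order by group#;
SQL> alter system switch logfile;
SQL> select group#, sequence#, status from v$log order by group#;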

Redo logs, or transaction logs, are one of the major features that make a database a database. They are perhaps its most important recovery structure, although without the other pieces such as undo segments, distributed transaction recovery, and so on, nothing works. They are a major component of what sets a database apart from a conventional file system. The online redo logs allow us to effectively recover from a power outage—one that might happen while Oracle’s database writer (DBWR) background process is in the middle of writing to disk. The archived redo logs let us recover from media failures when, for instance, the hard disk goes bad or human error causes data loss. Without redo logs, the database would not offer any more protection than a file system.

There’s one additional item I want to mention regarding redo. In an Oracle RAC environment, you typically have two or more instances. RAC configurations have one common set of datafiles (meaning each instance transacts against a common set of datafiles). However, each instance participating in a RAC cluster has its own memory structures and background processes (e.g., log writer and archiver). Also, each instance will have its own redo stream (often called a thread of redo). And it follows that each instance will also have its own undo segments. This is important because you may find yourself troubleshooting performance issues with redo, and it’s critical to pinpoint which instance or instances may be having redo bottleneck issues.
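On a RAC database, you can see the separate threads of redo (and which thread each instance is using) with queries along these lines; on a single-instance database, you’ll simply see one thread:
SQL> select thread#, group#, sequence#, status from v$log order by thread#, group#;
SQL> select inst_id, instance_name, thread# from gv$instance;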

What Is Undo?

Undo is conceptually the opposite of redo. Undo information is generated by the database as you make modifications to data so that the data can be put back the way it was before the modifications took place. This might be done in support of multiversioning, or in the event the transaction or statement you are executing fails for any reason, or if we request it with a ROLLBACK statement. Whereas redo is used to replay a transaction in the event of failure—to recover the transaction—undo is used to reverse the effects of a statement or set of statements. Undo, unlike redo, is stored internally in the database in a special set of segments known as undo segments.

Note

“Rollback segment” and “undo segment” are considered synonymous terms.

It is a common misconception that undo is used to restore the database physically to the way it was before the statement or transaction executed, but this is not so. The database is logically restored to the way it was—any changes are logically undone—but the data structures, the database blocks themselves, may well be different after a rollback. The reason for this lies in the fact that, in any multiuser system, there will be tens or hundreds or thousands of concurrent transactions. One of the primary functions of a database is to mediate concurrent access to its data. The blocks that our transaction modifies are, in general, being modified by many other transactions as well. Therefore, we can’t just put a block back exactly the way it was at the start of our transaction—that could undo someone else’s work!

For example, suppose our transaction executed an INSERT statement that caused the allocation of a new extent (i.e., it caused the table to grow). Our INSERT would cause us to get a new block, format it for use, and put some data into it. At that point, some other transaction might come along and insert data into this block. If we roll back our transaction, obviously we can’t unformat and unallocate this block. Therefore, when Oracle rolls back, it is really doing the logical equivalent of the opposite of what we did in the first place. For every INSERT, Oracle will do a DELETE. For every DELETE, Oracle will do an INSERT. For every UPDATE, Oracle will do an “anti-UPDATE,” or an UPDATE that puts the row back the way it was prior to our modification.

Note

This undo generation is not true for direct path operations, which have the ability to bypass undo generation on the table. We’ll discuss these operations in more detail shortly.

How can we see this in action? Perhaps the easiest way is to follow these steps:
  1. Create an empty table.
  2. Full scan the table and observe the amount of I/O performed to read it.
  3. Fill the table with many rows (no commit).
  4. Roll back that work and undo it.
  5. Full scan the table a second time and observe the amount of I/O performed.
So, let’s create an empty table:
$ sqlplus eoda/foo@PDB1
SQL> create table t as select * from all_objects where 1=0;
Table created.

And now we’ll query it, with AUTOTRACE enabled in SQL*Plus to measure the I/O.

Note

In this example, we will full scan the table twice each time. The goal is to only measure the I/O performed the second time in each case. This avoids counting additional I/Os performed by the optimizer during any parsing and optimization that may occur.

The query initially takes no I/Os to full scan the table:
SQL> select * from t;
no rows selected
SQL> set autotrace traceonly statistics
SQL> select * from t;
no rows selected
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          0  consistent gets
          0  physical reads
SQL> set autotrace off

Now, that might surprise you at first—that there are zero I/Os against the table. This is due to deferred segment creation.

Note

The deferred segment creation feature is available only with the Enterprise Edition of Oracle. This feature is enabled by default. You can override this default behavior when creating the table.
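For example, if you want storage allocated at CREATE TABLE time, you can request it explicitly, or change the default for your session via the DEFERRED_SEGMENT_CREATION parameter (T2 here is just an illustrative name):
SQL> create table t2 ( x int ) segment creation immediate;
SQL> alter session set deferred_segment_creation = false;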

Next, we’ll add lots of data to the table. We’ll make it “grow,” then roll it all back:
SQL> insert into t select * from all_objects;
72516 rows created.
SQL> rollback;
Rollback complete.
Now, if we query the table again, we’ll discover that it takes considerably more I/Os to read the table this time:
SQL> select * from t;
no rows selected
SQL> set autotrace traceonly statistics
SQL> select * from t;
no rows selected
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       1451  consistent gets
          0  physical reads
SQL> set autotrace off
The blocks that our INSERT caused to be added under the table’s high-water mark (HWM) are still there—formatted, but empty. Our full scan had to read them to see if they contained any rows. Moreover, the first time we ran the query, we observed zero I/Os. That was due to the default mode of table creation using deferred segment creation. When we issued that CREATE TABLE, no storage, not a single extent, was allocated. The segment creation was deferred until the INSERT took place, and when we rolled back, the segment persisted. You can see this easily with a smaller example; I’ll explicitly request deferred segment creation this time although it is enabled by default:
SQL> drop table t purge;
Table dropped.
SQL> create table t ( x int )  segment creation deferred;
Table created.
SQL> select extent_id, bytes, blocks
       from user_extents
      where segment_name = 'T'
      order by extent_id;
no rows selected
SQL> insert into t(x) values (1);
1 row created.
SQL> rollback;
Rollback complete.
SQL> select extent_id, bytes, blocks
       from user_extents
      where segment_name = 'T'
      order by extent_id;
 EXTENT_ID      BYTES     BLOCKS
---------- ---------- ----------
         0      65536          8

As you can see, after the table was initially created there was no allocated storage—no extents were used by this table. Upon performing an INSERT, followed immediately by ROLLBACK, we can see the INSERT allocated storage—but the ROLLBACK does not “release” it. Those two things together—that the segment was actually created by the INSERT but not “uncreated” by the ROLLBACK, and that the new formatted blocks created by the INSERT were scanned the second time around—show that a rollback is a logical “put the database back the way it was” operation. The database will not be exactly the way it was, just logically the same.

How Redo and Undo Work Together

Now let’s take a look at how redo and undo work together in various scenarios. We will discuss, for example, what happens during the processing of an INSERT with regard to redo and undo generation and how Oracle uses this information in the event of failures at various points in time.

An interesting point to note is that undo information, stored in undo tablespaces or undo segments, is protected by redo as well. In other words, undo data is treated just like table data or index data—changes to undo generate redo, which is logged (to the log buffer and then the redo log file). Why this is so will become clear in a moment when we discuss what happens when a system crashes. Undo data is added to the undo segment and is cached in the buffer cache, just like any other piece of data would be.

Example INSERT-UPDATE-DELETE-COMMIT Scenario

For this example, assume we’ve created a table with an index as follows:
create table t(x int, y int);
create index ti on t(x);
And then we will investigate what might happen with a set of statements like this:
insert into t (x,y) values  (1,1);
update t set x = x+1 where x = 1;
delete from t where x = 2;
We will follow this transaction down different paths and discover the answers to the following questions:
  • What happens if the system fails at various points in the processing of these statements?

  • What happens if the buffer cache fills up?

  • What happens if we ROLLBACK at any point?

  • What happens if we succeed and COMMIT?

The INSERT

The initial INSERT INTO T statement will generate both redo and undo. The undo generated will be enough information to make the INSERT “go away.” The redo generated by the INSERT INTO T will be enough information to make the INSERT “happen again.”

After the INSERT has occurred, we have the scenario illustrated in Figure 9-1.
../images/319655_4_En_9_Chapter/319655_4_En_9_Fig1_HTML.jpg
Figure 9-1

State of the system after an INSERT

There are some cached, modified undo blocks, index blocks, and table data blocks. Each of these blocks is protected by entries in the redo log buffer.

Hypothetical Scenario: The System Crashes Right Now

In this scenario, the system crashes before a COMMIT is issued or before the redo entries are written to disk. Everything is OK. The SGA memory area is wiped out, but we don’t need anything that was in the SGA. It will be as if this transaction never happened when we restart. None of the blocks with changes got flushed to disk, and none of the redo got flushed to disk. We have no need of any of this undo or redo to recover from an instance failure.

Hypothetical Scenario: The Buffer Cache Fills Up Right Now

The situation is such that DBWn must make room and our modified blocks are to be flushed from the cache. In this case, DBWn will start by asking LGWR to flush the redo entries that protect these database blocks. Before DBWn can write any of the blocks that are changed to disk, LGWR must flush (to disk) the redo information related to these blocks. This makes sense: if we were to flush the modified blocks for table T (but not the undo blocks associated with the modifications) without flushing the redo entries associated with the undo blocks, and the system failed, we would have a modified table T block with no undo information associated with it. We need to flush the redo log buffers before writing these blocks out so that we can redo all of the changes necessary to get the SGA back into the state it is in right now, so that a rollback can take place.

This second scenario shows some of the foresight that has gone into all of this. The set of conditions described by “If we flushed table T blocks and did not flush the redo for the undo blocks and the system failed” is starting to get complex. It only gets more complex as we add users, more objects, concurrent processing, and so on.

At this point, we have the situation depicted in Figure 9-1. We have generated some modified table and index blocks. These have associated undo segment blocks, and all three types of blocks have generated redo to protect them. The redo log buffer is flushed at least every three seconds, when it is one-third full or contains 1MB of buffered data, or whenever a COMMIT or ROLLBACK takes place. It is very possible that at some point during our processing, the redo log buffer will be flushed. In that case, the picture will look like Figure 9-2.
../images/319655_4_En_9_Chapter/319655_4_En_9_Fig2_HTML.jpg
Figure 9-2

State of the system after a redo log buffer flush

That is, we’ll have modified blocks representing uncommitted changes in the buffer cache and redo for those uncommitted changes on disk. This is a very normal scenario that happens frequently.

The UPDATE

The UPDATE will cause much of the same work as the INSERT to take place. This time, the amount of undo will be larger; we have some “before” images to save as a result of the UPDATE. Now we have the picture shown in Figure 9-3 (the dark rectangle in the redo log file represents the redo generated by the INSERT; the redo for the UPDATE is still in the SGA and has not yet been written to disk).
../images/319655_4_En_9_Chapter/319655_4_En_9_Fig3_HTML.jpg
Figure 9-3

State of the system after the UPDATE

We have more new undo segment blocks in the block buffer cache. To undo the UPDATE, if necessary, we have modified database table and index blocks in the cache. We have also generated more redo log buffer entries. Let’s assume that our redo generated from the INSERT statement (discussed in the prior section) is on disk (in the redo log file) and redo generated from the UPDATE is in cache.

Hypothetical Scenario: The System Crashes Right Now

Upon startup, Oracle would read the redo log files and find some redo log entries for our transaction. Given the state in which we left the system, we have the redo entries generated by the INSERT in the redo log files (which includes redo for undo segments associated with the INSERT). However, the redo for the UPDATE was only in the log buffer and never made it to disk (and was wiped out when the system crashed). That’s OK; the transaction was never committed, and the datafiles on disk reflect the state of the system before the UPDATE took place.

However, the redo for the INSERT was written to the redo log file. Therefore, Oracle would “roll forward” the INSERT. We would end up with a picture much like Figure 9-1, with modified undo blocks (information on how to undo the INSERT), modified table blocks (right after the INSERT), and modified index blocks (right after the INSERT). Oracle will discover that our transaction never committed and will roll it back since the system is doing crash recovery and, of course, our session is no longer connected.

To roll back the uncommitted INSERT, Oracle will use the undo it just rolled forward (from the redo and now in the buffer cache) and apply it to the data and index blocks, making them look as they did before the INSERT took place. Now everything is back the way it was. The blocks that are on disk may or may not reflect the INSERT (it depends on whether or not our blocks got flushed before the crash). If the blocks on disk do reflect the INSERT, then the INSERT will be undone when the blocks are flushed from the buffer cache. If they do not reflect the undone INSERT, so be it—they will be overwritten later anyway.

This scenario covers the rudimentary details of a crash recovery. The system performs this as a two-step process. First, it rolls forward, bringing the system right to the point of failure, and then it proceeds to roll back everything that had not yet committed. This action will resynchronize the datafiles. It replays the work that was in progress and undoes anything that has not yet completed.

Hypothetical Scenario: The Application Rolls Back the Transaction

At this point, Oracle will find the undo information for this transaction either in the cached undo segment blocks (most likely) or on disk if they have been flushed (more likely for very large transactions). It will apply the undo information to the data and index blocks in the buffer cache; if those blocks are no longer in the cache, they are read from disk into the cache to have the undo applied to them. These blocks will later be flushed to the datafiles with their original row values restored.

This scenario is much more common than the system crash. It is useful to note that during the rollback process, the redo logs are never involved. The only time redo logs are read is during recovery and archival. This is a key tuning concept: redo logs are written to. Oracle does not read them during normal processing. As long as you have sufficient devices so that when ARCn is reading a file, LGWR is writing to a different device, there is no contention for redo logs. Many other databases treat the log files as “transaction logs.” They do not have this separation of redo and undo. For those systems, the act of rolling back can be disastrous—the rollback process must read the logs their log writer is trying to write to. They introduce contention into the part of the system that can least stand it. Oracle’s goal is to make it so that redo logs are written sequentially, and no one ever reads them while they are being written.

The DELETE

Again, undo is generated as a result of the DELETE, blocks are modified, and redo is sent over to the redo log buffer. This is not very different from before. In fact, it is so similar to the UPDATE that we are going to move right on to the COMMIT.

The COMMIT

We’ve looked at various failure scenarios and different paths, and now we’ve finally made it to the COMMIT. Here, Oracle will flush the redo log buffer to disk, and the picture will look like Figure 9-4.
../images/319655_4_En_9_Chapter/319655_4_En_9_Fig4_HTML.jpg
Figure 9-4

State of the system after a COMMIT

The modified blocks are in the buffer cache; maybe some of them have been flushed to disk. All of the redo necessary to replay this transaction is safely on disk, and the changes are now permanent. If we were to read the data directly from the datafiles, we probably would see the blocks as they existed before the transaction took place, as DBWn most likely has not yet written them. That’s OK—the redo log files can be used to bring those blocks up to date in the event of a failure. The undo information will hang around until the undo segment wraps around and reuses those blocks. Oracle will use that undo to provide for consistent reads of the affected objects for any session that needs them.

Commit and Rollback Processing

It is important to understand how redo log files might impact us as developers. We will look at how the different ways we can write our code affect redo log utilization. We’ve already seen the mechanics of redo earlier in the chapter, and now we’ll look at some specific issues. You might detect many of these scenarios as a developer, but fixing them is typically the DBA’s job, since they affect the database instance as a whole. We’ll start with what happens during a COMMIT and then get into commonly asked questions and issues surrounding the online redo logs.

What Does a COMMIT Do?

As a developer, you should have a good understanding of exactly what goes on during a COMMIT. In this section, we’ll investigate what happens during the processing of the COMMIT statement in Oracle. A COMMIT is generally a very fast operation, regardless of the transaction size. You might think that the bigger a transaction (in other words, the more data it affects), the longer a COMMIT would take. This is not true. The response time of a COMMIT is generally “flat,” regardless of the transaction size. This is because a COMMIT does not really have too much work to do, but what it does do is vital.

One of the reasons this is an important fact to understand and embrace is that it will lead to letting your transactions be as big as they should be. As we discussed in the previous chapter, many developers artificially constrain the size of their transactions, committing every so many rows, instead of committing when a logical unit of work has been performed. They do this in the mistaken belief that they are preserving scarce system resources, when in fact they are consuming more of them. If a COMMIT of one row takes X units of time, and the COMMIT of 1000 rows takes the same X units of time, then performing work in a manner that does 1000 one-row COMMITs will take roughly an additional 1000*X units of time to perform. By committing only when you have to (when the logical unit of work is complete), you will not only increase performance, you’ll also reduce contention for shared resources (log files, various internal latches, and the like). A simple example demonstrates that it necessarily takes longer. We’ll use a Java application, although you can expect similar results from most any client—except, in this case, PL/SQL (we’ll discuss why that is after the example). To start, here is the sample table we’ll be inserting into:
$ sqlplus eoda/foo@PDB1
SQL> create table test
     ( id          number,
       code        varchar2(20),
       descr       varchar2(20),
       insert_user varchar2(30),
       insert_date date);
Table created.
Our Java program (stored in a file named perftest.java) will accept two inputs: the number of rows to INSERT (iters) and how many rows between commits (commitCnt). It starts by connecting to the database, setting autocommit off (which should be done in all Java code), and then calling a doInserts() method a total of two times:
  • Once just to warm up the routine (make sure all of the classes are loaded)

  • A second time, with SQL Tracing on, specifying the number of rows to INSERT along with how many rows to commit at a time (i.e., commit every N rows)

It then closes the connection and exits. The main method is as follows (you’ll have to modify the connect string for your environment):
import java.sql.*;
import java.sql.DriverManager;
import java.sql.Connection;
import java.sql.SQLException;
import java.io.*;
public class perftest
{
  public static void main (String arr[]) throws Exception
  {
    DriverManager.registerDriver(new oracle.jdbc.OracleDriver());
    Connection con = DriverManager.getConnection
                                   ("jdbc:oracle:thin:@//localhost.localdomain:1521/PDB1", "eoda", "foo");
    Integer iters = new Integer(arr[0]);
    Integer commitCnt = new Integer(arr[1]);
    con.setAutoCommit(false);
    doInserts( con, 1, 1 );
    Statement stmt = con.createStatement ();
        stmt.execute( "begin dbms_monitor.session_trace_enable(waits=>true); end;" );
    doInserts( con, iters.intValue(), commitCnt.intValue() );
    con.close();
  }
Note

The account you use to test this with (EODA in our example) will need to have the EXECUTE privilege granted on the DBMS_MONITOR package.

Now, the method doInserts() is fairly straightforward. It starts by preparing (parsing) an INSERT statement so we can repeatedly bind/execute it over and over:
static void doInserts(Connection con, int count, int commitCount )
  throws Exception
  {
    PreparedStatement ps =
       con.prepareStatement
       ("insert into test " +
        "(id, code, descr, insert_user, insert_date)"
        + " values (?,?,?, user, sysdate)");
It then loops over the number of rows to insert, binding and executing the INSERT over and over. Additionally, it checks a row counter to see if it needs to COMMIT or not inside the loop:
    int  rowcnt = 0;
    int  committed = 0;
    for (int i = 0; i < count; i++ )
    {
      ps.setInt(1,i);
      ps.setString(2,"PS - code" + i);
      ps.setString(3,"PS - desc" + i);
      ps.executeUpdate();
      rowcnt++;
      if ( rowcnt == commitCount )
      {
        con.commit();
        rowcnt = 0;
        committed++;
      }
    }
    con.commit();
    System.out.println
    ("pstatement rows/commitcnt = " + count + " / " +  committed );
  }
}
Tip

See the Oracle Database JDBC Developer’s Guide for further details on how to connect to an Oracle database with Java.

Before compiling the Java code, we need to set our CLASSPATH variable (this should all go on one line, no spaces, but doesn’t fit within the space on the page here):
$ export CLASSPATH=$CLASSPATH:$ORACLE_HOME/jdbc/lib/ojdbc8.jar:
$ORACLE_HOME/jlib/orai18n.jar
Also ensure that the Java executables are in your PATH variable (adjust this per your ORACLE_HOME):
$ export PATH=$PATH:/opt/oracle/product/19c/dbhome_1/jdk/bin
Next, we compile the Java code from the OS command line:
$ javac perftest.java
Now we’ll run this code repeatedly with different inputs and review the resulting TKPROF file. We’ll run with 100,000 row inserts—committing 1 row at a time, then 10, and so on. Here’s an example of running the first test from the OS command line:
$ java perftest 100000 1
The prior code does the inserts and also generates a trace file. You can locate the directory the trace file is in via this query:
SQL> select value from v$diag_info where name='Diag Trace';
VALUE
---------------------------------------------------------------------------
/opt/oracle/diag/rdbms/cdb/CDB/trace
Navigate to your trace directory and look for a recently generated trace file in that directory with the string of “insert into test”:
$ grep "insert into test" *.trc
CDB_ora_7660.trc:insert into test (id, code, descr, insert_user, insert_date)...
Process the file using the TKPROF utility (this creates a human-readable output file):
$ tkprof CDB_ora_7660.trc output.txt sys=no
After running the prior command, the file output.txt has the human-readable performance output of the insert test. The resulting TKPROF files produced the results in Table 9-1.
Table 9-1

Results from Inserting 100,000 Rows

Number of Rows to Insert  Commit Every N Rows, N=  CPU for Insert Statement (Seconds)  Wait Time for Log File Sync (Seconds)
------------------------  -----------------------  ----------------------------------  -------------------------------------
100,000                   1                        1.42                                34.87
100,000                   10                       1.35                                 4.05
100,000                   100                      1.63                                 0.61
100,000                   1,000                    1.65                                 0.21
100,000                   10,000                   1.66                                 0.04
100,000                   100,000                  1.65                                 0.00

As you can see, the more often you commit, the longer you wait (your mileage will vary on this). And the amount of time you wait is more or less directly proportional to the number of times you commit. Remember, this is just a single-user scenario; with multiple users doing the same work, all committing too frequently, the numbers will go up rapidly.

We’ve heard the same story, time and time again, with similar situations. For example, we’ve seen how not using bind variables and performing hard parses often severely reduces concurrency due to library cache contention and excessive CPU utilization. Even when we switch to using bind variables, soft parsing too frequently—caused by closing cursors even though we are going to reuse them shortly—incurs massive overhead. We must perform operations only when we need to—a COMMIT is just another such operation. It is best to size our transactions based on business need, not based on misguided attempts to lessen resource usage on the database.

There are two factors contributing to the expense of the COMMIT in this example:
  • We’ve obviously increased the round trips to and from the database. If we commit every record, we are generating that much more traffic back and forth. I didn’t even measure that, which would add to the overall runtime.

  • Every time we commit, we must wait for our redo to be written to disk. This will result in a “wait.” In this case, the wait is named “log file sync.”

So, we committed after every INSERT; we waited every time for a short period of time—and if you wait a little bit of time but you wait often, it all adds up. Nearly 35 seconds of our runtime was spent waiting for a COMMIT to complete when we committed 100,000 times—in other words, waiting for LGWR to write the redo to disk. In stark contrast, when we committed once, we didn’t wait very long (not a measurable amount of time actually). This proves that a COMMIT is a fast operation; we expect the response time to be more or less flat, not a function of the amount of work we’ve done.
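You don’t need TKPROF to see this wait; your session can ask V$SESSION_EVENT directly (as with the other examples, the account will need SELECT access on the underlying V$ views):
SQL> select event, total_waits, time_waited_micro/1000000 seconds_waited
       from v$session_event
      where sid = sys_context('userenv','sid')
        and event = 'log file sync';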

So, why is a COMMIT’s response time fairly flat, regardless of the transaction size? It is because before we even go to COMMIT in the database, we’ve already done the really hard work. We’ve already modified the data in the database, so we’ve already done 99.9 percent of the work. For example, operations such as the following have already taken place:
  • Undo blocks have been generated in the SGA.

  • Modified data blocks have been generated in the SGA.

  • Buffered redo for the preceding two items has been generated in the SGA.

  • Depending on the size of the preceding three items and the amount of time spent, some combination of the previous data may be flushed onto disk already.

  • All locks have been acquired.

When we COMMIT, all that is left to happen is the following:
  • A System Change Number (SCN) is generated for our transaction. In case you are not familiar with it, the SCN is a simple timing mechanism Oracle uses to guarantee the ordering of transactions and to enable recovery from failure. It is also used to guarantee read consistency and checkpointing in the database. Think of the SCN as a ticker; every time someone COMMITs, the SCN is incremented by one.

  • LGWR writes all of our remaining buffered redo log entries to disk and records the SCN in the online redo log files as well. This step is actually the COMMIT. If this step occurs, we have committed. Our transaction entry is “removed” from V$TRANSACTION—this shows that we have committed.

  • All locks recorded in V$LOCK held by our session are released, and everyone who was enqueued waiting on locks we held will be woken up and allowed to proceed with their work.

  • Some of the blocks our transaction modified will be visited and “cleaned out” in a fast mode if they are still in the buffer cache. Block cleanout refers to the lock-related information we store in the database block header. Basically, we are cleaning out our transaction information on the block, so the next person who visits the block won’t have to. We are doing this in a way that need not generate redo log information, saving considerable work later (this is discussed fully later in this chapter).
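As an aside, you can watch the SCN ticker described in the first bullet advance yourself; commit a transaction and run either of these again, and you’ll see a higher number (the second form requires the EXECUTE privilege on DBMS_FLASHBACK):
SQL> select current_scn from v$database;
SQL> select dbms_flashback.get_system_change_number from dual;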

As you can see, there is very little to do to process a COMMIT. The lengthiest operation is, and always will be, the activity performed by LGWR, as this is physical disk I/O. The amount of time spent by LGWR here will be greatly reduced by the fact that it has already been flushing the contents of the redo log buffer on a recurring basis. LGWR will not buffer all of the work you do for as long as you do it. Rather, it will incrementally flush the contents of the redo log buffer in the background as you are going along. This is to avoid having a COMMIT wait for a very long time in order to flush all of your redo at once.

So, even if we have a long-running transaction, much of the buffered redo log it generates would have been flushed to disk, prior to committing. On the flip side is the fact that when we COMMIT, we must typically wait until all buffered redo that has not been written yet is safely on disk. That is, our call to LGWR is by default a synchronous one. While LGWR may use asynchronous I/O to write in parallel to our log files, our transaction will normally wait for LGWR to complete all writes and receive confirmation that the data exists on disk before returning.

Note

Oracle Database 11g and above support an asynchronous (nowait) commit. However, that style of commit has limited general-purpose use. Commits in any end user–facing application should be synchronous.
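For reference, the commit variants look something like this; as the note says, the asynchronous forms have limited general-purpose use and should be avoided in end user–facing code:
SQL> commit write wait;          -- the default: wait until LGWR has the redo on disk
SQL> commit write nowait;        -- return without waiting for LGWR
SQL> commit write batch nowait;  -- additionally allow LGWR to batch the write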

Now, earlier I mentioned that we were using a Java program and not PL/SQL for a reason—and that reason is a PL/SQL commit-time optimization. I said that our call to LGWR is by default a synchronous one and that we wait for it to complete its write. That is true in every version of the Oracle database for every programmatic language except PL/SQL. The PL/SQL engine, realizing that the client does not know whether or not a COMMIT has happened in the PL/SQL routine until the PL/SQL routine is completed, does an asynchronous commit. It does not wait for LGWR to complete; rather, it returns from the COMMIT call immediately. However, when the PL/SQL routine is completed, when we return from the database to the client, the PL/SQL routine will wait for LGWR to complete any of the outstanding COMMITs. So, if you commit 100 times in PL/SQL and then return to the client, you will likely find you waited for LGWR once—not 100 times—due to this optimization. Does this imply that committing frequently in PL/SQL is a good or OK idea? No, not at all—just that it is not as bad an idea as it is in other languages. The guiding rule is to commit when your logical unit of work is complete—not before.

Note

This commit-time optimization in PL/SQL may be suspended when you are performing distributed transactions or using Data Guard in maximum availability mode. Since there are two participants, PL/SQL must wait for the commit to actually be complete before continuing. Also, it can be suspended by directly invoking COMMIT WORK WRITE WAIT in PL/SQL.

To demonstrate that a COMMIT is a “flat response time” operation, we’ll generate varying amounts of redo and time the INSERTs and COMMITs. As we do these INSERTs and COMMITs, we’ll measure the amount of redo our session generates using this small utility function:
SQL> create or replace function get_stat_val( p_name in varchar2 ) return number
     as
          l_val number;
     begin
         select b.value
           into l_val
           from v$statname a, v$mystat b
          where a.statistic# = b.statistic#
            and a.name = p_name;
        return l_val;
    end;
  /
Function created.
Note

The owner of the previous function will need to have been directly granted the SELECT privilege on the V$ views V_$STATNAME and V_$MYSTAT.

Drop the table T (if it exists) and create an empty table T of the same structure as BIG_TABLE:
SQL> drop table t purge;
SQL> create table t
     as
     select *
      from big_table
      where 1=0;
Table created.
Note

Directions on how to create and populate the BIG_TABLE table used in many examples are in the “Setting Up Your Environment” section at the very front of this book.

And we’ll measure the CPU and elapsed time used to commit our transaction using the DBMS_UTILITY package routines GET_CPU_TIME and GET_TIME. The actual PL/SQL block used to generate the workload and report on it is
SQL> set serverout on
SQL> declare
         l_redo number;
         l_cpu  number;
         l_ela  number;
     begin
         dbms_output.put_line
         ( '-' || '      Rows' || '        Redo' ||
           '     CPU' || ' Elapsed' );
         for i in 1 .. 6
        loop
            l_redo := get_stat_val( 'redo size' );
            insert into t select * from big_table  where rownum <= power(10,i);
            l_cpu  := dbms_utility.get_cpu_time;
            l_ela  := dbms_utility.get_time;
            commit work write wait;
            dbms_output.put_line
            ( '-' ||
              to_char( power( 10, i ), '9,999,999') ||
              to_char( (get_stat_val('redo size')-l_redo), '999,999,999' ) ||
              to_char( (dbms_utility.get_cpu_time-l_cpu), '999,999' ) ||
              to_char( (dbms_utility.get_time-l_ela), '999,999' ) );
        end loop;
    end;
  /
-      Rows        Redo     CPU Elapsed
-        10       6,552       2      17
-       100      10,336       0       5
-     1,000     114,684       0       8
-    10,000   1,156,452       0      25
-   100,000  13,184,820       1      28
- 1,000,000  67,356,624       2      58
PL/SQL procedure successfully completed.
Note

Times are in hundredths of seconds. Your results may vary depending on variables such as the number of records in BIG_TABLE, size of your log buffer, size and number of redo logs, number of log writer processes, and I/O subsystem.

As you can see, as we generate varying amounts of redo from about 6500 bytes to 67MB, the difference in time to COMMIT is not measurable using a timer with a one-hundredth of a second resolution. As we were processing and generating the redo log, LGWR was constantly flushing our buffered redo information to disk in the background. So, when we generated 67MB of redo log information, LGWR was busy flushing every 1MB or so. When it came to the COMMIT, there wasn’t much left to do—not much more than when we created ten rows of data. You should expect to see similar (but not exactly the same) results, regardless of the amount of redo generated.

What Does a ROLLBACK Do?

By changing the COMMIT to ROLLBACK , we can expect a totally different result. The time to roll back is definitely a function of the amount of data modified. I changed the script developed in the previous section to perform a ROLLBACK instead (simply change the COMMIT to ROLLBACK), and the timings are very different. Look at the results now:
$ sqlplus eoda/foo@PDB1
SQL> set serverout on
SQL> declare
         l_redo number;
         l_cpu  number;
         l_ela  number;
     begin
         dbms_output.put_line
         ( '-' || '      Rows' || '        Redo' ||
           '     CPU' || ' Elapsed' );
         for i in 1 .. 6
        loop
            l_redo := get_stat_val( 'redo size' );
            insert into t select * from big_table where rownum <= power(10,i);
            l_cpu  := dbms_utility.get_cpu_time;
            l_ela  := dbms_utility.get_time;
             --commit work write wait;
             rollback;
            dbms_output.put_line
            ( '-' ||
              to_char( power( 10, i ), '9,999,999') ||
              to_char( (get_stat_val('redo size')-l_redo), '999,999,999' ) ||
              to_char( (dbms_utility.get_cpu_time-l_cpu), '999,999' ) ||
              to_char( (dbms_utility.get_time-l_ela), '999,999' ) );
        end loop;
    end;
    /
-      Rows        Redo     CPU Elapsed
-        10       6,672       0       1
-       100      10,884       1       1
-     1,000     122,840       1       0
-    10,000   1,239,080       1       2
-   100,000  14,098,264       7      92
- 1,000,000  71,917,008      36     121
PL/SQL procedure successfully completed.
This difference in CPU and elapsed timings is to be expected, as a ROLLBACK has to undo the work we’ve done. Similar to a COMMIT, a series of operations must be performed. Before we even get to the ROLLBACK, the database has already done a lot of work. To recap, the following would have happened:
  • Undo segment records have been generated in the SGA.

  • Modified data blocks have been generated in the SGA.

  • A buffered redo log for the preceding two items has been generated in the SGA.

  • Depending on the size of the preceding three items and the amount of time spent, some combination of the previous data may be flushed onto disk already.

  • All locks have been acquired.

When we ROLLBACK
  • We undo all of the changes made. This is accomplished by reading the data back from the undo segment and, in effect, reversing our operation and then marking the undo entry as applied. If we inserted a row, a ROLLBACK will delete it. If we updated a row, a rollback will reverse the update. If we deleted a row, a rollback will reinsert it again.

  • All locks held by our session are released, and everyone who was enqueued waiting on locks we held will be released.

A COMMIT, on the other hand, just flushes any remaining data in the redo log buffers. It does very little work compared to a ROLLBACK. The point here is that you don’t want to roll back unless you have to. It is expensive since you spend a lot of time doing the work, and you’ll also spend a lot of time undoing the work. Don’t do work unless you’re sure you are going to want to COMMIT it. This sounds like common sense—of course I wouldn’t do all of the work unless I wanted to COMMIT it. However, I’ve often seen a developer use a “real” table as a temporary table, fill it up with data, report on it, and then roll back to get rid of the temporary data. Later, we’ll talk about true temporary tables and how to avoid this issue.
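As a preview of that discussion, a global temporary table for such scratch data might be declared as follows (the name is made up); its rows are private to your session and disappear on their own at COMMIT (or at the end of the session with ON COMMIT PRESERVE ROWS), so there is nothing to roll back or clean up afterward:
SQL> create global temporary table temp_report
     ( id    number,
       descr varchar2(30) )
     on commit delete rows;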

Investigating Redo

As part of processing transactions, Oracle captures how the data is modified (redo) and writes that information to the log buffer memory area. Next, the log writer background process will frequently write the redo information to disk (online redo logs). If you have archiving enabled in your database, as soon as an online redo log is filled up, the archiver process will copy the online redo log to an archived redo log. The architecture of processing transactions and the subsequent redo stream is displayed in Figure 9-5.
../images/319655_4_En_9_Chapter/319655_4_En_9_Fig5_HTML.jpg
Figure 9-5

Oracle transacting data and writing redo

Redo management can be a point of serialization (bottleneck) within the database. This is because eventually all transactions end up at LGWR (or one of its worker processes) asking it to manage their redo and COMMIT their transaction. The amount of redo generated also influences how much work the archiver process has to process. If you have a standby database, this redo has to also be transferred and applied to the standby. The more log writer and the archiver have to do, the slower the system will be.

Therefore, as a developer, it’s important to be able to measure how much redo your operations generate. The more redo you generate, the longer your operations may take, and the slower the entire system might be. You are not just affecting your session, but every session. By seeing how much redo an operation tends to generate, and testing more than one approach to a problem, you can find the best way to do things.

Note

Oracle will always start at least one LGWR background process. On multiprocessor systems, Oracle will spawn additional log writer worker processes (LG00) to help improve performance of writing redo to disk.
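You can see which log writer processes your instance has started with a quick query against V$PROCESS (the worker processes show up with names such as LG00, LG01):
SQL> select pname from v$process where pname like 'LG%' order by pname;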

Measuring Redo

In this first example, we’ll use AUTOTRACE to observe the amount of redo generated. In subsequent examples, we’ll use the GET_STAT_VAL function (introduced earlier in this chapter).

Note

You will not see the exact same results when you run these examples in your environment. Your results may vary depending on variables such as the number of records in BIG_TABLE, memory configuration, CPUs, and other processes running on your system.

Let’s take a look at the difference in redo generated by conventional path INSERTs (the normal INSERTs you and I do every day) and direct path INSERTs—used when loading large amounts of data into the database. We’ll use AUTOTRACE and the previously created tables T and BIG_TABLE for this simple example. First, we’ll load the table using a conventional path INSERT:
$ sqlplus eoda/foo@PDB1
SQL> set autotrace traceonly statistics;
SQL> truncate table t;
Table truncated.
SQL> insert into t select * from big_table;
1000000 rows created.
Statistics
----------------------------------------------------------
         90  recursive calls
     123808  db block gets
      39407  consistent gets
      13847  physical reads
  113875056  redo size
       1177  bytes sent via SQL*Net to client
       1354  bytes received via SQL*Net from client
          4  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
    1000000  rows processed

As you can see, that INSERT generated about 113MB of redo.

Note

The example in this section was performed on a NOARCHIVELOG mode database. If you are in ARCHIVELOG mode, the table would have to be created or set as NOLOGGING to observe this dramatic change. We will investigate the NOLOGGING attribute in more detail shortly in the section “Setting NOLOGGING in SQL.” Please make sure to coordinate all nonlogged operations with your DBA on a “real” system.

When we use a direct path load in a NOARCHIVELOG mode database, we get the following results:
SQL> truncate table t;
Table truncated.
SQL> insert /*+ APPEND */ into t  select * from big_table;
1000000 rows created.
Statistics
----------------------------------------------------------
        551  recursive calls
      16645  db block gets
      15242  consistent gets
      13873  physical reads
     220504  redo size
       1160  bytes sent via SQL*Net to client
       1368  bytes received via SQL*Net from client
          4  SQL*Net roundtrips to/from client
         86  sorts (memory)
          0  sorts (disk)
    1000000  rows processed
SQL> set autotrace off

That INSERT generated only about 220KB—kilobytes, not megabytes—of redo. As you can see, the amount of redo generated by a direct path insert is much less than a conventional insert.

Can I Turn Off Redo Log Generation?

This question is often asked. The simple short answer is no, since redo logging is crucial for the database; it is not overhead and it is not a waste. You do need it, regardless of whether you believe you do or not. It is a fact of life, and it is the way the database works. If you turned off redo, then any temporary failure of disk drives, power, or a software crash would render the entire database unusable and unrecoverable. That said, however, there are some operations that can be done without generating redo log in some cases.

Note

Oracle allows you to place your database into FORCE LOGGING mode. In that case, all operations are logged, regardless of whether you specify NOLOGGING. The query SELECT FORCE_LOGGING FROM V$DATABASE will show whether force logging is enabled. This feature is in support of Data Guard, a disaster recovery feature of Oracle that relies on redo to maintain a standby database copy.
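Toggling force logging is a DBA operation done at the database (or tablespace) level, for example:
SQL> alter database force logging;
SQL> select force_logging from v$database;
SQL> alter database no force logging;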

Setting NOLOGGING in SQL

Some SQL statements and operations support the use of a NOLOGGING clause. This does not mean that all operations against the object will be performed without generating redo, just that some very specific operations will generate significantly less redo than normal. Note that I said “significantly less redo,” not “no redo.” All operations will generate some redo—all data dictionary operations will be logged regardless of the logging mode. The amount of redo generated can be significantly less, however. For this example of the NOLOGGING clause, I ran the following in a database running in ARCHIVELOG mode with no force logging enabled:
$ sqlplus eoda/foo@PDB1
SQL> select log_mode, force_logging from v$database;
LOG_MODE     FORCE_LOGGING
------------ -----------------
ARCHIVELOG   NO
SQL> drop table t purge;
Table dropped.
SQL> variable redo number
SQL> exec :redo := get_stat_val( 'redo size' );
PL/SQL procedure successfully completed.
SQL> create table t  as  select * from all_objects;
Table created.
SQL> exec dbms_output.put_line( (get_stat_val('redo size')-:redo) || ' bytes of redo generated...' );
4487796 bytes of redo generated...
PL/SQL procedure successfully completed.
That CREATE TABLE generated about 4MB of redo information (your results will vary depending on how many rows are inserted into table T). We’ll drop and re-create the table, in NOLOGGING mode this time:
SQL> drop table t;
Table dropped.
SQL> variable redo number
SQL> exec :redo := get_stat_val( 'redo size' );
PL/SQL procedure successfully completed.
SQL> create table t NOLOGGING as  select * from all_objects;
Table created.
SQL> exec dbms_output.put_line( (get_stat_val('redo size')-:redo) || ' bytes of redo generated...' );
90108 bytes of redo generated...
PL/SQL procedure successfully completed.

This time, we generated only 90KB of redo. As you can see, this makes a tremendous difference—4MB of redo vs. 90KB. The 4MB written in the first example is a copy of the actual table data itself; it was written to the redo log when the table was created without the NOLOGGING clause.

If you test this on a NOARCHIVELOG mode database, you will not see any differences between the two. The CREATE TABLE will not be logged, with the exception of the data dictionary modifications, in a NOARCHIVELOG mode database. Additionally, if you run this test in a database that has force logging enabled, you will always get redo logged, regardless of the NOLOGGING clause.

These facts point out a valuable tip: test your system in the mode it will be run in production, as the behavior may be different. Your production system will be running in ARCHIVELOG mode; if you perform lots of operations that generate redo in this mode, but not in NOARCHIVELOG mode, you’ll want to discover this during testing, not during rollout to the users!

Of course, it is now obvious that you will do everything you can with NOLOGGING, right? In fact, the answer is a resounding no. You must use this mode very carefully and only after discussing the issues with the person in charge of backup and recovery. Let’s say you create this table and it is now part of your application (e.g., you used a CREATE TABLE AS SELECT NOLOGGING as part of an upgrade script). Your users modify this table over the course of the day. That night, the disk that the table is on fails. “No problem,” the DBA says. “We are running in ARCHIVELOG mode and we can perform media recovery.” The problem is, however, that the initially created table, since it was not logged, is not recoverable from the archived redo log. This table is unrecoverable and this brings out the most important point about NOLOGGING operations: they must be coordinated with your DBA and the system as a whole. If you use them and others are not aware of that fact, you may compromise the ability of your DBA to recover your database fully after a media failure. NOLOGGING operations must be used judiciously and carefully.

The important things to note about NOLOGGING operations are as follows:
  • Some amount of redo will be generated, as a matter of fact. This redo is to protect the data dictionary. There is no avoiding this at all. It could be of a significantly lesser amount than before, but there will be some.

  • NOLOGGING does not prevent redo from being generated by all subsequent operations. In the preceding example, I did not create a table that is never logged. Only the single, individual operation of creating the table was not logged. All subsequent “normal” operations such as INSERTs, UPDATEs, DELETEs, and MERGEs will be logged. Other special operations, such as a direct path load using SQL*Loader or a direct path INSERT using the INSERT /*+ APPEND */ syntax, will not be logged (unless and until you ALTER the table and enable full logging again). In general, however, the operations your application performs against this table will be logged.

  • After performing NOLOGGING operations in an ARCHIVELOG mode database, you must take a new baseline backup of the affected datafiles as soon as possible, in order to avoid losing the data created by the NOLOGGING operation due to media failure. Since the data created by the NOLOGGING operation is not in the redo log files, and is not yet in the backups, you have no way of recovering it!
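One way the DBA can spot datafiles that need that fresh backup is RMAN’s REPORT UNRECOVERABLE command, or the unrecoverable columns of V$DATAFILE; a sketch (the datafile number is illustrative):
RMAN> report unrecoverable;
RMAN> backup datafile 5;
SQL> select file#, unrecoverable_change#, unrecoverable_time
       from v$datafile
      where unrecoverable_change# > 0;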

Setting NOLOGGING on an Index

There are two ways to use the NOLOGGING option. You have already seen one method—embedding the NOLOGGING keyword in the SQL command. The other method, which involves setting the NOLOGGING attribute on the segment (index or table), allows certain operations to be performed implicitly in a NOLOGGING mode. For example, I can alter an index or a table to be NOLOGGING by default. This means for the index that subsequent rebuilds of this index will not be logged (the index will not generate redo; other indexes and the table itself might, but this index will not). Using the table T we just created, we can observe
$ sqlplus eoda/foo@PDB1
SQL> select log_mode, force_logging from v$database;
LOG_MODE     FORCE_LOGGING
------------ -----------------
ARCHIVELOG   NO
SQL> create index t_idx on t(object_name);
Index created.
SQL> variable redo number
SQL> exec :redo := get_stat_val( 'redo size' );
PL/SQL procedure successfully completed.
SQL> alter index t_idx rebuild;
Index altered.
SQL> exec dbms_output.put_line( (get_stat_val('redo size')-:redo)
                 || ' bytes of redo generated...');
672264 bytes of redo generated...
PL/SQL procedure successfully completed.
Note

Again, this example was performed in an ARCHIVELOG mode database. You would not see the differences in redo size in a NOARCHIVELOG mode database as the index CREATE and REBUILD operations are not logged in NOARCHIVELOG mode.

When the index is in LOGGING mode (the default), a rebuild of it generated about 670KB of redo. However, we can alter the index:
SQL> alter index t_idx nologging;
Index altered.
SQL> exec :redo := get_stat_val( 'redo size' );
PL/SQL procedure successfully completed.
SQL> alter index t_idx rebuild;
Index altered.
SQL> exec dbms_output.put_line( (get_stat_val('redo size')-:redo)
                    || ' bytes of redo generated...');
39352 bytes of redo generated...
PL/SQL procedure successfully completed.

And now it generates a mere 39KB of redo. But that index is "unprotected" now. If the datafile it resides in failed and had to be restored from a backup, we would lose that index data. Understanding that fact is crucial. The index is not recoverable right now; we need a backup to take place. Alternatively, the DBA could simply re-create the index, since it can be rebuilt directly from the table data.

NOLOGGING Wrap-Up

The operations that may be performed in a NOLOGGING mode are as follows:
  • Index creations and ALTERs (rebuilds).

  • Bulk INSERTs into a table using a direct path INSERT such as that available via the /*+ APPEND */ hint or SQL*Loader direct path loads. The table data will not generate redo, but all index modifications will (the indexes on this nonlogged table will generate redo).

  • LOB operations (updates to large objects do not have to be logged).

  • Table creations via CREATE TABLE AS SELECT.

  • Various ALTER TABLE operations such as MOVE and SPLIT.

Used appropriately on an ARCHIVELOG mode database, NOLOGGING can speed up many operations by dramatically reducing the amount of redo log generated. Suppose you have a table you need to move from one tablespace to another. You can schedule this operation to take place immediately before a backup occurs—you would ALTER the table to be NOLOGGING, move it, rebuild the indexes (without logging as well), and then ALTER the table back to logging mode. Now, an operation that might have taken X hours can happen in X/2 hours perhaps (I’m not promising a 50 percent reduction in runtime!). The appropriate use of this feature includes involving the DBA, or whoever is responsible for database backup and recovery or any standby databases. If that person is not aware that you’re using this feature and a media failure occurs, you may lose data, or the integrity of the standby database might be compromised. This is something to seriously consider.
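As a sketch of that workflow, using a hypothetical table BIG_TABLE with index BIG_TABLE_IDX and a target tablespace NEW_TS (names that are purely illustrative and not part of our examples), it might look like this:
SQL> alter table big_table nologging;
SQL> alter table big_table move tablespace new_ts;
SQL> alter index big_table_idx rebuild nologging;
SQL> alter index big_table_idx logging;
SQL> alter table big_table logging;
-- then take a fresh backup of the affected datafiles as soon as possible
There is nothing magic about the sequence; the point is simply that both the MOVE and the index rebuild run while the segments are in NOLOGGING mode, and that logging is switched back on (and a backup taken) immediately afterward.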

Why Can’t I Allocate a New Log?

I get this question all of the time. You are getting warning messages to this effect (this will be found in alert.log on your server):
Thread 1 cannot allocate new log, sequence 1466
Checkpoint not complete
  Current log# 3 seq# 1465 mem# 0: /.../...redo03.log

It might say Archival required instead of Checkpoint not complete, but the effect is pretty much the same. This is really something the DBA should be looking out for. This message will be written to alert.log on the server whenever the database attempts to reuse an online redo log file and finds that it can't. This happens when DBWn has not yet finished checkpointing the data protected by the redo log or ARCn has not finished copying the redo log file to the archive destination. At this point, the database effectively halts as far as the end user is concerned. It stops cold. DBWn will be given priority to flush the blocks to disk (or ARCn to finish copying the redo log file). Upon completion of the checkpoint or archival, everything goes back to normal. The reason the database suspends user activity is that there is simply no place to record the changes the users are making. Oracle is attempting to reuse an online redo log file, but because either the file would be needed to recover the database in the event of a failure (Checkpoint not complete), or the archiver has not yet finished copying it (Archival required), Oracle must wait (and the end users will wait) until the redo log file can safely be reused.

If you see that your sessions spend a lot of time waiting on a "log file switch," "log buffer space," or "log file switch checkpoint or archival incomplete," you are most likely hitting this. You will notice it during prolonged periods of database modifications if your log files are sized incorrectly, or if DBWn and ARCn need to be tuned by the DBA or system administrator. I frequently see this issue with the "starter" database that has not been customized. The "starter" database typically sizes the redo logs far too small for any significant amount of work (including the initial database build of the data dictionary itself). As soon as you start loading up the database, you will notice that the first 1000 rows go fast, and then things start going in spurts: 1000 go fast, then hang, then go fast, then hang, and so on. These are the indications you are hitting this condition.
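Before changing anything, it is easy to confirm that you are hitting this condition. The following is only a sketch of the sort of sanity checks I would run (the views and columns are standard; your output will differ):
SQL> select group#, bytes/1024/1024 mb, status, archived
       from v$log;
SQL> select event, total_waits, time_waited
       from v$system_event
      where event like 'log file switch%';
Small online redo logs in V$LOG combined with large wait counts on the log file switch events are a strong indication that the logs are being reused faster than DBWn or ARCn can keep up.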

There are a couple of things you can do to solve this issue:
  • Make DBWn faster: Have your DBA tune DBWn by enabling ASYNC I/O, using DBWn I/O slaves, or using multiple DBWn processes . Look at the I/O on the system and see if one disk or a set of disks is “hot” and you need to therefore spread the data out. The same general advice applies for ARCn as well. The pros of this are that you get “something for nothing” here—increased performance without really changing any logic/structures/code. There really are no downsides to this approach.

  • Add more redo log files: This will postpone the Checkpoint not complete message in some cases and, with enough log files, may delay it so long that it never occurs (because you gave DBWn enough breathing room to checkpoint). The same applies to the Archival required message. The benefit of this approach is the removal of the "pauses" in your system. The downside is it consumes more disk, but the benefit far outweighs any downside here.

  • Re-create the log files with a larger size: This will extend the amount of time between the time you fill the online redo log and the time you need to reuse it. The same applies to the Archival required message, if the redo log file usage is "bursty." If you have a period of massive log generation (nightly loads, batch processes) followed by periods of relative calm, then having larger online redo logs can buy enough time for ARCn to catch up during the calm periods. The pros and cons are identical to the preceding approach of adding more files. Additionally, it may postpone a checkpoint from happening until later, since checkpoints happen at each log switch (at least), and the log switches will now be further apart. A sketch of this approach appears after this list.

  • Make checkpointing happen more frequently and more continuously: Use a smaller block buffer cache (not entirely desirable) or various parameter settings such as FAST_START_MTTR_TARGET, LOG_CHECKPOINT_INTERVAL, and LOG_CHECKPOINT_TIMEOUT. This will force DBWn to flush dirty blocks more frequently. The benefit to this approach is that recovery time from a failure is reduced. There will always be less work in the online redo logs to be applied. The downside is that blocks may be written to disk more frequently if they are modified often. The buffer cache will not be as effective as it could be, and it can defeat the block cleanout mechanism discussed in the next section.
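As promised, here is a sketch of the third approach, re-creating the online redo log files with a larger size. The file names, sizes, and group numbers are hypothetical; on a real system, the DBA would add new, larger groups, switch into them, and drop the old groups only once V$LOG shows them as INACTIVE (and archived):
SQL> alter database add logfile group 4
       ('/u01/oradata/redo04.log') size 1g;
SQL> alter database add logfile group 5
       ('/u01/oradata/redo05.log') size 1g;
SQL> alter system switch logfile;
SQL> alter system checkpoint;
-- verify the old group shows as INACTIVE in v$log before dropping it
SQL> alter database drop logfile group 1;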

The approach you take will depend on your circumstances. This is something that must be fixed at the database level, taking the entire instance into consideration.

Block Cleanout

In this section, we’ll discuss block cleanouts, or the removal of “locking”-related information on the database blocks we’ve modified. This concept is important to understand when we talk about the infamous ORA-01555: snapshot too old error in a subsequent section.

If you recall from Chapter 6, we talked about data locks and how they are managed. I described how they are actually attributes of the data, stored on the block header. A side effect of this is that the next time that block is accessed, we may have to clean it out—in other words, remove the transaction information. This action generates redo and causes the block to become dirty if it wasn't already, meaning that a simple SELECT may generate redo and may cause lots of blocks to be written to disk with the next checkpoint. Under most normal circumstances, however, this will not happen. If you have mostly small- to medium-sized transactions (OLTP), or you have a data warehouse that performs direct path loads or uses DBMS_STATS to analyze tables after load operations, you'll find the blocks are generally cleaned for you. If you recall from the earlier section in this chapter, "What Does a COMMIT Do?", one of the steps of COMMIT-time processing is to revisit some blocks if they are still in the SGA and if they are accessible (no one else is modifying them), and then clean them out. This activity is known as a commit cleanout and is the activity that cleans out the transaction information on our modified block. Optimally, our COMMIT can clean out the blocks so that a subsequent SELECT (read) will not have to clean them out. Only an UPDATE of this block would truly clean out our residual transaction information, and since the UPDATE is already generating redo, the cleanout is not noticeable.

We can force a cleanout to not happen, and therefore observe its side effects, by understanding how the commit cleanout works. In a commit list associated with our transaction, Oracle will record lists of blocks we have modified. Each of these lists is 20 blocks long, and Oracle will allocate as many of these lists as it needs—up to a point. If the sum of the blocks we modify exceeds 10 percent of the block buffer cache size, Oracle will stop allocating new lists. For example, if our buffer cache is set to cache 3000 blocks, Oracle will maintain a list of up to 300 blocks (10 percent of 3000). Upon COMMIT, Oracle will process each of these lists of 20 block pointers, and if the block is still available, it will perform a fast cleanout. So, as long as the number of blocks we modify does not exceed 10 percent of the number of blocks in the cache and our blocks are still in the cache and available to us, Oracle will clean them out upon COMMIT. Otherwise, it just skips them (i.e., does not clean them out).
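If you are curious where that threshold sits on your own system, a rough sketch is to look at how many buffers the cache holds and take ten percent of it (V$BUFFER_POOL reports the buffer count per pool):
SQL> select name, buffers, trunc(buffers * 0.10) cleanout_list_limit
       from v$buffer_pool;
The number reported is only an approximation of the commit cleanout limit described above, but it gives you a feel for how large a transaction can be before COMMIT stops tracking its blocks for cleanout.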

With this understanding, we can set up artificial conditions to see how the cleanout works. I set my DB_CACHE_SIZE to a low value of 16MB, which is sufficient to hold 2048 8KB blocks (my block size is 8KB). Next, I create a table such that a row fits on exactly one block—I’ll never have two rows per block. Then I fill this table up with 10,000 rows and COMMIT. We know that 10,000 blocks far exceed 10 percent of 2048, so the database will not be able to clean out all of these dirty blocks upon commit—most of them will not even be in the buffer cache anymore. I’ll measure the amount of redo I’ve generated so far, run a SELECT that will visit each block, and then measure the amount of redo that SELECT generated.

Note

In order for this example to be reproducible and predictable, you’ll need to disable SGA automatic memory management. If that is enabled, there is a chance that the database will increase the size of your buffer cache—defeating the “math” I’ve worked out.
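As a sketch, on a test database (and only a test database) you might do that as follows; the 16MB cache size matches the example, but the exact values are illustrative:
SQL> alter system set memory_target = 0 scope=spfile;
SQL> alter system set sga_target = 0 scope=spfile;
SQL> alter system set db_cache_size = 16m scope=spfile;
-- a restart is required for these spfile-only changes to take effect
SQL> shutdown immediate
SQL> startup
With both MEMORY_TARGET and SGA_TARGET set to zero, the buffer cache stays at the size you gave it and the arithmetic in this example holds.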

Surprisingly to many people, the SELECT will have generated redo. Not only that, but it will also have “dirtied” these modified blocks, causing DBWn to write them again. This is due to the block cleanout. Next, I’ll run the SELECT to visit every block once again and see that no redo is generated. This is expected, as the blocks are all “clean” at this point. We’ll start by creating our table:
$ sqlplus eoda/foo@PDB1
SQL> create table t
     ( id number primary key,
       x char(2000),
       y char(2000),
       z char(2000));
Table created.
SQL> exec dbms_stats.set_table_stats( user, 'T', numrows=>10000, numblks=>10000 );
PL/SQL procedure successfully completed.
I used DBMS_STATS to set table statistics so as to avoid any side effects from hard parsing later (Oracle tends to scan objects that have no statistics during a hard parse, and this side effect would interfere with my example!). So, this is my table with one row per block (in my 8KB block size database). Next, we’ll inspect the block of code we’ll be executing against this table:
SQL> declare
         l_rec t%rowtype;
     begin
         for i in 1 .. 10000
         loop
             select * into l_rec from t where id=i;
         end loop;
     end;
    /
declare
*
ERROR at line 1:
ORA-01403: no data found
ORA-06512: at line 6
That block failed, but that’s OK—we knew it would since there is no data in the table yet. I ran that block simply to get the hard parse of the SQL and PL/SQL performed so when we run it later, we won’t have to worry about side effects from hard parsing being counted. Now we are ready to load the data into our table and commit:
SQL> insert into t
     select rownum, 'x', 'y', 'z'
       from all_objects
      where rownum <= 10000;
10000 rows created.
SQL> commit;
Commit complete.
And, finally, I’m ready to measure the amount of redo generated during the first read of the data:
SQL> variable redo number
SQL> exec :redo := get_stat_val( 'redo size' );
PL/SQL procedure successfully completed.
SQL> declare
         l_rec t%rowtype;
     begin
         for i in 1 .. 10000
         loop
             select * into l_rec from t where id=i;
         end loop;
     end;
    /
PL/SQL procedure successfully completed.
SQL> exec dbms_output.put_line( (get_stat_val('redo size')-:redo)
                           || ' bytes of redo generated...');
802632 bytes of redo generated...
PL/SQL procedure successfully completed.
So, this SELECT generated about 802KB of redo during its processing. This represents the block headers it modified during the index read of the primary key index and the subsequent table read of T. DBWn will be writing these modified blocks back out to disk at some point in the future (actually, since the table doesn’t fit into the cache, we know that DBWn has already written out at least some of them). Now, if I run the query again
SQL> exec :redo := get_stat_val( 'redo size' );
PL/SQL procedure successfully completed.
SQL> declare
         l_rec t%rowtype;
     begin
         for i in 1 .. 10000
         loop
             select * into l_rec from t where id=i;
         end loop;
     end;
    /
PL/SQL procedure successfully completed.
SQL> exec dbms_output.put_line( (get_stat_val('redo size')-:redo)
                 || ' bytes of redo generated...');
0 bytes of redo generated...
PL/SQL procedure successfully completed.

I see that no redo is generated—the blocks are all clean.

If we were to rerun the preceding example with the buffer cache set to hold a little more than 100,000 blocks, we'd find that we generate little to no redo on any of the SELECTs—we will not have to clean dirty blocks during either of our SELECT statements. This is because the 10,000-plus (remember the index was modified as well) blocks we modified fit comfortably into ten percent of our buffer cache, and we are the only users. There is no one else mucking around with the data, and no one else is causing our data to be flushed to disk or accessing those blocks. In a live system, it would be normal for at least some of the blocks to not be cleaned out sometimes.

This behavior will most affect you after a large INSERT (as just demonstrated), UPDATE, or DELETE—one that affects many blocks in the database (anything more than ten percent of the size of the cache will definitely do it). You’ll notice that the first query to touch the block after this will generate a little redo and dirty the block, possibly causing it to be rewritten if DBWn had already flushed it or the instance had been shut down, clearing out the buffer cache altogether. There is not too much you can do about it. It is normal and to be expected. If Oracle didn’t do this deferred cleanout of a block, a COMMIT could take as long to process as the transaction itself. The COMMIT would have to revisit each and every block, possibly reading them in from disk again (they could have been flushed).

If you are not aware of block cleanouts and how they work, they will be one of those mysterious things that just seem to happen for no reason. For example, say you UPDATE a lot of data and COMMIT. Now you run a query against that data to verify the results. The query appears to generate tons of write I/O and redo. It seems impossible if you are unaware of block cleanouts; it was to me the first time I saw it. You go and get someone to observe this behavior with you, but it is not reproducible as the blocks are now “clean” on the second query. You simply write it off as one of those database mysteries—a mystery that only happens when you are alone.

In an OLTP system, you'll probably never see a block cleanout happening, since those systems are characterized by small, short transactions that affect only a few blocks. By design, all or most of the transactions are short and sweet. Modify a couple of blocks and they all get cleaned out. In a warehouse where you make massive UPDATEs to the data after a load, block cleanouts may be a factor in your design. Some operations will create data on "clean" blocks. For example, CREATE TABLE AS SELECT, direct path loaded data, and direct path inserted (using the /*+ APPEND */ hint) data will all create clean blocks. An UPDATE, normal INSERT, or DELETE may create blocks that need to be cleaned with the first read. This could really affect you if your processing consists of
  • Bulk-loading lots of new data into the data warehouse

  • Running UPDATEs on all of the data you just loaded (producing blocks that need to be cleaned out)

  • Letting people query the data

You have to realize that the first query to touch the data will incur some additional processing if the block needs to be cleaned. Realizing this, you yourself should “touch” the data after the UPDATE. You just loaded or modified a ton of data—you need to analyze it at the very least. Perhaps you need to run some reports to validate the load. This will clean the block out and make it so the next query doesn’t have to. Better yet, since you just bulk-loaded the data, you now need to refresh the statistics anyway. Running the DBMS_STATS utility to gather statistics may well clean out all of the blocks as it just uses SQL to query the information and would naturally clean the blocks out as it goes along.
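As a sketch, after bulk-loading and then mass-updating a hypothetical warehouse table BIG_LOAD (not one of our example tables), you might deliberately clean the blocks out yourself before the users arrive:
SQL> update big_load set status = 'PROCESSED';
SQL> commit;
-- either touch every block with a validation report...
SQL> select /*+ full(b) */ count(*) from big_load b;
-- ...or simply gather the statistics you need anyway; DBMS_STATS reads
-- the blocks via SQL and cleans them out as a side effect
SQL> exec dbms_stats.gather_table_stats( user, 'BIG_LOAD' );
Either way, the cost of the cleanout is paid once, by your batch process, instead of by the first unlucky end user query.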

Log Contention

This, like the cannot allocate new log message, is something the DBA must fix, typically in conjunction with the system administrator. However, it is something a developer might detect as well if the DBA isn’t watching closely enough.

If you are faced with log contention, what you might observe is a large wait time on the "log file sync" event and long write times evidenced in the "log file parallel write" event in a Statspack report. If you see this, you may be experiencing contention on the redo logs; they are not being written fast enough. This can happen for many reasons. One application reason (one the DBA can't fix, but the developer must) is that you are committing too frequently—committing inside of a loop doing INSERTs, for example. As demonstrated earlier in the "What Does a COMMIT Do?" section, committing too frequently, aside from being a bad programming practice, is a surefire way to introduce lots of log file sync waits. Assuming all of your transactions are correctly sized (you are not committing more frequently than your business rules dictate), the most common causes for log file waits that I've seen are as follows (a query to confirm these waits appears after the list):
  • Putting redo on a slow device: The disks are just performing poorly. It is time to buy faster disks.

  • Putting redo on the same device as other files that are accessed frequently: Redo is designed to be written with sequential writes and to be on dedicated devices. If other components of your system—even other Oracle components—are attempting to read and write to this device at the same time as LGWR, you will experience some degree of contention. Here, you want to ensure LGWR has exclusive access to these devices if at all possible.

  • Mounting the log devices in a buffered manner: Here, you are using a “cooked” file system (not RAW disks). The operating system is buffering the data, and the database is also buffering the data (redo log buffer). Double buffering slows things down. If possible, mount the devices in a “direct” fashion. How to do this varies by operating system and device, but it is usually possible.

  • Putting redo on a slow technology, such as RAID-5: RAID-5 is great for reads, but it is generally terrible for writes. As we saw earlier regarding what happens during a COMMIT, we must wait for LGWR to ensure the data is on disk. Using any technology that slows this down is not a good idea.
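Here is the query I mentioned earlier for confirming these waits; it is a sketch only (an AWR or Statspack report presents the same information with more context):
SQL> select event, total_waits,
            round(time_waited_micro/1000000) seconds_waited
       from v$system_event
      where event in ('log file sync', 'log file parallel write');
Large accumulated times on log file sync relative to the amount of committing you actually do, or high average times on log file parallel write, point back to the causes in the preceding list.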

If at all possible, you really want at least five dedicated devices for logging and optimally six to mirror your archives as well. In these days of 200GB, 300GB, 1TB, and larger disks, this is getting harder, but if you can set aside four of the smallest, fastest disks you can find and one or two big ones, you can affect LGWR and ARCn in a positive fashion. To lay out the disks, you would break them into three groups (see Figure 9-6):
  • Redo log group 1: Disks 1 and 3

  • Redo log group 2: Disks 2 and 4

  • Archive: Disk 5 and optionally disk 6 (the big disks)

Figure 9-6. Optimal redo log configuration

You would place redo log group 1 with members A and B onto disks 1 and 3. You would place redo log group 2 with members C and D onto disks 2 and 4. If you have groups 3, 4, and so on, they’d go onto the odd and even groups of disks, respectively. The effect of this is that LGWR, when the database is currently using group 1, will write to disks 1 and 3 simultaneously. When this group fills up, LGWR will move to disks 2 and 4. When they fill up, LGWR will go back to disks 1 and 3. Meanwhile, ARCn will be processing the full online redo logs and writing them to disks 5 and 6, the big disks. The net effect is neither ARCn nor LGWR is ever reading a disk being written to, or writing to a disk being read from, so there is no contention (see Figure 9-7).
Figure 9-7. Redo log flow

So, when LGWR is writing group 1, ARCn is reading group 2 and writing to the archive disks. When LGWR is writing group 2, ARCn is reading group 1 and writing to the archive disks. In this fashion, LGWR and ARCn each have their own dedicated devices and will not be contending with anyone, not even each other.
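To build that layout, each online redo log group gets two members, one on each disk of its pair, and the archive destination points at the big disks. A sketch, assuming you are creating the groups from scratch and using purely hypothetical mount points (on an existing database, you would add new groups/members and drop the old ones instead):
SQL> alter database add logfile group 1
       ('/disk1/redo01a.log', '/disk3/redo01b.log') size 1g;
SQL> alter database add logfile group 2
       ('/disk2/redo02a.log', '/disk4/redo02b.log') size 1g;
SQL> alter system set log_archive_dest_1 = 'location=/disk5/arch';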

Temporary Tables and Redo

Oracle comes with two types of temporary tables: global and private. Global temporary tables have been a feature within Oracle for a long time. A global temporary table is a permanent database object: its definition persists on disk and is visible to all sessions. It is called temporary because the data within it is private to the session that inserted it and persists only for the duration of the transaction or session.

Starting with Oracle 18c, you can create a private temporary table. Unlike a global temporary table, a private temporary table exists only in memory and is only visible to the session that created it. You can define a private temporary table to persist for the duration of a transaction or a session, after which it is automatically dropped.

Note

If you’ve worked with other database technologies such as SQL Server or MySQL, a private temporary table aligns more with what you’ve used in those environments.

Even though global temporary tables have been around for a while, there is still some confusion surrounding them, in particular in the area of logging. In this section, we’ll also explore the question “How do temporary tables work with respect to logging of changes?”

Global Temporary Tables

You can instruct Oracle to store the undo for a temporary table in a temporary tablespace via the TEMP_UNDO_ENABLED parameter. When blocks are modified in a temporary tablespace, no redo is generated. Therefore, when TEMP_UNDO_ENABLED is set to TRUE, any DML issued against a temporary table will generate little or no redo.

Note

By default, TEMP_UNDO_ENABLED is set to FALSE.

To fully understand the impact of TEMP_UNDO_ENABLED on redo generation, let’s first look at the behavior when TEMP_UNDO_ENABLED is set to FALSE. I set up a small test to demonstrate the amount of redo generated while working with temporary tables, an indication therefore of the amount of undo generated for temporary tables, since only the undo is logged for them. To demonstrate, I’ll take identically configured permanent and temporary tables and then perform the same operations on each, measuring the amount of redo generated each time. The tables I’ll use are as follows:
$ sqlplus eoda/foo@PDB1
SQL> create table perm
     ( x char(2000) ,
       y char(2000) ,
       z char(2000)  );
Table created.
SQL> create global temporary table temp
     ( x char(2000) ,
       y char(2000) ,
       z char(2000)  )
     on commit preserve rows;
Table created.
I set up a small stored procedure to allow me to perform arbitrary SQL and report the amount of redo generated by that SQL. I’ll use this routine to perform INSERTs, UPDATEs, and DELETEs against both the temporary and permanent tables:
SQL> create or replace procedure do_sql( p_sql in varchar2 )
     as
         l_start_redo    number;
         l_redo          number;
     begin
             l_start_redo := get_stat_val( 'redo size' );
             execute immediate p_sql;
             commit;
            l_redo := get_stat_val( 'redo size' ) - l_start_redo;
             --
            dbms_output.put_line
            ( to_char(l_redo,'99,999,999') ||' bytes of redo generated for "' ||
             substr( replace( p_sql, chr(10), ' '), 1, 25 ) || '"...' );
    end;
  /
Procedure created.
Then I ran equivalent INSERTs, UPDATEs, and DELETEs against both the PERM and TEMP tables :
SQL> set serveroutput on format wrapped
SQL> begin
         do_sql( 'insert into perm
                  select 1,1,1
                    from all_objects
                   where rownum <= 500' );
         do_sql( 'insert into temp
                  select 1,1,1
                    from all_objects
                  where rownum <= 500' );
        dbms_output.new_line;
        do_sql( 'update perm set x = 2' );
        do_sql( 'update temp set x = 2' );
        dbms_output.new_line;
        do_sql( 'delete from perm' );
        do_sql( 'delete from temp' );
    end;
  /
  3,313,088 bytes of redo generated for "insert into perm"...
     72,584 bytes of redo generated for "insert into temp"...
  3,268,384 bytes of redo generated for "update perm set x = 2"...
  1,946,432 bytes of redo generated for "update temp set x = 2"...
  3,245,112 bytes of redo generated for "delete from perm"...
  3,224,460 bytes of redo generated for "delete from temp"...
 
PL/SQL procedure successfully completed.
As you can see
  • The INSERT into the “real” table generated a lot of redo, while almost no redo was generated for the temporary table. This makes sense—there is very little undo data generated for INSERTs, and only undo data is logged for temporary tables.

  • The UPDATE of the real table generated about twice the amount of redo as the temporary table. Again, this makes sense. About half of the information for that UPDATE, the "before image" (undo), had to be saved. The "after image" (redo for the modified blocks) of the temporary table did not have to be saved.

  • The DELETEs each took about the same amount of redo space. This makes sense, because the undo for a DELETE is big, but the redo for the modified blocks is very small. Hence, a DELETE against a temporary table takes place very much in the same fashion as a DELETE against a permanent table.

Therefore, the following generalizations can be made regarding DML activity on temporary tables:
  • An INSERT will generate little to no undo/redo activity.

  • An UPDATE will generate about half the redo as with a permanent table.

  • A DELETE will generate the same amount of redo as with a permanent table.

Now, I’ll run the prior tests with the TEMP_UNDO_ENABLED set to TRUE. The TEMP_UNDO_ENABLED parameter can be set at the session or system level. Here’s an example of setting it to TRUE at the session level:
SQL> alter session set temp_undo_enabled=true;
Once enabled for a session, any modifications to data in a temporary table in that session will have their undo logged to the temporary tablespace. Any modifications to permanent tables will still have undo logged to the undo tablespace. To see the impact of this, I'll run some code that displays the amount of redo generated when issuing transactions against a permanent table and a temporary table—with the only addition being that TEMP_UNDO_ENABLED is set to TRUE. Here is the output:
3,312,148 bytes of redo generated for "insert into perm"...
      376 bytes of redo generated for "insert into temp"...
2,203,788 bytes of redo generated for "update perm set x = 2"...
      376 bytes of redo generated for "update temp set x = 2"...
3,243,412 bytes of redo generated for "delete from perm"...
      376 bytes of redo generated for "delete from temp"...

The results are dramatic: a trivial amount of redo is generated by the INSERT, UPDATE, and DELETE statements in a temporary table. For environments where you perform large batch operations that transact against temporary tables, you can expect to see a significant reduction in the amount of redo generated.

Note

You may be wondering why there were 376 bytes of redo generated in the prior example’s output. As processes consume space within the database, Oracle does some internal housekeeping. These changes are recorded in the data dictionary, which in turn generates some redo and undo.

It’s worth noting that in an Oracle Active Data Guard configuration, you can issue DML statements directly on a temporary table that exists in a standby database. We can view the amount of redo generated for a temporary table in a standby database by running the same code again against a standby database. The only difference is the statements issuing transactions against permanent tables must be removed (because you cannot issue DML on a permanent table in a standby database). Here is the output showing that 0 bytes of redo are generated:
0 bytes of redo generated for "insert into temp"...
0 bytes of redo generated for "update temp set x = 2"...
0 bytes of redo generated for "delete from temp"...
Note

There’s no need to set TEMP_UNDO_ENABLED in the standby database. This is because temporary undo is always enabled in an Oracle Active Data Guard standby database.

Global temporary tables are often used for reporting purposes—like generating and storing intermediate query results. Oracle Active Data Guard is often used to offload reporting applications to the standby database. Couple global temporary tables with Oracle Active Data Guard, and you have a more powerful tool to address your reporting requirements.

Private Temporary Tables

In regard to undo generation, private temporary tables behave similarly to global temporary tables. We can see that by running the same tests from the prior section to measure the amount of undo generated when using a temporary table. First, let's create the permanent and the private temporary tables:
$ sqlplus eoda/foo@PDB1
SQL> alter session set temp_undo_enabled=false;
SQL> create table perm
  ( x char(2000) ,
    y char(2000) ,
    z char(2000)  );
SQL> CREATE PRIVATE TEMPORARY TABLE ora$ptt_temp
  ( x char(2000) ,
    y char(2000) ,
    z char(2000))
ON COMMIT PRESERVE DEFINITION;
Next, let’s measure the amount of redo generated when transacting against both of the prior tables:
SQL> set serveroutput on format wrapped
SQL> begin
      do_sql( 'insert into perm
               select 1,1,1
               from all_objects
               where rownum <= 500' );
      do_sql( 'insert into ora$ptt_temp
               select 1,1,1
               from all_objects
               where rownum <= 500' );
      dbms_output.new_line;
      do_sql( 'update perm set x = 2' );
      do_sql( 'update ora$ptt_temp set x = 2' );
      dbms_output.new_line;
      do_sql( 'delete from perm' );
      do_sql( 'delete from ora$ptt_temp' );
end;
/
As you can see, the redo generated closely matches the behavior of a global temporary table with TEMP_UNDO_ENABLED set to FALSE:
  3,324,888 bytes of redo generated for "insert into perm         "...
     72,548 bytes of redo generated for "insert into ora$ptt_temp "...
  5,408,292 bytes of redo generated for "update perm set x = 2"...
  1,937,060 bytes of redo generated for "update ora$ptt_temp set x"...
  3,252,296 bytes of redo generated for "delete from perm"...
  3,225,464 bytes of redo generated for "delete from ora$ptt_temp"...
If we run the same tests with TEMP_UNDO_ENABLED set to TRUE, you can see the dramatic difference in undo generation:
  3,324,828 bytes of redo generated for "insert into perm         "...
        384 bytes of redo generated for "insert into ora$ptt_temp "...
  2,201,092 bytes of redo generated for "update perm set x = 2"...
        384 bytes of redo generated for "update ora$ptt_temp set x"...
  3,251,936 bytes of redo generated for "delete from perm"...
        384 bytes of redo generated for "delete from ora$ptt_temp"...

The bottom line is that when you use either global temporary tables or private temporary tables, if you want to minimize undo generation, set TEMP_UNDO_ENABLED to TRUE.

Investigating Undo

The Oracle database creates and manages information used to roll back (or undo) changes to the database. The most obvious use of undo is when you issue a ROLLBACK statement to undo changes to data that you don’t want committed. Here is the complete list of uses of undo:
  • Roll back transactions via the ROLLBACK statement

  • Enable read consistency

  • Recover the database

  • Analyze data as of a prior point in time using Oracle Flashback Query

  • Recover from logical corruptions using Oracle Flashback features

We’ve already discussed several of the prior undo segment topics. We’ve seen how they are used during recovery, how they interact with the redo logs, and how they are used for consistent reads and nonblocking reads of data. In this section, we’ll look at the most frequently raised issues with undo segments.

The bulk of our time will be spent on the infamous ORA-01555: snapshot too old error, as this single issue causes more confusion than any other topic in the entire set of database topics. Before we do this, however, we’ll investigate one other undo-related issue: the question of what type of DML operation generates the most and least undo (you might already be able to answer that yourself, given the examples in the preceding section with temporary tables).

What Generates the Most and Least Undo?

This is a frequently asked but easily answered question. The presence of indexes (or the fact that a table is an index-organized table) may affect the amount of undo generated dramatically, as indexes are complex data structures and may generate copious amounts of undo information.

That said, an INSERT will, in general, generate the least amount of undo, since all Oracle needs to record for this is a rowid to “delete.” An UPDATE is typically second in the race (in most cases). All that needs to be recorded are the changed bytes. It is most common that you UPDATE some small fraction of the entire row’s data. Therefore, a small fraction of the row must be remembered in the undo. Many of the previous examples run counter to this rule of thumb, but that’s because they update large, fixed-size rows and they update the entire row. It is much more common to UPDATE a row and change a small percentage of the total row. A DELETE will, in general, generate the most undo. For a DELETE, Oracle must record the entire row’s before image into the undo segment. The previous temporary table example, with regard to redo generation, demonstrated that fact: the DELETE generated the most redo, and since the only logged element of the DML operation on a temporary table is the undo, we in fact observed that the DELETE generated the most undo. The INSERT generated very little undo that needed to be logged. The UPDATE generated an amount equal to the before image of the data that was changed, and the DELETE generated the entire set of data written to the undo segment.

As previously mentioned, you must also take into consideration the work performed on an index. You’ll find that an update of an unindexed column not only executes much faster, it also tends to generate significantly less undo than an update of an indexed column. For example, we’ll create a table with two columns, both containing the same information, and index one of them:
$ sqlplus eoda/foo@PDB1
SQL> create table t
     as
     select object_name unindexed,
            object_name indexed
       from all_objects;
Table created.
SQL> create index t_idx on t(indexed);
Index created.
SQL> exec dbms_stats.gather_table_stats(user,'T');
PL/SQL procedure successfully completed.
Note

You will not see the exact same results when you run these tests on your database. It will depend on the number of objects in your database, which will be different from database to database.

Now we’ll update the table, first updating the unindexed column and then the indexed column. We’ll need a new V$ query to measure the amount of undo we’ve generated in each case. The following query accomplishes this for us. It works by getting our session ID (SID) from V$MYSTAT, using that to find our record in the V$SESSION view, and retrieving the transaction address (TADDR) . It uses the TADDR to pull up our V$TRANSACTION record (if any) and selects the USED_UBLK column—the number of used undo blocks. Since we currently are not in a transaction, we expect it to return zero rows right now:
SQL> select used_ublk
        from v$transaction
        where addr = (select taddr
                      from v$session
                      where sid = (select sid
                                   from v$mystat
                                   where rownum = 1
                              )
                     );
no rows selected
But the query will return a row after the UPDATE starts a transaction:
SQL> update t set unindexed = lower(unindexed);
72522 rows updated.
SQL> select used_ublk
       from v$transaction
      where addr = (select taddr
                      from v$session
                     where sid = (select sid
                                   from v$mystat
                                   where rownum = 1
                                 )
                   );
USED_UBLK
----------
      1369
SQL> commit;
Commit complete.
That UPDATE used 1369 blocks to store its undo. The commit would free that up, or release it, so if we rerun the query against V$TRANSACTION, it would once again show us no rows selected. When we update the same data—only indexed this time—we’ll observe the following:
SQL> update t set indexed = lower(indexed);
72522 rows updated.
SQL> select used_ublk
       from v$transaction
      where addr = (select taddr
                     from v$session
                     where sid = (select sid
                                   from v$mystat
                                   where rownum = 1
                                 )
                   );
USED_UBLK
----------
      3154

As you can see, updating that indexed column in this example generated more than twice as much undo. This is due to the inherent complexity of the index structure itself and the fact that we updated every single row in the table—moving every single index key value in this structure.

ORA-01555: Snapshot Too Old Error

In the last chapter, we briefly investigated the ORA-01555 error and looked at one cause of it: committing too frequently. Here, we’ll take a much more detailed look at the causes and solutions for the ORA-01555 error. ORA-01555 is one of those errors that confound people. It is the foundation for many myths, inaccuracies, and suppositions.

Note

ORA-01555 is not related to data corruption or data loss at all. It is a “safe” error in that regard; the only outcome is that the query that received this error is unable to continue processing.

The error is actually straightforward and has only two real causes, but since there’s a special case of one of them that happens so frequently, I’ll say that there are three:
  • The undo segments are too small for the work you perform on your system.

  • Your programs fetch across COMMITs (actually a variation on the preceding point). We covered this in Chapter 8.

  • Block cleanout.

The first two points are directly related to Oracle’s read consistency model. As you recall from Chapter 7, the results of your query are preordained, meaning they are well defined before Oracle goes to retrieve even the first row. Oracle provides this consistent point-in-time “snapshot” of the database by using the undo segments to roll back blocks that have changed since your query began. Every statement you execute, such as the following
SQL> update t set x = 5 where x = 2;
SQL> insert into t select * from t where x = 2;
SQL> delete from t where x = 2;
SQL> select * from t where x = 2;

will see a read-consistent view of T and the set of rows where X=2, regardless of any other concurrent activity in the database.

Note

The four statements presented here are just examples of the types of statements that would see a read-consistent view of T. They are not meant to be run as a single transaction in the database, as the first update would cause the following three statements to see no records. They are purely illustrative.

All statements that "read" the table take advantage of this read consistency. In the example just shown, the UPDATE reads the table to find rows where X=2 (and then UPDATEs them). The INSERT reads the table to find rows where X=2, and then INSERTs them, and so on. It is this dual use of the undo segments, both to roll back failed transactions and to provide for read consistency, that results in the ORA-01555 error.

The third item in the previous list is a more insidious cause of ORA-01555 in that it can happen in a database where there is a single session, and this session is not modifying the tables that are being queried when the ORA-01555 error is raised! This doesn’t seem possible—why would we need undo data for a table we can guarantee is not being modified? We’ll find out shortly.

Before we take a look at all three cases with illustrations, I’d like to share with you the solutions to the ORA-01555 error, in general:
  • If your undo tablespace has autoextend enabled, set the parameter UNDO_RETENTION properly (larger than the amount of time it takes to execute your longest-running transaction or the longest expected flashback operation). V$UNDOSTAT can be used to determine the duration of your long-running queries (a sketch appears after this list). Also, ensure sufficient space on disk has been set aside so the undo segments are allowed to grow to the size they need to be based on the requested UNDO_RETENTION.

  • If your undo tablespace is a fixed size, then the UNDO_RETENTION parameter is ignored, and Oracle automatically tunes undo. For fixed-size undo tablespaces, choose a size sufficiently large enough to accommodate long-running queries and flashback operations.

  • Reduce the runtime of your query (tune it). This is always a good thing if possible, so it might be the first thing you try. It reduces the need for larger undo segments. This method goes toward solving the previous points.

  • Gather statistics on related objects. This helps avoid the block cleanout point listed earlier. Since the block cleanout is the result of a very large mass UPDATE or INSERT, statistics gathering needs to be done anyway after a mass UPDATE or large load.
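Here is the sketch referred to in the first bullet: using V$UNDOSTAT to find the longest query observed recently and then setting UNDO_RETENTION with some headroom. The two-day window and the final value of 3600 seconds are illustrative only:
SQL> select max(maxquerylen) longest_query_seconds
       from v$undostat
      where begin_time > sysdate - 2;
SQL> alter system set undo_retention = 3600;
MAXQUERYLEN is reported in seconds, so you would pick an UNDO_RETENTION comfortably above the largest value you see (and above your longest flashback window).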

We’ll come back to these solutions, as they are important to know. It seemed appropriate to display them prominently before we begin.

Undo Segments Are in Fact Too Small

The scenario is this: you have a system where the transactions are small. As a result, you need very little undo segment space allocated. Say, for example, the following is true:

  • Each transaction generates 8KB of undo on average.

  • You do five of these transactions per second on average (40KB of undo per second, 2400KB per minute).

  • You have a transaction that generates 1MB of undo that occurs once per minute on average. In total, you generate about 3.5MB of undo per minute.

  • You have an exceedingly small amount of undo configured for your system.

That is more than sufficient undo for this database when processing transactions. The undo segments will wrap around and reuse space about every three to four minutes or so, on average. If you sized undo segments based on your transactions that do modifications, you did all right.

In this same environment, however, you have some reporting needs. Some of these queries take a really long time to run—five minutes, perhaps. Here is where the problem comes in. If these queries take five minutes to execute and they need a view of the data as it existed when the query began, you have a very good probability of the ORA-01555 error occurring. Since your undo segments will wrap during this query execution, you know that some undo information generated since your query began is gone—it has been overwritten. If you hit a block that was modified near the time you started your query, the undo information for this block will be missing, and you will receive the ORA-01555 error.

Here’s a small example. Let’s say we have a table with blocks 1, 2, 3, … 1,000,000 in it. Table 9-2 shows a sequence of events that could occur.
Table 9-2

Long-Running Query Timeline

Time (Minutes:Seconds)

Action

0:00

Our query begins.

0:01

Another session UPDATEs block 1,000,000. Undo information for this is recorded into some undo segment.

0:01

This UPDATE session COMMITs. The undo data it generated is still there, but is now subject to being overwritten if we need the space.

1:00

Our query is still chugging along. It is at block 200,000.

1:01

Lots of activity going on. We have generated a little over 3.5MB of undo by now.

3:00

Our query is still going strong. We are at block 600,000 or so by now.

4:00

Our undo segments start to wrap around and reuse the space that was active when our query began at time 0:00. Specifically, we have just reused the undo segment space that the UPDATE to block 1,000,000 used back at time 0:01.

5:00

Our query finally gets to block 1,000,000. It finds it has been modified since the query began. It goes to the undo segment and attempts to find the undo for that block to get a consistent read on it. At this point, it discovers the information it needs no longer exists. ORA-01555 is raised and the query fails.

This is all it takes. If your undo segments are sized such that they have a good chance of being reused during the execution of your queries, and your queries access data that will probably be modified, you stand a very good chance of hitting the ORA-01555 error on a recurring basis. If this is the case, you must set your UNDO_RETENTION parameter higher and let Oracle take care of figuring out how much undo to retain (this is the suggested approach; it’s much easier than trying to figure out the perfect undo size yourself) or resize your undo segments and make them larger (or have more of them). You need enough undo configured to last as long as your long-running queries. The system was sized for the transactions that modify data—you forgot to size for the other components of the system.

There are two methods to manage undo in the system:
  • Automatic undo management: Here, Oracle is told how long to retain undo for, via the UNDO_RETENTION parameter. Oracle will determine how many undo segments to create based on concurrent workload and how big each should be. The database can even reallocate extents between individual undo segments at runtime to meet the UNDO_RETENTION goal set by the DBA. This is the recommended approach for undo management.

  • Manual undo management: Don’t use this method. This is a relic of the early versions of Oracle.

With manual undo management, DBAs had the choice of manually setting the size and number of the undo segments. Don’t use the manual method. Knowledge of this topic is really only relevant when trying to impress older DBAs on the team.
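A quick sketch of how to confirm which mode and retention settings a given database is using (the parameter and view names are standard; the values you see are whatever your DBA chose):
SQL> show parameter undo_management
SQL> show parameter undo_retention
SQL> select tablespace_name, retention
       from dba_tablespaces
      where contents = 'UNDO';
UNDO_MANAGEMENT will be AUTO on any reasonably configured modern database; the RETENTION column of DBA_TABLESPACES additionally tells you whether the undo tablespace guarantees the retention period (GUARANTEE) or treats it only as a target (NOGUARANTEE).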

Under automatic undo management, things are much easier from the ORA-01555 perspective. The DBA first creates an autoextending undo tablespace and then tells the database how long the longest-running query (or flashback operation) is and sets that value in the UNDO_RETENTION parameter. Oracle will attempt to preserve undo for at least that duration of time. If sufficient space to grow has been allocated, Oracle will extend an undo segment and not wrap around—in trying to obey the UNDO_RETENTION period. This is in direct contrast to manually managed undo, which will wrap around and reuse undo space as soon as it can. It is primarily for this reason, the support of the UNDO_RETENTION parameter, that I highly recommend automatic undo management whenever possible. That single parameter reduces the possibility of an ORA-01555 error greatly (when it is set appropriately).

If you have a fixed-size undo tablespace, then the UNDO_RETENTION parameter is ignored by Oracle. In this configuration, you can use tools such as the Undo Advisor to help you appropriately size your undo tablespace. The Undo Advisor is accessible via Enterprise Manager or the DBMS_ADVISOR PL/SQL packages. The two most important considerations in sizing your undo are the length of the longest-running query and the longest interval you require for flashback operations.

I am getting a little too deep into the DBA role at this point, so we’ll move on to the next case. It’s just important that you understand that the ORA-01555 error in this case is due to the system not being sized correctly for your workload. The only solution is to size correctly for your workload. It is not your fault, but it is your problem since you hit it. It’s the same as if you run out of temporary space during a query. You either configure sufficient temporary space for the system, or you rewrite the queries so they use a plan that does not require temporary space.

To demonstrate this effect, we can set up a small, but somewhat artificial, test. We’ll create a very small undo tablespace with one session that will generate many small transactions, virtually assuring us that it will wrap around and reuse its allocated space many times—regardless of the UNDO_RETENTION setting , since we are not permitting the undo tablespace to grow. The session that uses this undo segment will be modifying a table, T. It will use a full scan of T and read it from “top” to “bottom.” In another session, we will execute a query that will read the table T via an index. In this fashion, it will read the table somewhat randomly: it will read row 1, then row 1000, then row 500, then row 20,001, and so on. In this way, we will tend to visit blocks very randomly and perhaps many times during the processing of our query. The odds of getting an ORA-01555 error in this case are virtually 100 percent.

So, in one session we start by connecting to the pluggable database (not the root container) and creating a new undo tablespace within it:
$ sqlplus eoda/foo@PDB1
SQL> create undo tablespace undo_small
     datafile '/tmp/undo.dbf' size 2m
     autoextend off;
Tablespace created.
SQL> alter system set undo_tablespace = undo_small;
System altered.
Now, we’ll set up the table T to query and modify. Note that we are ordering the data randomly in this table. The CREATE TABLE AS SELECT tends to put the rows in the blocks in the order it fetches them from the query. We’ll just scramble the rows up so they are not artificially sorted in any order, randomizing their distribution:
SQL> drop table t purge;
Table dropped.
SQL> create table t
     as
     select *
       from all_objects
      order by dbms_random.random;
Table created.
SQL> alter table t add constraint t_pk primary key(object_id);
Table altered.
SQL> exec dbms_stats.gather_table_stats( user, 'T', cascade=> true );
PL/SQL procedure successfully completed.
And now we are ready to do our modifications:
SQL> begin
         for x in ( select rowid rid from t )
         loop
             update t set object_name = lower(object_name) where rowid = x.rid;
             commit;
         end loop;
     end;
    /
Now, while that PL/SQL block of code is running, we will run a query in another session. That other query will read table T and process each record. It will spend about 1/100 of a second processing each record before fetching the next (simulated using DBMS_LOCK.SLEEP(0.01)). We will use the FIRST_ROWS hint in the query to have it use the index we created to read the rows out of the table via the index sorted by OBJECT_ID. Since the data was randomly inserted into the table, we would tend to query blocks in the table rather randomly. This block will only run for a couple of seconds before failing:
SQL> declare
         cursor c is
         select /*+ first_rows */ object_name
           from t
          order by object_id;
         l_object_name t.object_name%type;
         l_rowcnt      number := 0;
     begin
        open c;
        loop
            fetch c into l_object_name;
            exit when c%notfound;
            dbms_lock.sleep( 0.01 );
            l_rowcnt := l_rowcnt+1;
        end loop;
        close c;
    exception
        when others then
            dbms_output.put_line( 'rows fetched = ' || l_rowcnt );
            raise;
  end;
  /
rows fetched = 191
declare
*
ERROR at line 1:
ORA-01555: snapshot too old: rollback segment number 39 with name
"_SYSSMU39_2997928315$" too small
As you can see, it processed only a handful of records before failing with the ORA-01555: snapshot too old error. To correct this, we want to make sure two things are done:
  • UNDO_RETENTION is set in the database to be at least long enough for this read process to complete. That will allow the database to grow the undo tablespace to hold sufficient undo for us to complete.

  • The undo tablespace is allowed to grow, or you manually allocate more disk space to it.

For this example, I have determined my long-running process takes about 720 seconds to complete (I have about 72,000 records in the table, so at 0.01 seconds per row we have 720 seconds). My UNDO_RETENTION is set to 900 (this is in seconds, so the undo retention is about 15 minutes). I altered the undo tablespace’s datafile to permit it to grow by 1MB at a time, up to 2GB in size:
SQL> alter database
     datafile '/tmp/undo.dbf'
     autoextend on
     next 1m
     maxsize 2048m;
Database altered.
When I ran the processes concurrently again, both ran to completion. The undo tablespace’s datafile grew this time, because it was allowed to and the undo retention I set up said to.
SQL> select bytes/1024/1024
       from dba_data_files
     where tablespace_name = 'UNDO_SMALL';
BYTES/1024/1024
---------------
             21

So, instead of receiving an error, we completed successfully, and the undo grew to be large enough to accommodate our needs. It is true that in this example, getting the error was purely due to the fact that we read the table T via the index and performed random reads all over the table. If we had rapidly full scanned the table instead, there is a good chance we would not have received the ORA-01555 error in this particular case. This is because both the SELECT and UPDATE would have been full scanning T, and the SELECT could most likely race ahead of the UPDATE during its scan (the SELECT just has to read, but the UPDATE must read and update and therefore could go slower). By doing the random reads, we increase the probability that the SELECT will need to read a block that the UPDATE modified and committed many rows ago. This just demonstrates the somewhat insidious nature of the ORA-01555 error. Its occurrence depends on how concurrent sessions access and manipulate the underlying tables.

Delayed Block Cleanout

This cause of the ORA-01555 error is hard to eliminate entirely, but it is rare anyway, as the circumstances under which it occurs do not happen frequently. We have already discussed the block cleanout mechanism, but to summarize, it is the process whereby the next session to access a block after it has been modified may have to check to see if the transaction that last modified the block is still active. Once the process determines that the transaction is not active, it cleans out the block so that the next session to access it does not have to go through the same process again. To clean out the block, Oracle determines the undo segment used for the previous transaction (from the block’s header) and then determines whether the undo header indicates that the transaction has been committed and, if so, when it committed. This confirmation is accomplished in one of two ways. One way is that Oracle can determine that the transaction committed a long time ago, even though its transaction slot has been overwritten in the undo segment transaction table. The other way is that the COMMIT SCN is still in the transaction table of the undo segment, meaning the transaction committed a short time ago, and its transaction slot hasn’t been overwritten.
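As an aside, one hedged way to watch block cleanout happen in your own session is to measure the redo generated by a simple SELECT against a table you have just mass-updated and committed; a SELECT that produces redo is cleaning out blocks as it reads them. In the sketch below, the table name T_DEMO is hypothetical (any freshly modified, not-yet-cleaned-out table of your own will do), and it assumes you can query V$MYSTAT and V$STATNAME:
SQL> select a.value
       from v$mystat a, v$statname b
      where a.statistic# = b.statistic#
        and b.name = 'redo size';
SQL> select count(*) from t_demo;
SQL> select a.value
       from v$mystat a, v$statname b
      where a.statistic# = b.statistic#
        and b.name = 'redo size';
The difference between the two redo size values is, for the most part, the redo generated by the cleanout the full scan performed.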

To receive the ORA-01555 error from a delayed block cleanout, all of the following conditions must be met:
  • A modification is made and COMMITed, and the blocks are not cleaned out automatically (e.g., the transaction modified more blocks than can fit in ten percent of the SGA block buffer cache).

  • These blocks are not touched by another session and will not be touched until our unfortunate query (displayed shortly) hits them.

  • A long-running query begins. This query will ultimately read some of those blocks from earlier. This query starts at SCN t1, the read-consistent SCN it must roll data back to in order to achieve read consistency. The transaction entry for the modification transaction is still in the undo segment transaction table when we begin.

  • During the query, many commits are made in the system. These transactions don’t touch the blocks in question (if they did, we wouldn’t have the impending problem as they would clean out the old transaction—solving the cleanout issue).

  • The transaction tables in the undo segments roll around and reuse slots due to the high degree of COMMITs. Most important, the transaction entry for the original modification transaction is cycled over and reused. In addition, the system has reused undo segment extents, preventing a consistent read on the undo segment header block itself.

  • Additionally, the lowest SCN recorded in the undo segment now exceeds t1 (it is higher than the read-consistent SCN of the query), due to the large number of commits.

When our query gets to the block that was modified and committed before it began, it is in trouble. Normally, it would go to the undo segment pointed to by the block and find the status of the transaction that modified it (in other words, it would find the COMMIT SCN of that transaction). If the COMMIT SCN is less than t1, our query can use this block. If the COMMIT SCN is greater than t1, our query must roll back that block. The problem is, however, that our query is unable to determine in this particular case if the COMMIT SCN of the block is greater than or less than t1. It is unsure as to whether it can use that block image or not. The ORA-01555 error then results.

To see this, we will create many blocks in a table that need to be cleaned out. We will then open a cursor on that table and allow many small transactions to take place against some other table—not the table we just updated and opened the cursor on. Finally, we will attempt to fetch the data for the cursor. Now, we know that the data required by the cursor will be “OK”—we should be able to see all of it since the modifications to the table would have taken place and been committed before we open the cursor. When we get an ORA-01555 error this time, it will be because of the previously described issue with delayed block cleanout. To set up for this example, we’ll use
  • The 4MB UNDO_SMALL undo tablespace.

  • A 300MB SGA, so that some of the dirty blocks we create get flushed to disk, which is what lets us observe this phenomenon (a sketch of sizing the SGA follows this list).
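If your test instance's SGA is much larger than this, DBWn may never need to write the dirty blocks out before you open the cursor, and the demonstration may not reproduce. The following is only a minimal sketch of resizing the SGA, assuming a suitably privileged connection, an spfile-based instance, and SGA_TARGET-style (not MEMORY_TARGET) memory management:
SQL> alter system set sga_target = 300m scope=spfile;
SQL> -- restart the instance (shutdown immediate / startup) for the new size to take effect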

Before we start, we’ll create the undo tablespace and the “big” table we’ll be querying:
$ sqlplus eoda/foo@PDB1
SQL> create undo tablespace undo_small
     datafile '/tmp/undo.dbf' size 4m
     autoextend off;
Tablespace created.
SQL> create table big  as
     select a.*, rpad('*',1000,'*') data
      from all_objects a;
Table created.
SQL> alter table big add constraint big_pk  primary key(object_id);
Table altered.
SQL> exec dbms_stats.gather_table_stats( user, 'BIG' );
PL/SQL procedure successfully completed.
Note

You might wonder why I didn’t use CASCADE=>TRUE on the gather statistics call to gather statistics on the index created by default by the primary key constraint. That is because a CREATE INDEX or ALTER INDEX REBUILD has implicit compute statistics added to it already whenever the table it is indexing is not empty. So, the very act of creating the index has the side effect of gathering statistics on itself. There’s no need to regather the statistics we already have.
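If you want to confirm that the primary key index picked up statistics as a side effect of its creation, a quick hedged check against the data dictionary looks like this:
SQL> select index_name, num_rows, last_analyzed
       from user_indexes
      where table_name = 'BIG';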

The previous table will have lots of blocks as we get about six or seven rows per block using that big data field, and my ALL_OBJECTS table has over 70,000 rows. Next, we’ll create the small table the many little transactions will modify:
SQL> create table small ( x int, y char(500) );
Table created.
SQL> insert into small select rownum, 'x' from all_users;
25 rows created.
SQL> commit;
Commit complete.
SQL> exec dbms_stats.gather_table_stats( user, 'SMALL' );
PL/SQL procedure successfully completed.
Now, we’ll dirty up that big table. We have a very small undo tablespace, so we’ll want to update as many blocks of this big table as possible, all while generating the least amount of undo possible. We’ll use a fancy UPDATE statement to do that. Basically, the following subquery is finding the “first” rowid of a row on every block. That subquery will return a rowid for every database block identifying a single row on it. We’ll update that row, setting a VARCHAR2(1) field. This will let us update all of the blocks in the table (some 8000 plus in the example), flooding the buffer cache with dirty blocks that will have to be written out. We’ll make sure we are using that small undo tablespace as well. To accomplish this and not exceed the capacity of our undo tablespace, we’ll craft an UPDATE statement that will update just the “first row” on each block. The ROW_NUMBER() built-in analytic function is instrumental in this operation; it assigns the number 1 to the “first row” by database block in the table, which would be the single row on the block we would update:
SQL> alter system set undo_tablespace = undo_small;
System altered.
SQL> update big
        set temporary = temporary
      where rowid in
     (
     select r
       from (
     select rowid r, row_number() over
            (partition by dbms_rowid.rowid_block_number(rowid) order by rowid) rn
       from big
           )
     where rn = 1
    );
3064 rows updated.
SQL> commit;
Commit complete.
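As a quick, hedged sanity check (querying the dictionary does not touch the blocks of BIG itself, so it will not clean anything out), the block count recorded when we gathered statistics should be in the same ballpark as the row count the UPDATE just reported, since we updated exactly one row per block:
SQL> select blocks from user_tables where table_name = 'BIG';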
OK, so now we know that we have lots of dirty blocks on disk. We definitely wrote some of them out, because we just didn’t have the room to hold them all. Next, we will open a cursor, but it won’t yet fetch a single row. Remember, when we open the cursor, the resultset is preordained, so even though Oracle did not actually process a row of data, the act of opening that resultset fixed the point in time the results must be “as of.” Now since we’ll be fetching the data we just updated and committed, and we know no one else is modifying the data, we should be able to retrieve the rows without needing any undo at all. But that’s where the delayed block cleanout rears its head. The transaction that modified these blocks is so new that Oracle will be obliged to verify that it committed before we begin, and if we overwrite that information (also stored in the undo tablespace), the query will fail. So, here is the opening of the cursor:
SQL> variable x refcursor
SQL> exec open :x for select * from big where object_id < 100;
PL/SQL procedure successfully completed.
SQL> !./run.sh
The run.sh file is a shell script; it simply fired off several SQL*Plus sessions in the background, using commands like these:
$ORACLE_HOME/bin/sqlplus eoda/foo@localhost:1521/PDB1 @test2 1  &
$ORACLE_HOME/bin/sqlplus eoda/foo@localhost:1521/PDB1 @test2 2  &
$ORACLE_HOME/bin/sqlplus eoda/foo@localhost:1521/PDB1 @test2 3  &
...
In the prior code, each SQL*Plus session was passed a different number (the first was passed 1, the second 2, the third 3, and so on). Before running the script, be sure to replace the username and password with ones valid for your environment. The test2.sql script that each session ran is as follows:
begin
    for i in 1 .. 5000
    loop
        update small set y = i where x= &1;
        commit;
    end loop;
end;
/
exit
So, we had several sessions, each in a tight loop, initiating many small transactions. The run.sh script waited for all of the SQL*Plus sessions to complete their work, and then we returned to our session, the one with the open cursor. Upon attempting to print it out, we observed the following:
SQL> print x
ORA-01555: snapshot too old: rollback segment number 44 with name
"_SYSSMU44_3913812538$" too small
no rows selected
As I said, the preceding is a rare case. It took a lot of conditions, all of which had to exist simultaneously, to produce it. We needed blocks in need of a cleanout to exist, and such blocks are scarce. A DBMS_STATS call to collect statistics gets rid of them, so the most common causes of such blocks (large mass updates and bulk loads) should not be a concern, since the tables need to be analyzed after such operations anyway. Most transactions tend to touch less than ten percent of the blocks in the buffer cache; hence, they do not generate blocks that need to be cleaned out. If you believe you've encountered this issue, in which a SELECT against a table that has no other DML applied to it is raising the ORA-01555 error, try the following solutions:
  • Ensure you are using “right-sized” transactions in the first place. Make sure you are not committing more frequently than you should (see the sketch after this list for a batched variant of test2.sql).

  • Use DBMS_STATS to scan the related objects, cleaning them out after the load. Since the block cleanout is the result of a very large mass UPDATE or INSERT, this needs to be done anyway.

  • Allow the undo tablespace to grow by giving it the room to extend and increasing the undo retention. This decreases the likelihood of an undo segment transaction table slot being overwritten during the course of your long-running query. This is the same as the solution for the other cause of an ORA-01555 error (the two are very much related; you experience undo segment reuse during the processing of your query). In fact, I reran the preceding example with the undo tablespace set to autoextend 1MB at a time, with an undo retention of 900 seconds. The query against the table BIG completed successfully.

  • Reduce the runtime of your query—tune it. This is always good if possible, so it might be the first thing you try.
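Purely as an illustration of the first bullet, here is a hedged sketch of a “right-sized” variant of the earlier test2.sql, committing once after the loop rather than after every UPDATE (appropriate only when the whole loop really is one logical unit of work):
begin
    for i in 1 .. 5000
    loop
        update small set y = i where x = &1;
    end loop;
    commit;  -- one commit for the whole unit of work, not 5,000
end;
/
exit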

One last comment: in the database where you ran the prior experiment, don't forget to set your undo tablespace back to the original one and drop the small undo tablespace, for example:
SQL> alter system set undo_tablespace=undotbs2;
SQL> drop tablespace undo_small including contents and datafiles;

That way you ensure you’re not running your database with an extremely small undo tablespace.
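A quick hedged way to confirm which undo tablespace the instance is now using:
SQL> show parameter undo_tablespace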

Summary

In this chapter, we investigated how to measure redo. We also looked at the impact that NOLOGGING has on redo generation. When used in combination with direct path operations (e.g., direct path insert), the generation of redo can be reduced dramatically. However, for regular DML statements, the NOLOGGING clause has no effect.

We explored the reasons why a log switch might be delayed. Oracle won’t allow a redo log to be overwritten in the event that DBWn has not yet finished checkpointing the data protected by the redo log or ARCn has not finished copying the redo log file to the archive destination. This is mainly a problem for the DBA to detect (inspect the alert.log) and manage.

We discussed how redo is handled with transactions that occur in temporary tables. In 12c and above, the amount of redo can be reduced to nearly nothing. For applications that use temporary tables, this can have a positive impact on performance.

In this chapter, we also investigated which statements generate the least and most undo. In general, an INSERT generates the least amount, an UPDATE generates more than INSERT, and a DELETE generates the most undo.

Lastly, we explored the causes of the infamous ORA-01555 error (snapshot too old). This error can occur because the undo tablespace has been sized too small; the DBA must ensure the undo tablespace is large enough to all but eliminate that as a cause. We also looked at how delayed block cleanout can cause the error. If you've correctly sized your transactions and your undo tablespace, you will rarely run into it. Tuning the query that throws the ORA-01555 error should always be one of the first methods employed to resolve the issue.
