Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

D. Kuhn, T. KyteExpert Oracle Database Architecturehttps://doi.org/10.1007/978-1-4842-7499-6_1

1. Developing Successful Oracle Applications

Darl Kuhn¹ and Thomas Kyte²

(1)

Morrison, CO, USA

(2)

Denver, CO, USA

I spend the bulk of my time working with Oracle database software and, more to the point, with people who use this software. Over the last dozen or more years, I’ve worked on many projects—successful ones as well as complete failures—and if I were to encapsulate my experiences into a few broad statements, here’s what they would be:

An application built around the database—dependent on the database—will succeed or fail based on how it uses the database. As a corollary to this, all applications are built around databases; I can’t think of a single useful application that doesn’t store data persistently somewhere.
Applications come, applications go. The data, however, lives forever. It is not about building applications; it really is about the data underneath these applications.
A development team needs at its heart a core of database-savvy coders who are responsible for ensuring the database logic is sound and the system is built to perform from day one. Tuning after the fact—tuning after deployment—means you did not build it that way.

These may seem like surprisingly obvious statements, but in my experience, too many people approach the database as if it were a black box—something that they don’t need to know about. Maybe they have a structured query language (SQL) generator that will save them from the hardship of having to learn SQL.

Note

When I mention “SQL” in this book, this is in reference to a structured query language tool, such as Oracle’s SQL*Plus utility. I point this out because sometimes Microsoft’s SQL Server is also referred to as “SQL,” especially in shops where Microsoft tools are widely used.

Maybe they figure they’ll just use it like a flat file and do “keyed reads.” Whatever they assume, I can tell you that thinking along these lines is most certainly misguided; you simply can’t get away with not understanding the database. This chapter will discuss why you need to know about the database, specifically why you need to understand

The database architecture, how it works, and what it looks like
What concurrency controls are and what they mean to you
How to tune your application from day one
How some things are implemented in the database, which is not necessarily the same as how you think they should be implemented
What features your database already provides and why it is generally better to use a provided feature than to build your own
Why you might want more than a cursory knowledge of SQL
That the DBA and developer staff are on the same team, not enemy camps trying to outsmart each other at every turn

Now this may seem like a long list of things to learn before you start, but consider this analogy for a second: if you were developing a highly scalable, enterprise application on a brand-new operating system (OS), what is the first thing you’d do? Hopefully you answered, “Find out how this new OS works, how things will run on it, and so on.” If that wasn’t your answer, you’d most likely fail.

Consider, for example, Windows vs. UNIX/Linux. If you are a long-time Windows programmer and were asked to develop a new application on the UNIX/Linux platform, you’d have to relearn a couple of things. Memory management is done differently. Building a server process is considerably different—under Windows, you would develop a single process, a single executable with many threads. Under UNIX/Linux, you wouldn’t develop a single stand-alone executable; you’d have many processes working together. It is true that both Windows and UNIX/Linux are operating systems. They both provide many of the same services to developers—file management, memory management, process management, security, and so on. However, they are very different architecturally—much of what you learned in the Windows environment won’t apply to UNIX/Linux (and vice versa, to be fair). You have to unlearn to be successful. The same is true of your database environment.

What is true of applications running natively on operating systems is true of applications that will run on a database: understanding that database is crucial to your success. If you don’t understand what your particular database does or how it does it, your application will fail. If you assume that because your application ran fine on SQL Server, it will necessarily run fine on Oracle, again your application is likely to fail. And, to be fair, the opposite is true—a scalable, well-developed Oracle application will not necessarily run on SQL Server without major architectural changes. Just as Windows and UNIX/Linux are both operating systems but fundamentally different, Oracle and SQL Server (pretty much any database could be noted here) are both databases but fundamentally different.

My Approach

Before we begin, I feel it is only fair that you understand my approach to development. I tend to take a database-centric approach to problems. If I can do it in the database, I will. There are a couple of reasons for this—the first and foremost being that I know that if I build functionality in the database, I can deploy it anywhere. I am not aware of a popular, commercially viable server operating system on which Oracle is not available—from Windows to dozens of UNIX/Linux systems—the same exact Oracle software and options are available. I frequently build and test solutions on my laptop, running Oracle 21c, 19c, and 12c under UNIX/Linux or Windows on a virtual machine. I can then deploy them on a variety of servers running the same database software but different operating systems. When I have to implement a feature outside of the database, I find it extremely hard to deploy that feature anywhere I want. One of the main features that makes the Java language appealing to many people—the fact that their programs are always compiled in the same virtual environment, the Java virtual machine (JVM), and so are highly portable—is the exact same feature that makes the database appealing to me. The database is my virtual machine. It is my virtual operating system.

So I try to do everything I can in the database. If my requirements go beyond what the database environment can offer, I do it in Java outside of the database. In this way, almost every operating system intricacy will be hidden from me. I still have to understand how my “virtual machines” work (Oracle, and occasionally a JVM)—you need to know the tools you are using—but they, in turn, worry about how best to do things on a given OS for me.

Thus, simply knowing the intricacies of this one “virtual OS” allows you to build applications that will perform and scale well on many operating systems. I don’t mean to imply that you can be totally ignorant of your underlying OS, just that as a software developer building database applications you can be fairly well insulated from it, and you will not have to deal with many of its nuances. Your DBA, responsible for running the Oracle software, will be infinitely more in tune with the OS (if they are not, please get a new DBA!). If you develop client-server software and the bulk of your code is outside of the database and outside of a VM (Java virtual machines being perhaps the most popular VM), of course you’ll have to be concerned about your OS once again.

I have a pretty simple mantra when it comes to developing database software, one that has been consistent for many years:

You should do it in a single SQL statement if at all possible. And believe it or not, it is almost always possible. This statement is even truer as time goes on. SQL is an extremely powerful language.
If you can’t do it in a single SQL statement, do it in PL/SQL—as little PL/SQL as possible! Follow the saying that goes “more code = more bugs, less code = less bugs.”
If you can’t do it in PL/SQL, do it in a C external procedure. This is most frequently the approach when raw speed or using a third-party API written in C is needed.
If you can’t do it in a C external routine, you might want to seriously think about why it is you need to do it.

Throughout this book, you will see the preceding philosophy implemented. We’ll use PL/SQL—and object types in PL/SQL—to do things that SQL itself can’t do or can’t do efficiently. PL/SQL has been around for a very long time—over 34 years of tuning (as of 2022) has gone into it; in fact, way back in Oracle 10g, the PL/SQL compiler itself was rewritten to be an optimizing compiler for the first time. You’ll find no other language so tightly coupled with SQL, nor any as optimized to interact with SQL. Working with SQL in PL/SQL is a very natural thing—whereas in virtually every other language from Visual Basic to Java, using SQL can feel cumbersome. It never quite feels “natural”—it’s not an extension of the language itself. When PL/SQL runs out of steam—which is exceedingly rare today with current database releases—we’ll use Java. Occasionally, we’ll do something in C, but typically only when C is the only choice, or when the raw speed offered by C is required. Often, this last reason goes away with native compilation of Java—the ability to convert your Java bytecode into operating system–specific object code on your platform. This lets Java run just as fast as C in many cases.

The Black Box Approach

I have an idea, borne out by first-hand personal experience (meaning I made the mistake myself), as to why database-backed software development efforts so frequently fail. Let me be clear that I’m including here those projects that may not be documented as failures, but nevertheless take much longer to roll out and deploy than originally planned because of the need to perform a major rewrite, re-architecture, or tuning effort. Personally, I call such delayed projects failures: more often than not they could have been completed on schedule (or even faster).

The single most common reason for failure is a lack of practical knowledge of the database—a basic lack of understanding of the fundamental tool that is being used. The black box approach involves a conscious decision to protect the developers from the database. They are actually encouraged not to learn anything about it! In many cases, they are prevented from exploiting it. The reasons for this approach appear to be FUD related (Fear, Uncertainty, and Doubt). Developers have heard that databases are “hard,” that SQL, transactions, and data integrity are “hard.” The solution: don’t make anyone do anything hard. They treat the database as a black box and have some software tool generate all of the code. They try to insulate themselves with many layers of protection so that they don’t have to touch this “hard” database.

This is an approach to database development that I’ve never been able to understand, in part because, for me, learning Java and C was a lot harder than learning the concepts behind the database. I’m now pretty good at Java and C, but it took a lot more hands-on experience for me to become competent using them than it did to become competent using the database. With the database, you need to be aware of how it works, but you don’t have to know everything inside and out. When programming in C or Java/J2EE, you do need to know everything inside and out—and these are huge languages.

If you are building a database application, the most important piece of software is the database. A successful development team will appreciate this and will want its people to know about it, to concentrate on it. Many times I’ve walked into a project where almost the opposite was true. A typical scenario would be as follows:

The developers were fully trained in the GUI tool or the language they were using to build the front end (such as Java). In many cases, they had had weeks if not months of training in it.
The team had zero hours of Oracle training and zero hours of Oracle experience. Most had no database experience whatsoever. They would also have a mandate to be “database independent”—a mandate (edict from management or learned through theoretical academic instruction) they couldn’t hope to follow for many reasons. The most obvious one is they didn’t know enough about what databases are or what they do to even find the lowest common denominator among them.
The developers encountered massive performance problems, data integrity problems, hanging issues, and the like (but very pretty screens).

As a result of the inevitable performance problems, I now get called in to help solve the difficulties (in the past, as a learning developer I was sometimes the cause of such issues).

Note

Even today, I often find that the developers of database applications have spent no time reading the documentation. On my website, asktom.oracle.com, I frequently get questions along the lines of “what is the syntax for…” coupled with “we don’t have the documentation so please just tell us.” I refuse to directly answer many of those questions, but rather point them to the online documentation freely available to anyone, anywhere in the world. In the last 15 years, the excuses like “We don’t have documentation,” or “We don’t have access to resources,” have disappeared. Sites like www.oracle.com/technical-resources (Oracle Technical Resources, formerly known as the Oracle Technology Network) make it inexcusable to not have a full set of documentation at your fingertips! Today, everyone has access to all of the documentation; they just have to read it or—even easier—Google it.

The very idea that developers building a database application should be shielded from the database is amazing to me, but that attitude persists. Many people still insist that developers can’t take the time to get trained in the database and, basically, that they shouldn’t have to know anything about the database. Why? Well, more than once I’ve heard “but Oracle is the most scalable database in the world, my people don’t have to learn about it, it’ll just work.” That’s true; Oracle is the most scalable database in the world. However, I can write bad code that does not scale in Oracle as easily—if not more easily—as I can write good, scalable code in Oracle. You can replace Oracle with any piece of software and the same is true. This is a fact: it is easier to write applications that perform poorly than it is to write applications that perform well. It is sometimes too easy to build a single-user system in the world’s most scalable database if you don’t know what you are doing. The database is a tool, and the improper use of any tool can lead to disaster. Would you take a nutcracker and smash walnuts with it as if it were a hammer? You could, but it wouldn’t be a proper use of that tool and the result would be a mess (and probably some seriously hurt fingers). Similar effects can be achieved by remaining ignorant of your database.

I was called into a project that was in trouble. The developers were experiencing massive performance issues—it seemed their system was serializing many transactions, that is to say—so instead of many people working concurrently, everyone was getting into a really long line and waiting for everyone in front of them to complete. The application architects walked me through the architecture of their system—the classic three-tier approach. They would have a web browser talk to a middle tier application server running Java Server Pages (JSPs). The JSPs would in turn utilize another layer—Enterprise JavaBeans (EJBs)—that did all of the SQL. The SQL in the EJBs was generated by a third-party tool and was done in a database-independent fashion.

Now, in this system it was very hard to diagnose anything, as none of the code was instrumented or traceable. Instrumenting code is the fine art of making every other line of developed code be debug code of some sort—so when you are faced with performance or capacity or even logic issues, you can track down exactly where the problem is. In this case, we could only locate the problem somewhere between the browser and the database—in other words, the entire system was suspect. The Oracle database is heavily instrumented, but the application needs to be able to turn the instrumentation on and off at appropriate points—something it was not designed to do.

So, we were faced with trying to diagnose a performance issue with not too many details, just what we could glean from the database itself. Fortunately, in this case it was fairly easy. When someone who knew the Oracle V$ tables (the V$ tables are one way Oracle exposes its instrumentation, its statistics, to us) reviewed them, it became apparent that the major contention was around a single table—a queue table of sorts. The application would place records into this table, while another set of processes would pull the records out of this table and process them. Digging deeper, we found a bitmap index on a column in this table (See Chapter 11 on indexing for more information about bitmapped indexes). The reasoning was that this column, the processed-flag column, had only two values—Y and N. As records were inserted, they would have a value of N for not processed. As the other processes read and processed the record, they would update the N to Y to indicate that processing was done. The developers needed to find the N records rapidly and hence knew they wanted to index that column. They had read somewhere that bitmap indexes are for low-cardinality columns—columns that have but a few distinct values—so it seemed a natural fit. (Go ahead, use Google to search for when to use bitmap indexes; low cardinality will be there over and over. Fortunately, there are also many articles refuting that too simple concept today.)

But that bitmap index was the cause of all of their problems. In a bitmap index, a single key entry points to many rows, hundreds or more of them. If you update a bitmap index key (and thus locking it), the hundreds of records that key points to are effectively locked as well. So, someone inserting the new record with N would lock the N record in the bitmap index, effectively locking hundreds of other N records as well. Meanwhile, the process trying to read this table and process the records would be prevented from modifying some N record to be a Y (processed) record, because in order for it to update this column from N to Y, it would need to lock that same bitmap index key. In fact, other sessions just trying to insert a new record into this table would be blocked as well, as they would be attempting to lock the same bitmap key entry. In short, the developers had created a table that at most one person would be able to insert or update against at a time! We can see this easily using a simple scenario.

Note

If you haven’t done so already, visit the “Setting Up Your Environment” section of the front matter of this book. This section contains the code to create the EODA and SCOTT users. These users are used extensively in the examples in this book. The “Setting Up Your Environment” section also contains the source code for many of the utilities used throughout this book. For your convenience, the setup source code can also be downloaded/cloned from the GitHub site.

Here, I will use an autonomous transaction in the database to have two concurrent transactions in a single session. An autonomous transaction starts a “subtransaction” separate and distinct from any already established transaction in the session. The autonomous transaction behaves as if it were in an entirely different session—for all intents and purposes, the parent transaction is suspended. The autonomous transaction can be blocked by the parent transaction (as we’ll see), and, further, the autonomous transaction can’t see uncommitted modifications made by the parent transaction. For example, connecting to my pluggable database PDB1:

$ sqlplus eoda/foo@PDB1

SQL> create table t(processed_flag varchar2(1));

Table created.

SQL> create bitmap index t_idx on t(processed_flag);

Index created.

SQL> insert into t values ( 'N' );

1 row created.

SQL> decl are

pragma autonomous_transaction;

begin

insert into t values ( 'N' );

commit;

end;

declare

ERROR at line 1:

ORA-00060: deadlock detected while waiting for resource

ORA-06512: at line 4

Note I will use autonomous transactions throughout this book to demonstrate locking, blocking, and concurrency issues. It is my firm belief that autonomous transactions are a feature that Oracle should not have exposed to developers—for the simple reason that most developers do not know when and how to use them properly. The improper use of an autonomous transaction can and will lead to logical data integrity corruption issues. Beyond using them as a demonstration tool, autonomous transactions have exactly one other use—as an error-logging mechanism. If you wish to log an error in an exception block, you need to log that error into a table and commit it—without committing anything else. That would be a valid use of an autonomous transaction. If you find yourself using an autonomous transaction outside the scope of logging an error or demonstrating a concept, you are almost surely doing something very wrong.

Since I used an autonomous transaction and created a subtransaction, I received a deadlock—meaning my second insert was blocked by my first insert. Had I used two separate sessions, no deadlock would have occurred. Instead, the second insert would have been blocked and waited for the first transaction to commit or roll back. This symptom is exactly what the project in question was facing—the blocking, serialization issue.

So we had an issue whereby not understanding the database feature (bitmap indexes) and how it worked doomed the database to poor scalability from the start. To further compound the problem, there was no reason for the queuing code to ever have been written. The Oracle database has built-in queuing capabilities. This built-in queuing feature gives you the ability to have many producers (the sessions that insert the N, the unprocessed records) concurrently put messages into an inbound queue and have many consumers (the sessions that look for N records to process) concurrently receive these messages. That is, no special code should have been written in order to implement a queue in the database. The developers should have used the built-in feature. And they might have, except they were completely unaware of it.

Fortunately, once this issue was discovered, correcting the problem was easy. We did need an index on the processed-flag column, just not a bitmap index. We needed a conventional B*Tree index. It took a bit of convincing to get one created. No one wanted to believe that conventionally indexing a column with two distinct values was a good idea. But after setting up a simulation (I am very much into simulations, testing, and experimenting), we were able to prove it was not only the correct approach but also that it would work very nicely.

Note

We create indexes, indexes of any type, typically to find a small number of rows in a large set of data. In this case, the number of rows we wanted to find via an index was one. We needed to find one unprocessed record. One is a very small number of rows; therefore, an index is appropriate. An index of any type would be appropriate. The B*Tree index was very useful in finding a single record out of a large set of records.

When we created the index, we had to choose between the following approaches:

Just create an index on the processed-flag column.
Create an index only on the processed-flag column when the processed flag is N, that is, only index the values of interest. We typically don’t want to use an index when the processed flag is Y since the vast majority of the records in the table have the value Y. Notice that I did not say “We never want to use….” You might want to very frequently count the number of processed records for some reason, and then an index on the processed records might well come in very handy.

In Chapter 11 on indexing, we’ll go into more detail on both types. In the end, we created a very small index on just the records where the processed flag was N. Access to those records was extremely fast, and the vast majority of Y records did not contribute to this index at all. We used a function-based index on a function decode( processed_flag, 'N', 'N' ) to return either N or NULL—since an entirely NULL key is not placed into a conventional B*Tree index, we ended up only indexing the N records.

Note

There is more information on NULLs and indexing in Chapter 11.

Was that the end of the story? No, not at all. My client still had a less than optimal solution on its hands. They still had to serialize on the “dequeue” of an unprocessed record. We could easily find the first unprocessed record—quickly—using select * from queue_table where decode( processed_flag, 'N', 'N') = 'N' FOR UPDATE, but only one session at a time could perform that operation. The project was using Oracle 10g and therefore could not yet make use of the relatively new SKIP LOCKED feature added in Oracle 11g. SKIP LOCKED would permit many sessions to concurrently find the first unlocked, unprocessed record, lock that record, and process it. Instead, we had to implement code to find the first unlocked record and lock it manually. Such code would generally look like the following in Oracle 10g and before. We begin by creating a table with the requisite index described earlier and populate it with some data, as follows:

SQL> drop table t purge;

SQL> create table t ( id number primary key,

processed_flag varchar2(1),

payload varchar2(20));

Table created.

SQL> create index t_idx on t( decode( processed_flag, 'N', 'N' ) );

Index created.

SQL> insert into t

select r,

case when mod(r,2) = 0 then 'N' else 'Y' end,

'payload ' || r

from (select level r

from dual

connect by level <= 5);

5 rows created.

SQL> select * from t;

ID P PAYLOAD

---------- - --------------------

1 Y payload 1

2 N payload 2

3 Y payload 3

4 N payload 4

5 Y payload 5

Then we basically need to find any and all unprocessed records. One by one we ask the database “Is this row locked already? If not, then lock it and give it to me.” That code would look like this:

SQL> create or replace

function get_first_unlocked_row

return t%rowtype

resource_busy exception;

pragma exception_init( resource_busy, -54 );

l_rec t%rowtype;

begin

for x in ( select rowid rid

from t

where decode(processed_flag,'N','N') = 'N')

loop

begin

select * into l_rec

from t

where rowid = x.rid and processed_flag='N'

for update nowait;

return l_rec;

exception

when resource_busy then null;

when no_data_found then null;

end;

end loop;

return null;

end;

Function created.

Note

In the preceding code, I ran some DDL—the CREATE OR REPLACE FUNCTION. Right before DDL runs, it automatically commits, so there was an implicit COMMIT in there. The rows we’ve inserted are committed in the database—and that fact is necessary for the following examples to work correctly. In general, I’ll use that fact in the remainder of the book. If you run these examples without performing the CREATE OR REPLACE, make sure to COMMIT first!

Now, if we use two different transactions, we can see that both get different records. We also see that both get different records concurrently (using autonomous transactions once again to demonstrate the concurrency issues):

SQL> set serverout on

SQL> declare

l_rec t%rowtype;

begin

l_rec := get_first_unlocked_row;

dbms_output.put_line( 'I got row ' || l_rec.id || ', ' || l_rec.payload );

end;

I got row 2, payload 2

PL/SQL procedure successfully completed.

SQL> declare

pragma autonomous_transaction;

l_rec t%rowtype;

begin

l_rec := get_first_unlocked_row;

dbms_output.put_line( 'I got row ' || l_rec.id || ', ' || l_rec.payload );

commit;

end;

I got row 4, payload 4

PL/SQL procedure successfully completed.

Now, in Oracle 11g and above, we can achieve the preceding logic using the SKIP LOCKED clause. In the following example, we’ll do two concurrent transactions again, observing that they each find and lock separate records concurrently:

SQL> declare

l_rec t%rowtype;

cursor c

select *

from t

where decode(processed_flag,'N','N') = 'N'

FOR UPDATE

SKIP LOCKED;

begin

open c;

fetch c into l_rec;

if ( c%found )

then

dbms_output.put_line( 'I got row ' || l_rec.id || ', ' || l_rec.payload );

end if;

close c;

end;

I got row 2, payload 2

PL/SQL procedure successfully completed.

SQL> declare

pragma autonomous_transaction;

l_rec t%rowtype;

cursor c

select *

from t

where decode(processed_flag,'N','N') = 'N'

FOR UPDATE

SKIP LOCKED;

begin

open c;

fetch c into l_rec;

if ( c%found )

then

dbms_output.put_line( 'I got row ' || l_rec.id || ', ' || l_rec.payload );

end if;

close c;

commit;

end;

I got row 4, payload 4

PL/SQL procedure successfully completed.

Both of the preceding “solutions” would help to solve the second serialization problem my client was having when processing messages. But how much easier would the solution have been if my client had just used Advanced Queuing and invoked DBMS_AQ.DEQUEUE? To fix the serialization issue for the message producer, we had to implement a function-based index. To fix the serialization issue for the consumer, we had to use that function-based index to retrieve the records and write code. So we fixed their major problem, caused by not fully understanding the tools they were using and found only after lots of looking and study since the system was not nicely instrumented. What we hadn’t fixed yet were the following issues:

The application was built without a single consideration for scaling at the database level.
The application was performing functionality (the queue table) that the database already supplied in a highly concurrent and scalable fashion. I’m referring to the Advance Queuing (AQ) software that is burned into the database, functionality they were trying to reinvent.
Experience shows that 80 to 90 percent (or more!) of all tuning should be done at the application level (typically the interface code reading and writing to the database), not at the database level.
The developers had no idea what the beans did in the database or where to look for potential problems.

This was hardly the end of the problems on this project. We also had to figure out the following:

How to tune SQL without changing the SQL. In general, that is very hard to do. We can accomplish this magic feat to some degree with SQL Profiles (this option requires a license for the Oracle Tuning Pack), with extended statistics, and with adaptive query optimization. But inefficient SQL will remain inefficient SQL.
How to measure performance.
How to see where the bottlenecks were.
How and what to index. And so on.

At the end of the week, the developers, who had been insulated from the database, were amazed at what the database could actually provide for them and how easy it was to get that information. Most importantly, they saw how big of a difference taking advantage of database features could make to the performance of their application. In the end, they were successful—just behind schedule by a couple of weeks.

My point about the power of database features is not a criticism of tools or technologies like Hibernate, EJBs, and container-managed persistence. It is a criticism of purposely remaining ignorant of the database and how it works and how to use it. The technologies used in this case worked well—after the developers got some insight into the database itself.

The bottom line is that the database is typically the cornerstone of your application. If it does not work well, nothing else really matters. If you have a black box and it does not work, what are you going to do about it? About the only thing you can do is look at it and wonder why it is not working very well. You can’t fix it; you can’t tune it. Quite simply, you do not understand how it works—and you made the decision to be in this position. The alternative is the approach that I advocate: understand your database, know how it works, know what it can do for you, and use it to its fullest potential.

How (and How Not) to Develop Database Applications

That’s enough hypothesizing, for now at least. In the remainder of this chapter, I will take a more empirical approach, discussing why knowledge of the database and its workings will definitely go a long way toward a successful implementation (without having to write the application twice!). Some problems are simple to fix as long as you understand how to find them. Others require drastic rewrites. One of the goals of this book is to help you avoid the problems in the first place.

Note

In the following sections, I will discuss certain core Oracle features without delving into exactly what these features are and all of the ramifications of using them. I will refer you either to a subsequent chapter in this book or to the relevant Oracle documentation for more information.

Understanding Oracle Architecture

I have worked with many customers running large production applications—applications that had been “ported” from another database (e.g., SQL Server) to Oracle. I quote “ported” simply because most ports I see reflect a “what is the least change we can make to have our SQL Server code compile and execute on Oracle” perspective. The applications that result from that line of thought are frankly the ones I see most often, because they are the ones that need the most help. I want to make clear, however, that I am not bashing SQL Server in this respect—the opposite is true! Taking an Oracle application and just plopping it down on top of SQL Server with as few changes as possible results in the same poorly performing code in reverse; the problem goes both ways.

In one particular case, however, the SQL Server architecture and how you use SQL Server really impacted the Oracle implementation. The stated goal was to scale up, but these folks did not want to really port to another database. They wanted to port with as little work as humanly possible, so they kept the architecture basically the same in the client and database layers. This decision had two important ramifications:

The connection architecture was the same in Oracle as it had been in SQL Server.
The developers used literal (nonbound) SQL.

These two ramifications resulted in a system that could not support the required user load (the database server simply ran out of available memory) and in a system that had abysmal performance.

Use a Single Connection in Oracle

Now, in SQL Server it is a very common practice to open a connection to the database for each concurrent statement you want to execute. If you are going to do five queries, you might well see five connections in SQL Server. In Oracle, on the other hand, if you want to do 5 queries or 500, the maximum number of connections you want to open is one. So, a practice that is common in SQL Server is something that is not only not encouraged in Oracle, it is actively discouraged; having multiple connections to the database (when you can use just one) is just something you don’t want to do.

But do it they did. A simple web-based application would open 5, 10, 15, or more connections per web page, meaning that their server could support only 1/5, 1/10, or 1/15 the number of concurrent users that it should have been able to. My recommendation to them was to re-architect the application to allow it to take advantage of the connection to generate a page, not somewhere between 5 and 15 connections. This is the only solution that would actually solve the problem. As you can imagine, this is not an “OK, we’ll do that this afternoon” sort of solution. It is a nontrivial solution to a problem that could have most easily been corrected during the database port phase while you were in the code poking around and changing things in the first place. Furthermore, a simple test to scale before rolling out to production would have caught such issues prior to the end users feeling the pain.

Use Bind Variables

If I were to write a book about how to build nonscalable Oracle applications, “Don’t Use Bind Variables” would be the first and last chapter. Not using bind variables is a major cause of performance issues and a major inhibitor of scalability—not to mention a security risk of huge proportions. The way the Oracle shared pool (a very important shared memory data structure) operates is predicated on developers using bind variables in most cases. If you want to make a transactional Oracle implementation run slowly, even grind to a total halt, just refuse to use them.

A bind variable is a placeholder in a query. For example, to retrieve the record for employee 123, I can query

SQL> select * from emp where empno = 123;

Alternatively, I can query

SQL> select * from emp where empno = :empno;

In a typical system, you would query up employee 123 maybe once or twice and then never again for a long period of time. Later, you would query up employee 456, then 789, and so on. Or, foregoing SELECT statements, if you do not use bind variables in your insert statements, your primary key values will be hard-coded in them, and I know for a fact that these insert statements can’t ever be reused later!!! If you use literals (constants) in the query, then every query is a brand-new query, never before seen by the database. It will have to be parsed, qualified (names resolved), security-checked, optimized, and so on. In short, each and every unique statement you execute will have to be compiled every time it is executed.

The second query uses a bind variable, :empno, the value of which is supplied at query execution time. This query is compiled once, and then the query plan is stored in a shared pool (the library cache), from which it can be retrieved and reused. The difference between the two in terms of performance and scalability is huge, dramatic even.

From the preceding description, it should be fairly obvious that parsing unique statements with hard-coded variables (called a hard parse) will take longer and consume many more resources than reusing an already parsed query plan (called a soft parse). What may not be so obvious is the extent to which the former will reduce the number of users your system can support. Obviously, this is due in part to the increased resource consumption, but an even more significant factor arises due to the latching mechanisms for the library cache. When you hard parse a query, the database will spend more time holding certain low-level serialization devices called latches (see Chapter 6 for more details). These latches protect the data structures in Oracle’s shared memory from concurrent modifications by two sessions (otherwise, Oracle would end up with corrupt data structures) and from someone reading a data structure while it is being modified. The longer and more frequently you have to latch these data structures, the longer the queue to get these latches will become. You will start to monopolize scarce resources. Your machine may appear to be underutilized at times, and yet everything in the database is running very slowly. The likelihood is that someone is holding one of these serialization mechanisms and a line is forming—you are not able to run at top speed. It only takes one ill-behaved application in your database to dramatically affect the performance of every other application. A single, small application that does not use bind variables will cause the relevant SQL of other well-tuned applications to get discarded from the shared pool over time. You only need one bad apple to spoil the entire barrel.

If you use bind variables, then everyone who submits the same exact query that references the same object will use the compiled plan from the pool. You will compile your subroutine once and use it over and over again. This is very efficient and is the way the database intends you to work. Not only will you use fewer resources (a soft parse is much less resource-intensive), but also you will hold latches for less time and need them less frequently. This increases your performance and greatly increases your scalability.

Just to give you a tiny idea of how huge a difference this can make performance-wise, you only need to run a very small test. In this test, we’ll just be inserting some rows into a table; the simple table we will use is

SQL> drop table t purge;

SQL> create table t ( x int );

Table created.

Now we’ll create two very simple stored procedures. They both will insert the numbers 1 through 10,000 into this table; however, the first procedure uses a single SQL statement with a bind variable:

SQL> create or replace procedure proc1

begin

for i in 1 .. 10000

loop

execute immediate

'insert into t values ( :x )' using i;

end loop;

end;

Procedure created.

The second procedure constructs a unique SQL statement for each row to be inserted:

SQL> create or replace procedure proc2

begin

for i in 1 .. 10000

loop

execute immediate

'insert into t values ( '||i||')';

end loop;

end;

Procedure created.

Now, the only difference between the two is that one uses a bind variable and the other does not. Both are using dynamic SQL and the logic is otherwise identical. The only difference is the use of a bind variable in the first.

Note

For details on runstats and other utilities, see the “Setting Up Your Environment” section at the beginning of this book. You may not observe exactly the same values for CPU or any metric. Differences are caused by different Oracle versions, different operating systems, and different hardware platforms. The idea will be the same, but the exact numbers will undoubtedly be marginally different.

We are ready to evaluate the two approaches, and we’ll use runstats, a simple tool I’ve developed, to compare the two in detail:

SQL> set serverout on

SQL> exec runstats_pkg.rs_start

PL/SQL procedure successfully completed.

SQL> exec proc1

PL/SQL procedure successfully completed.

SQL> exec runstats_pkg.rs_middle

PL/SQL procedure successfully completed.

SQL> exec proc2

PL/SQL procedure successfully completed.

SQL> exec runstats_pkg.rs_stop(9500)

Run1 ran in 20 cpu hsecs

Run2 ran in 709 cpu hsecs

run 1 ran in 2.82% of the time

Now, the preceding result clearly shows that based on CPU time (measured in hundredths of seconds), it took significantly longer and significantly more resources to insert 10,000 rows without bind variables than it did with them. In fact, it took more than a magnitude more CPU time to insert the rows without bind variables. For every insert without bind variables, we spent the vast preponderance of the time to execute the statement simply parsing the statement! But it gets worse. When we look at other information, we can see a significant difference in the resources utilized by each approach:

Name Run1 Run2 Diff

LATCH.KJCT flow control latch 202 9,909 9,707

LATCH.object stats modificatio 9 9,768 9,759

STAT...session logical reads 10,827 20,671 9,844

STAT...db block gets from cach 10,587 20,442 9,855

STAT...db block gets 10,587 20,442 9,855

STAT...db block gets from cach 10,448 20,318 9,870

STAT...calls to kcmgcs 164 10,128 9,964

STAT...enqueue requests 44 10,017 9,973

STAT...enqueue releases 42 10,017 9,975

STAT...ASSM gsp:L1 bitmaps exa 21 10,004 9,983

STAT...ASSM gsp:get free block 16 10,000 9,984

STAT...ASSM cbk:blocks examine 16 10,000 9,984

STAT...ASSM gsp:good hint 15 10,000 9,985

STAT...calls to get snapshot s 47 10,040 9,993

STAT...parse count (hard) 3 10,003 10,000

STAT...session cursor cache hi 10,030 29 -10,001

LATCH.shardgroup list latch 3 10,005 10,002

STAT...parse count (total) 24 10,028 10,004

LATCH.session idle bit 33 14,963 14,930

LATCH.checkpoint queue latch 19,403 4,230 -15,173

LATCH.gcs resource hash 327 16,879 16,552

LATCH.name-service namespace b 106 18,200 18,094

LATCH.PDB Hash Table Latch 8 18,579 18,571

LATCH.parallel query alloc buf 1 18,739 18,738

LATCH.object queue header oper 1,783 22,378 20,595

LATCH.object queue header free 38 21,643 21,605

LATCH.shared pool simulator 21 21,735 21,714

STAT...file io wait time 32,001 8,369 -23,632

LATCH.ASM map operation hash t 570 27,386 26,816

LATCH.gc element 501 28,119 27,618

LATCH.post/wait queue 2,824 32,126 29,302

STAT...recursive calls 10,113 40,202 30,089

LATCH.JS queue state obj latch 3,672 34,884 31,212

LATCH.gcs partitioned table ha 210 33,281 33,071

STAT...global enqueue releases 454 60,423 59,969

STAT...global enqueue gets syn 455 60,426 59,971

LATCH.active service list 3,698 64,116 60,418

LATCH.pdb domain request queue 4,464 81,550 77,086

LATCH.enqueue hash chains 8,053 116,473 108,420

LATCH.ksim group membership ca 41 155,915 155,874

STAT...cell physical IO interc 655,360 851,968 196,608

STAT...physical read total byt 655,360 851,968 196,608

STAT...physical read bytes 655,360 851,968 196,608

LATCH.cache buffers chains 55,873 272,739 216,866

LATCH.ges group table 5,065 223,153 218,088

STAT...physical read total byt 630,784 851,968 221,184

LATCH.ges domain table 6,410 277,980 271,570

LATCH.recovery domain hash buc 15,873 290,222 274,349

LATCH.ges resource hash list 6,829 295,246 288,417

LATCH.process queue reference 1 326,406 326,405

LATCH.ges process parent latch 10,099 435,392 425,293

LATCH.shared pool 295 517,197 516,902

STAT...session uga memory 0 -523,840 -523,840

STAT...session pga memory -65,536 -720,896 -655,360

STAT...logical read bytes from 88,612,864 169,402,368 80,789,504

Run1 latches total versus runs -- difference and pct

Run1 Run2 Diff Pct

156,002 3,498,539 3,342,537 4.46%

PL/SQL procedure successfully completed.

The runstats utility produces a report that shows differences in latch utilization as well as differences in statistics. Here, I asked runstats to print out anything with a difference greater than 9500. You can see that we hard parsed 3 times in the first approach using bind variables and that we hard parsed 10,000 times without bind variables (once for each of the inserts). But that difference in hard parsing is just the tip of the iceberg. You can see here that we used an order of magnitude as many “latches” in the nonbind variable approach as we did with bind variables. That difference might beg the question “What is a latch?”

Let’s answer that question. A latch is a type of lock that is used to serialize access to shared data structures used by Oracle. The shared pool is an example; it’s a big, shared data structure found in the System Global Area (SGA), and this is where Oracle stores parsed, compiled SQL. When you modify anything in this shared structure, you must take care to allow only one process in at a time. (It is very bad if two processes or threads attempt to update the same in-memory data structure simultaneously—corruption would abound.) So, Oracle employs a latching mechanism, a lightweight locking method to serialize access. Don’t be fooled by the word lightweight. Latches are serialization devices, allowing access (to a memory structure) one process at a time. The latches used by the hard parsing implementation are some of the most used latches out there. These include the latches for the shared pool and for the library cache. Those are “big time” latches that people compete for frequently. What all this means is that as we increase the number of users attempting to hard parse statements simultaneously, our performance gets progressively worse over time. The more people parsing, the more people waiting in line to latch the shared pool, the longer the queues, the longer the wait.

Executing SQL statements without bind variables is very much like compiling a subroutine before each method call. Imagine shipping Java source code to your customers where, before calling a method in a class, they had to invoke the Java compiler, compile the class, run the method, and then throw away the bytecode. Next time they wanted to execute the same method, they would do the same thing: compile it, run it, and throw it away. You would never consider doing this in your application; you should never consider doing this in your database either.

Another impact of not using bind variables, for developers employing string concatenation, is security—specifically something called SQL injection. If you are not familiar with this term, I encourage you to put aside this book for a moment and, using the search engine of your choice, look up SQL injection. There are over five million hits returned for it as I write this edition. The problem of SQL injection is well documented.

Note

SQL injection is a security hole whereby the developer accepts input from an end user and concatenates that input into a query, then compiles and executes that query. In effect, the developer accepts snippets of SQL code from the end user, then compiles and executes those snippets. That approach allows the end user to potentially modify the SQL statement so that it does something the application developer never intended. It’s almost like leaving a terminal open with a SQL*Plus session logged in and connected with SYSDBA privileges. You are just begging someone to come by and type in some command, compile it, and then execute it. The results can be disastrous.

It is a fact that if you do not use bind variables, that if you use the string concatenation technique in PROC2 shown earlier, your code is subject to SQL injection attacks and must be carefully reviewed. And it should be reviewed by people who don’t actually like the developer who wrote the code—because the code must be reviewed critically and objectively. If the reviewers are peers of the code author, or worse, friends or subordinates, the review will not be as critical as it should be. Developed code that does not use bind variables must be viewed with suspicion—it should be the exceptional case where bind variables are not used, not the norm.

To demonstrate how insidious SQL injection can be, I present this small example. First, I’ll create a user in my pluggable database that has the DBA role:

$ sqlplus system/foo@PDB1

SQL> create user pwd_mgr identified by pwd_mgr_pwd;

User created.

SQL> grant unlimited tablespace to pwd_mgr;

SQL> grant dba to pwd_mgr with admin option;

Grant succeeded.

Note

If the DBA role isn’t grantable in your PDB, then grant the PDB_DBA role to the pwd_mgr user instead.

Next, connect to the user and create a small procedure:

SQL> conn pwd_mgr/pwd_mgr_pwd@PDB1

SQL> create or replace procedure inj( p_date in date )

l_username all_users.username%type;

c sys_refcursor;

l_query varchar2(4000);

begin

l_query := '

select username

from all_users

where created = ''' ||p_date ||'''';

dbms_output.put_line( l_query );

open c for l_query;

for i in 1 .. 5

loop

fetch c into l_username;

exit when c%notfound;

dbms_output.put_line( l_username || '.....' );

end loop;

close c;

end;

Procedure created.

Note

This code prints out only five records at most. It was developed to be executed in an “empty” schema. A schema with lots of existing tables could cause various effects that differ from the results shown next. One effect could be that you don’t see the table I’m trying to show you in the example—that would be because we print out only five records. Another might be a numeric or value error—that would be due to a long table name. None of these facts invalidate the example; they could all be worked around by someone wanting to steal your data.

Now, most developers I know would look at that code and say that it’s safe from SQL injection. They would say this because the input to the routine must be an Oracle DATE variable, a 7-byte binary format representing a century, year, month, day, hour, minute, and second. There is no way that DATE variable could change the meaning of my SQL statement. As it turns out, they are very wrong. This code can be “injected”—modified at runtime, easily—by anyone who knows how (and, obviously, there are people who know how!). If you execute the procedure the way the developer “expects” the procedure to be executed, this is what you might expect to see:

SQL> set serverout on

SQL> exec inj( sysdate )

select *

from all_users

where created = '05-FEB-21'

PL/SQL procedure successfully completed.

This result shows the SQL statement being safely constructed—as expected. So, how could someone use this routine in a nefarious way? Well, suppose you’ve got another developer in this project—the evil developer. The developers have access to execute that procedure, to see the users created in the database today, but they don’t have access to any of the other tables in the schema that owns this procedure. Now, they don’t know what tables exist in this schema—the security team has decided “security via obscurity” is good—so they don’t allow anyone to publish the table names anywhere. So, they don’t know that the following table in particular exists:

SQL> create table user_pw

( uname varchar2(30) primary key,

pw varchar2(30));

Table created.

SQL> insert into user_pw ( uname, pw ) values ( 'TKYTE', 'TOP SECRET' );

1 row created.

SQL> commit;

Commit complete.

The prior USER_PW table looks like a pretty important table, but remember, users do not know it exists. However, they (users with minimal privileges) do have access to the INJ routine:

SQL> conn system/foo@PDB1

Connected.

SQL> create user dev identified by dev_pwd;

User created.

SQL> grant create session to dev;

Grant succeeded.

SQL> grant create procedure to dev;

Grant succeeded.

SQL> grant execute on pwd_mgr.inj to dev;

Grant succeeded.

So the evil developer/user can simply execute

SQL> connect dev/dev_pwd@PDB1

Connected.

SQL> alter session set nls_date_format = '"''union select tname from tab--"';

Session altered.

SQL> set serverout on

SQL> exec pwd_mgr.inj( sysdate )

Here's the output:

select username

from all_users

where

created = ''union select tname from tab--'

USER_PW.....

PL/SQL procedure successfully completed.

In the prior code, the select statement executes this statement (which returns no rows):

select username from all_users where created =''

And it unions that with

select tname from tab

Take a look at the last --' bit. In SQL*Plus, a double dash is a comment; so this is commenting out the last quote mark, which is necessary to make the statement syntactically correct.

Now, that NLS_DATE_FORMAT is interesting—most people don’t even know you can include character string literals with the NLS_DATE_FORMAT. (Heck, many people don’t even know you can change the date format like that even without this “trick.” Nor do they know that you can alter your session (to set the NLS_DATE_FORMAT) even without the ALTER SESSION privilege!) What the malicious user did here was to trick your code into querying a table you did not intend them to query using your set of privileges. The TAB dictionary view limits its view to the set of tables the current schema can see. When users run the procedure, the current schema used for authorization is the owner of that procedure (you, in short, not them). They can now see what tables reside in that schema. They see that table USER_PW and say, “Hmmm, sounds interesting.” So, they try to access that table:

SQL> select * from pwd_mgr.user_pw;

select * from pwd_mgr.user_pw

ERROR at line 1:

ORA-00942: table or view does not exist

The malicious user can’t access the table directly; they lack the SELECT privilege on the table. Not to worry, however, there is another way. The user wants to know about the columns in the table. Here’s one way to find out more about the table’s structure:

SQL> alter session set nls_date_format = '"''union select tname||''/''||cname from col--"';

Session altered.

SQL> exec pwd_mgr.inj( sysdate )

select username

from all_users

where

created = ''union select tname||'/'||cname from col--'

USER_PW/PW.....

USER_PW/UNAME.....

PL/SQL procedure successfully completed.

There we go, we know the column names. Now that we know the table names and the column names of tables in that schema, we can change the NLS_DATE_FORMAT one more time to query that table—not the dictionary tables. So the malicious user can next do the following:

SQL> alter session set nls_date_format = '"''union select uname||''/''||pw from user_pw--"';

Session altered.

SQL> exec pwd_mgr.inj( sysdate )

select username

from all_users

where

created = ''union select uname||'/'||pw from user_pw--'

TKYTE/TOP SECRET.....

PL/SQL procedure successfully completed.

And there we go—that evil developer/user now has your sensitive username and password information. Going one step further, what if this developer has the CREATE PROCEDURE privilege? It is a safe assumption that they would (they are a developer after all). Could they go further with this example? Absolutely. That innocent-looking stored procedure gives guaranteed read access to everything the PWD_MGR schema has read access to, at a minimum; and if the account exploiting this bug has CREATE PROCEDURE (which it does), that stored procedure allows them to execute any command that PWD_MGR could execute!

Note

This example assumes that the user PWD_MGR has been granted the DBA role with the ADMIN OPTION.

And then as the developer, we’ll create a function that grants DBA. There are two important facts about this function: it is an invoker rights routine, meaning that it will execute with the privileges granted to the person executing the routine, and it is a pragma autonomous_transaction routine, meaning that it creates a subtransaction that will commit or roll back before the routine returns, therefore making it eligible to be called from SQL. Here is that function:

SQL> create or replace function foo

return varchar2

authid CURRENT_USER

pragma autonomous_transaction;

begin

execute immediate 'grant dba to dev';

return null;

end;

Function created.

Now all we have to do is “trick” PWD_MGR (a DBA that can grant DBA to others) into running this function. Given what we’ve done to exploit the SQL injection flaw, this is easy. We’ll set our NLS_DATE_FORMAT to include a reference to this function and grant execute on it to PWD_MGR :

SQL> alter session set nls_date_format = '"''union select dev.foo from dual--"';

Session altered.

SQL> grant execute on foo to pwd_mgr;

Grant succeeded.

And voilà! We have DBA:

SQL> select * from session_roles;

no rows selected

SQL> set serverout on

SQL> exec pwd_mgr.inj( sysdate )

select username

from all_users

where

created = ''union select dev.foo from dual--'

.....

PL/SQL procedure successfully completed.

Now we need to reconnect to the database so that the freshly granted privileges are instantiated for the session:

SQL> conn dev/dev_pwd@PDB1

SQL> select * from session_roles;

ROLE

---------------------------------------------------------------------------

PDB_DBA

CONNECT

RESOURCE

SODA_APP

SELECT_CATALOG_ROLE

HS_ADMIN_SELECT_ROLE

AQ_ADMINISTRATOR_ROLE

AQ_USER_ROLE

ADM_PARALLEL_EXECUTE_TASK

GATHER_SYSTEM_STATISTICS

So, how could you have protected yourself? By using bind variables. For example:

SQL> conn pwd_mgr/pwd_mgr_pwd@PDB1

SQL> create or replace procedure NOT_inj( p_date in date )

l_username all_users.username%type;

c sys_refcursor;

l_query varchar2(4000);

begin

l_query := '

select username

from all_users

where created = :x';

dbms_output.put_line( l_query );

open c for l_query USING P_DATE;

for i in 1 .. 5

loop

fetch c into l_username;

exit when c%notfound;

dbms_output.put_line( l_username || '.....' );

end loop;

close c;

end;

Procedure created.

SQL> set serverout on

SQL> alter session set nls_date_format = '"''union select dev.foo from dual--"';

Session altered.

SQL> exec NOT_inj(sysdate)

select username

from all_users

where created = :x

PL/SQL procedure successfully completed.

Notice the prior output does not contain any injected malicious code. It is a plain and simple fact that if you use bind variables you can’t be subject to SQL injection. If you do not use bind variables, you have to meticulously inspect every single line of code and think like an evil genius (one who knows everything about Oracle, every single thing) and see if there is a way to attack that code. I don’t know about you, but if I could be sure that 99.9999 percent of my code was not subject to SQL injection, and I only had to worry about the remaining 0.0001 percent (that couldn’t use a bind variable for whatever reason), I’d sleep much better at night than if I had to worry about 100 percent of my code being subject to SQL injection.

In any case, on the particular project I began describing at the beginning of this section, rewriting the existing code to use bind variables was the only possible course of action. The resulting code ran orders of magnitude faster and increased many times the number of simultaneous users that the system could support. And the code was more secure—the entire codebase did not need to be reviewed for SQL injection issues. However, that security came at a high price in terms of time and effort, because my client had to code the system and then code it again. It is not that using bind variables is hard, or error-prone, it’s just that they did not use them initially and thus were forced to go back and revisit virtually all of the code and change it. My client would not have paid this price if the developers had understood that it was vital to use bind variables in their application from day one.

Understanding Concurrency Control

Concurrency control is one area where databases differentiate themselves. It is an area that sets a database apart from a file system and databases apart from each other. As a programmer, it is vital that your database application works correctly under concurrent access conditions, and yet time and time again this is something people fail to test. Techniques that work well if everything happens consecutively do not necessarily work so well when everyone does them simultaneously. If you don’t have a good grasp of how your particular database implements concurrency control mechanisms, then you will

Corrupt the integrity of your data
Have applications run slower than they should with a small number of users
Decrease your applications’ ability to scale to a large number of users

Notice I don’t say, “you might…” or “you run the risk of…” but rather that invariably you will do these things. You will do these things without even realizing it. Without correct concurrency control, you will corrupt the integrity of your database because something that works in isolation will not work as you expect in a multiuser situation. Your application will run slower than it should because you’ll end up waiting for data. Your application will lose its ability to scale because of locking and contention issues. As the queues to access a resource get longer, the wait gets longer and longer.

An analogy here would be a backup at a tollbooth. If cars arrive in an orderly, predictable fashion, one after the other, there won’t ever be a backup. If many cars arrive simultaneously, queues start to form. Furthermore, the waiting time does not increase linearly with the number of cars at the booth. After a certain point, considerable additional time is spent “managing” the people who are waiting in line, as well as servicing them (the parallel in the database would be context switching).

Concurrency issues are the hardest to track down; the problem is similar to debugging a multithreaded program. The program may work fine in the controlled, artificial environment of the debugger but crashes horribly in the real world. For example, under race conditions, you find that two threads can end up modifying the same data structure simultaneously. These kinds of bugs are terribly hard to track down and fix. If you only test your application in isolation and then deploy it to dozens of concurrent users, you are likely to be (painfully) exposed to an undetected concurrency issue.

Over the next two sections, I’ll relate two small examples of how the lack of understanding concurrency control can ruin your data or inhibit performance and scalability.

Implementing Locking

The database uses locks to ensure that, at most, one transaction is modifying a given piece of data at any given time. Basically, locks are the mechanism that allows for concurrency—without some locking model to prevent concurrent updates to the same row, for example, multiuser access would not be possible in a database. However, if overused or used improperly, locks can actually inhibit concurrency. If you or the database itself locks data unnecessarily, fewer people will be able to concurrently perform operations. Thus, understanding what locking is and how it works in your database is vital if you are to develop a scalable, correct application.

What is also vital is that you understand that each database implements locking differently. Some have page-level locking, others row-level; some implementations escalate locks from row level to page level, some do not; some use read locks, others don’t; some implement serializable transactions via locking and others via read-consistent views of data (no locks). These small differences can balloon into huge performance issues or downright bugs in your application if you don’t understand how they work.

The following points sum up Oracle’s locking policy:

Oracle locks data at the row level on modification. There is no lock escalation to a block or table level.
Oracle never locks data just to read it. There are no locks placed on rows of data by simple reads.
A writer of data does not block a reader of data. Let me repeat: reads are not blocked by writes. This is fundamentally different from many other databases, where reads are blocked by writes. While this sounds like an extremely positive attribute (and it generally is), if you do not understand this thoroughly and you attempt to enforce integrity constraints in your application via application logic, you are most likely doing it incorrectly.
A writer of data is blocked only when another writer of data has already locked the row it was going after. A reader of data never blocks a writer of data.

You must take these facts into consideration when developing your application, and you must also realize that this policy is unique to Oracle; every database has subtle differences in its approach to locking. Even if you go with lowest common denominator SQL in your applications, the locking and concurrency control models employed by each vendor assure something will be different. A developer who does not understand how their database handles concurrency will certainly encounter data integrity issues. (This is particularly common when a developer moves from another database to Oracle, or vice versa, and neglects to take the differing concurrency mechanisms into account in the application.)

Preventing Lost Updates

One of the side effects of Oracle’s nonblocking approach is that if you actually want to ensure that no more than one user has access to a row at once, then you, the developer, need to do a little work yourself.

A developer was demonstrating to me a resource scheduling program (for conference rooms, projectors, etc.) that he had just developed and was in the process of deploying. The application implemented a business rule to prevent the allocation of a resource to more than one person for any given period of time. That is, the application contained code that specifically checked that no other user had previously allocated the time slot (at least the developer thought it did). This code queried the SCHEDULES table and, if no rows existed that overlapped that time slot, inserted the new row. So, the developer was basically concerned with two tables:

create table resources

( resource_name varchar2(25) primary key,

other_data varchar2(25) );

create table schedules

( resource_name varchar2(25) references resources,

start_time date,

end_time date );

And, right after inserting a room reservation into SCHEDULES, and before committing, the application would query:

select count(*)

from schedules

where resource_name = :resource_name

and (start_time < :new_end_time)

and (end_time > :new_start_time);

It looked simple and bulletproof (to the developer anyway); if the count came back as one, the room was yours. If it came back greater than one, you could not reserve it for that period. Once I knew what his logic was, I set up a very simple test to show him the error that would occur when the application went live—an error that would be incredibly hard to track down and diagnose after the fact. You’d be convinced it must be a database bug.

All I did was get someone else to use the terminal next to him. Both navigated to the same screen, and, on the count of three, each hit the Go button and tried to reserve the same room for an overlapping time. Both got the reservation. The logic, which worked perfectly in isolation, failed in a multiuser environment. The problem in this case was caused in part by Oracle’s nonblocking reads. Neither session ever blocked the other session. Both sessions simply ran the query and then performed the logic to schedule the room. They could both run the query to look for a reservation, even if the other session had already started to modify the SCHEDULES table (the change wouldn’t be visible to the other session until commit, by which time it was too late). Since it would appear to each user they were never attempting to modify the same row in the SCHEDULES table, they would never block each other, and, thus, the business rule could not enforce what it was intended to enforce.

This surprised the developer—a developer who had written many database applications—because his background was in a database that employed read locks. That is, a reader of data would be blocked by a writer of data, and a writer of data would be blocked by a concurrent read of that data. In his world, one of those transactions would have blocked the other—or perhaps the application would have deadlocked. But the transaction would ultimately fail.

So, the developer needed a method of enforcing the business rule in a multiuser environment—a way to ensure that exactly one person at a time made a reservation on a given resource. In this case, the solution was to impose a little serialization of his own. In addition to performing the preceding count(*), the developer first performed the following:

select * from resources where resource_name = :resource_name FOR UPDATE;

What he did here was to lock the resource (the room) to be scheduled immediately before scheduling it, in other words before querying the SCHEDULES table for that resource. By locking the resource he is trying to schedule, the developer ensures that no one else is modifying the schedule for this resource simultaneously. Everyone wanting to execute that SELECT FOR UPDATE for the same resource must wait until the transaction commits, at which point they are able to see the schedule. The chance of overlapping schedules is removed.

Developers must understand that, in a multiuser environment, they must at times employ techniques similar to those used in multithreaded programming. The FOR UPDATE clause is working like a semaphore in this case. It serializes access to the RESOURCES tables for that particular row—ensuring no two people can schedule it simultaneously.

Using the FOR UPDATE approach is still highly concurrent as there are potentially thousands of resources to be reserved. What we have done is ensure that only one person modifies a resource at any time. This is a rare case where the manual locking of data we are not going to actually update is called for. You need to be able to recognize where you must manually lock and, perhaps as importantly, when not to (I’ll get to an example of this in a bit). Furthermore, the FOR UPDATE clause does not lock the resource from other people reading the data as it might in other databases. Hence, the approach will scale very well.

Issues such as the ones I’ve described in this section have massive implications when you’re attempting to port an application from database to database (I will return to this theme a little later in the chapter), and this trips people up time and time again. For example, if you are experienced in other databases where writers block readers and vice versa, you may have grown reliant on that fact to protect you from data integrity issues. The lack of concurrency is one way to protect yourself from this. That’s how it works in many non-Oracle databases. In Oracle, concurrency rules supreme and you must be aware that, as a result, things will happen differently (or suffer the consequences).

I have been in design sessions where the developers, even after being shown this sort of example, scoffed at the idea they would have to actually understand how it all works. Their response was, “We just check the ‘transactional’ box in our Hibernate application and it takes care of all transactional things for us. We don’t have to know this stuff.” I said to them, “So Hibernate will generate different code for SQL Server and DB2 and Oracle, entirely different code, different amounts of SQL statements, different logic?” They said no, but it will be transactional. This misses the point. Transactional in this context simply means that you support commit and rollback, not that your code is transactionally consistent (read that as “not that your code is correct”). Regardless of the tool or framework you are using to access the database, knowledge of concurrency controls is vital if you want to not corrupt your data.

Ninety-nine percent of the time, locking is totally transparent, and you need not concern yourself with it. It’s that other one percent you must be trained to recognize. There is no simple checklist of “if you do this, you need to do this” for this issue. Successful concurrency control is a matter of understanding how your application will behave in a multiuser environment and how it will behave in your database.

When we get to the chapters on locking and concurrency control, we’ll delve into this topic in much more depth. There you’ll learn that integrity constraint enforcement of the type presented in this section, where you must enforce a rule that crosses multiple rows in a single table or is between two or more tables (like a referential integrity constraint), are cases where you must always pay special attention and will most likely have to resort to manual locking or some other technique to ensure integrity in a multiuser environment.

Multiversioning

This is a topic very closely related to concurrency control as it forms the foundation for Oracle’s concurrency control mechanism. Oracle operates a multiversion, read-consistent concurrency model. In Chapter 7, we’ll cover the technical aspects in more detail, but, essentially, it is the mechanism by which Oracle provides for:

Read-consistent queries: Queries that produce consistent results with respect to a point in time.
Nonblocking queries: Queries are never blocked by writers of data, as they are in other databases.

These are two very important concepts in the Oracle database. The term multiversioning basically describes Oracle’s ability to simultaneously maintain multiple versions of the data in the database. The term read consistency reflects the fact that a query in Oracle will return results from a consistent point in time. Every block used by a query will be “as of” the same exact point in time—even if it was modified or locked while you performed your query. If you understand how multiversioning and read consistency work together, you will always understand the answers you get from the database. Before we explore in a little more detail how Oracle does this, here is the simplest way I know to demonstrate multiversioning in Oracle:

$ sqlplus eoda/foo@PDB1

SQL> drop table t purge;

SQL> create table t as select username, created from all_users;

Table created.

SQL> set autoprint off

SQL> variable x refcursor;

SQL> begin

open :x for select * from t;

end;

PL/SQL procedure successfully completed.

SQL> declare

pragma autonomous_transaction;

-- you could do this in another

-- sqlplus session as well, the

-- effect would be identical

begin

delete from t;

commit;

end;

PL/SQL procedure successfully completed.

SQL> print x

USERNAME CREATED

------------------------- --------------------

SYS 10-NOV-20

AUDSYS 10-NOV-20

SYSTEM 10-NOV-20

SYSBACKUP 10-NOV-20

SYSDG 10-NOV-20

SYSKM 10-NOV-20

SYSRAC 10-NOV-20

OUTLN 10-NOV-20

REMOTE_SCHEDULER_AGENT 10-NOV-20

GSMADMIN_INTERNAL 10-NOV-20

Table 1-1 summarizes the timing and operations of this multiversioning example.

Table 1-1

Demonstrating Multiversioning in Oracle

Time	Operation
Time 1	Create table T and populate it with data.
Time 2	Open cursor X for select from table T.
Time 3	Delete all rows from table T and commit.
Time 4	Select from table T showing no rows are in the table.
Time 5	Fetch data from cursor X showing rows as they existed in table T at time 2.

In this example, we created a test table, T, and loaded it with some data from the ALL_USERS table. We opened a cursor on that table. We fetched no data from that cursor: we just opened it and have kept it open.

Note

Bear in mind that Oracle does not “pre-answer” the query. It does not copy the data anywhere when you open a cursor—imagine how long it would take to open a cursor on a one-billion-row table if it did. The cursor opens instantly, and it answers the query as it goes along. In other words, the cursor just reads data from the table as you fetch from it.

In the same session (or maybe another session would do this; it would work as well), we proceed to delete all data from the table. We even go as far as to COMMIT work on that delete action. The rows are gone—but are they? In fact, they are retrievable via the cursor (or via a FLASHBACK query using the AS OF clause). The fact is that the resultset returned to us by the OPEN command was preordained at the point in time we opened it. We had touched not a single block of data in that table during the open, but the answer was already fixed in stone. We have no way of knowing what the answer will be until we fetch the data; however, the result is immutable from our cursor’s perspective. It is not that Oracle copied all of the preceding data to some other location when we opened the cursor; it was actually the DELETE command that preserved our data for us by placing it (the before image copies of rows as they existed before the DELETE) into a data area called an undo or rollback segment .

Flashback

In the past, Oracle always decided the point in time at which our queries would be consistent. That is, Oracle made it such that any resultset we opened would be current with respect to one of two points in time:

The point in time the query was opened: This is the default behavior in READ COMMITTED isolation (we’ll be covering the differences between READ COMMITTED, READ ONLY, and SERIALIZABLE transaction levels in Chapter 7).
The point in time the transaction that the query is part of began: This is the default behavior in READ ONLY and SERIALIZABLE transaction levels.

However, with Oracle’s flashback query feature, we can tell Oracle to execute a query “as of” (with certain reasonable limitations on the length of time you can go back into the past, of course). With this, you can “see” read consistency and multiversioning even more directly.

Note

The flashback data archive, used for long-term flashback queries (months or years into the past), does not use read consistency and multiversioning to produce the version of data that was in the database at some prior point in time. Instead, it uses before image copies of the records it has placed into the archive. We’ll come back to the flashback data archive in a later chapter. Note also that the flashback data archive is a feature of the database available for use without additional license cost.

Consider the following example. We start by getting an SCN (System Change or System Commit Number; the terms are interchangeable). This SCN is Oracle’s internal clock: every time a commit occurs, this clock ticks upward (increments). We could use a date or timestamp as well, but here the SCN is readily available and very precise:

$ sqlplus scott/tiger@PDB1

SQL> variable scn number

SQL> exec :scn := dbms_flashback.get_system_change_number;

PL/SQL procedure successfully completed.

SQL> print scn

SCN

----------

13646156

Note

The DBMS_FLASHBACK package might have restricted access on your system. You may have to grant execute on this package to the schema you’re using before you can access it.

We retrieved the SCN so we can tell Oracle the point in time we’d like to query “as of”; we could also use a date or timestamp in place of an SCN. We want to be able to query Oracle later and see what was in this table at this precise moment in time. First, let’s see what is in the EMP table right now:

SQL> select count(*) from emp;

COUNT(*)

----------

Now let’s delete all of this information and verify that it’s “gone”:

SQL> delete from emp;

14 rows deleted.

SQL> select count(*) from emp;

COUNT(*)

----------

SQL> commit;

Commit complete.

However, using the flashback query, with either the AS OF SCN or AS OF TIMESTAMP clause, we can ask Oracle to reveal to us what was in the table as of that point in time:

SQL> select count(*),

:scn then_scn,

dbms_flashback.get_system_change_number now_scn

from emp as of scn :scn;

COUNT(*) THEN_SCN NOW_SCN

---------- ---------- ----------

14 13646156 13646157

With Oracle, you have a command called “flashback” that uses this underlying multiversioning technology to allow you to return objects to the state they were at some prior point in time. In this case, we can put EMP back the way it was before we deleted all of the information (as part of doing this, we’ll need to enable row movement, which allows the rowid assigned to the row to change—a necessary prerequisite for flashing back a table):

SQL> alter table emp enable row movement;

Table altered.

SQL> flashback table emp to scn :scn;

Flashback complete.

SQL> select cnt_now, cnt_then,

:scn then_scn,

dbms_flashback.get_system_change_number now_scn

from (select count(*) cnt_now from emp),

(select count(*) cnt_then from emp as of scn :scn);

CNT_NOW CNT_THEN THEN_SCN NOW_SCN

---------- ---------- ---------- ----------

14 14 13646156 13646786

This is what read consistency and multiversioning are all about. If you don’t understand how Oracle’s multiversioning scheme works and what it implies, you won’t be able to take full advantage of Oracle or write correct applications in Oracle (ones that will ensure data integrity).

Note

A flashback table requires the Enterprise Edition of Oracle.

Read Consistency and Nonblocking Reads

Let’s look at the implications of multiversioning: read-consistent queries and nonblocking reads. If you are not familiar with multiversioning, what you see in the following code might be surprising. For the sake of simplicity, assume the table we are reading stores one row per database block (the smallest unit of storage in the database) and that we are full scanning the table in this example.

The table we will query is a simple ACCOUNTS table. It holds balances in accounts for a bank. It has a very simple structure:

SQL> create table accounts

( account_number number primary key,

account_balance number);

In reality, the ACCOUNTS table would have hundreds of thousands of rows in it, but for simplicity we’re just going to consider a table with four rows, as shown in Table 1-2. (We will visit this example in more detail in Chapter 7.)

Table 1-2

Accounts Table Contents

Row	Account Number	Account Balance
1	123	$500.00
2	234	$250.00
3	345	$400.00
4	456	$100.00

We would like to run an end-of-day report that tells us how much money is in the bank. That’s an extremely simple query:

SQL> select sum(account_balance) from accounts;

And, of course, in this example, the answer is obvious: $1250. However, what happens if we read row 1, and while we’re reading rows 2 and 3, an automated teller machine (ATM) generates transactions against this table and moves $400 from account 123 to account 456? Our query counts $500 in row 4 and comes up with the answer of $1650, doesn’t it? Well, of course, this is to be avoided, as it would be an error—at no time did this sum of money exist in the account balance column. Read consistency is the way Oracle avoids such occurrences. Oracle’s methods differ from those of most other databases, and you need to understand how.

In many other databases, if you wanted to get a “consistent” and “correct” answer to this query, you’d either have to lock the whole table while the sum was calculated or you’d have to lock the rows as you read them. This prevents people from changing the answer as you are getting it. If you lock the table up front, you get the answer that was in the database at the time the query began. If you lock the data as you read it (commonly referred to as a shared read lock, which prevents updates, but not other readers from accessing the data), you get the answer that was in the database at the point the query finished. Both of these methods inhibit concurrency a great deal. The table lock prevents any updates from taking place against the entire table for the duration of your query (for a table of four rows, this would only be a very short period, but for tables with hundreds of thousands of rows, it could be several minutes). The “lock as you go” method prevents updates on data you have read and already processed and could actually cause deadlocks between your query and other updates.

Now, I said earlier that you wouldn’t be able to take full advantage of Oracle if you didn’t understand the concept of multiversioning. Here is one reason why that is true. Oracle uses multiversioning to get the answer, as it existed at the point in time the query began, and the query will take place without locking a single thing (while our account transfer transaction updates rows 1 and 4, these rows will be locked to other writers, but not locked to other readers, such as our SELECT SUM... query). In fact, Oracle doesn’t have a “shared read” lock (a type of lock common in other databases)—it doesn’t need it. Everything inhibiting concurrency that can be removed has been removed.

I have seen actual cases where a report written by a developer who did not understand Oracle’s multiversioning capabilities would lock an entire system up as tight as could be. The reason: the developer wanted to have read-consistent (i.e., correct) results from his queries. In every other database the developer had worked with, this required locking the tables or using a SELECT ... WITH HOLDLOCK (a SQL Server mechanism for locking rows in a shared mode as you go along). So the developer would either lock the tables prior to running the report or use SELECT ... FOR UPDATE (the closest he could find to WITH HOLDLOCK). This would cause the system to basically stop processing transactions—needlessly.

So, how does Oracle get the correct, consistent answer ($1250) during a read without locking any data—in other words, without decreasing concurrency? The secret lies in the transactional mechanisms that Oracle uses. Whenever you modify data, Oracle creates entries in two different locations (most other databases would put both entries in the same location; for them undo and redo are just “transaction data”). One entry goes to the redo logs where Oracle stores enough information to redo or “roll forward” the transaction. For an insert, this would be the row inserted. For a delete, it is conceptually a message to delete the row in file X, block Y, row slot Z. And so on. The other entry is an undo entry, written to an undo segment. If your transaction fails and needs to be undone, Oracle will read the “before” image from the undo segment and restore the data. In addition to using this undo segment data to undo transactions, Oracle uses it to undo changes to blocks as it is reading them—to restore the block to the point in time your query began. This gives you the ability to read right through a lock and to get consistent, correct answers without locking any data yourself.

So, as far as our example is concerned, Oracle arrives at its answer as shown in Table 1-3.

Table 1-3

Multiversioning in Action

Time	Query	Account Transfer Transaction
T1	Reads row 1; balance = $500; sum = $500 so far.
T2		Updates row 1; puts an exclusive lock on row 1, preventing other updates (but not reads). Row 1 now has $100.
T3	Reads row 2; balance = $250; sum = $750 so far.
T4	Reads row 3; balance = $400; sum = $1150 so far.
T5		Updates row 4; puts an exclusive lock on row 4, preventing other updates (but not reads). Row 4 now has $500.
T6	Reads row 4; discovers that row 4 has been modified. It will actually roll back the block to make it appear as it did at time = T1. The query will read the value $100 from this block.
T7		Commits transaction.
T8	Presents $1250 as the answer.

At time T6, Oracle is effectively “reading through” the lock that our transaction placed on row 4. This is how nonblocking reads are implemented: Oracle only looks to see if the data changed; it doesn’t care if the data is currently locked (which implies that the data may have changed). Oracle simply retrieves the old value from the undo segment and proceeds to the next block of data.

This is another clear demonstration of multiversioning. Multiple versions of the same piece of information, all at different points in time, are available in the database. Oracle is able to use these snapshots of data at different points in time to provide us with read-consistent queries and nonblocking reads.

This read-consistent view of data is always performed at the SQL statement level. The results of any single SQL statement are consistent with respect to the point in time they began. This quality is what makes a statement like the following insert a predictable set of data:

for x in (select * from t)

loop

insert into t values (x.username, x.user_id, x.created);

end loop;

The result of the SELECT * FROM T is preordained when the query begins execution. The SELECT will not see any of the new data generated by the INSERT. Imagine if it did—this statement might be a never-ending loop. If, as the INSERT generated more rows in T, the SELECT could “see” those newly inserted rows, the preceding code would create some unknown number of rows. If the table T started out with 10 rows, we might end up with 20, 21, 23, or an infinite number of rows in T when we finished. It would be totally unpredictable. This consistent read is provided to all statements so that an INSERT such as the following is predictable as well:

insert into t select * from t;

The INSERT statement will be provided a read-consistent view of T. It will not see the rows that it just inserted; rather, it will only insert the rows that existed at the time the SELECT began. Some databases won’t even permit recursive statements such as the preceding because they can’t tell how many rows might actually be inserted.

So, if you are used to the way other databases work with respect to query consistency and concurrency, or you never had to grapple with such concepts (i.e., you have no real database experience), you can now see how understanding how this works will be important to you. In order to maximize Oracle’s potential, and to implement correct code, you need to understand these issues as they pertain to Oracle—not how they are implemented in other databases.

Database Independence?

By now, you might be able to see where I’m going in this section. I have made references to other databases and how features are implemented differently in each. With the exception of some read-only applications, it is my contention that building a wholly database-independent application that is highly scalable is extremely hard—it is, in fact, quite impossible unless you know exactly how each database works in great detail. And, if you knew how each database worked in great detail, you’d understand that database independence is not something you really want to achieve (a very circular argument!).

To illustrate, let’s revisit our initial resource scheduler example (prior to adding the FOR UPDATE clause). Let’s say this application had been developed on a database with an entirely different locking/concurrency model from that of Oracle. What I’ll show here is that if you migrate your application from one database to another, you’ll have to verify that it still works correctly in these different environments and substantially change it as you do!

Let’s assume that we had deployed the initial resource scheduler application in a database that employed blocking reads (reads are blocked by writes). Also consider that the business rule was implemented via a database trigger (after the INSERT had occurred but before the transaction committed, we would verify that only our row existed in the table for that time slot). In a blocking read system, due to this newly inserted data, it would be true that insertions into this table would serialize. The first person would insert their request for “room A” from 2:00 p.m. to 3:00 p.m. on Friday and then run a query looking for overlaps. The next person would try to insert an overlapping request and, upon looking for overlaps, would become blocked (waiting for the newly inserted data to become available for reading). In that blocking read database, our application would be apparently well behaved, though it could just as easily deadlock (a concept covered in Chapter 6 on locking) if we both inserted our rows and then attempted to read each other’s data. Our checks on overlapping resource allocations would have happened one after the other, never concurrently.

If we migrated this application to Oracle and simply assumed it would behave in the same way, we would be in for a shock. On Oracle, which does row-level locking and supplies nonblocking reads, it appears to be ill-behaved. As we saw previously, we had to use the FOR UPDATE clause to serialize access. Without this clause, two users could schedule the same resource for the same times. This is a direct consequence of not understanding how the database we have works in a multiuser environment.

I have encountered issues such as this many times when an application is being moved from database A to database B. When an application that worked flawlessly in database A does not work or works in an apparently bizarre fashion on database B, the first thought is that database B is a “bad database.” The simple truth is that database B just works differently. Neither database is wrong or bad; they are just different. Knowing and understanding how they work will help you immensely in dealing with these issues. Taking an application from Oracle to SQL Server exposes SQL Server’s blocking reads and deadlock issues—it goes both ways.

For example, I was asked to help convert some Transact-SQL (the stored procedure language for SQL Server) into PL/SQL. The developer doing the conversion was complaining that the SQL queries in Oracle returned the “wrong” answer. The queries looked like this:

declare

l_some_variable varchar2(25);

begin

if ( some_condition )

then

l_some_variable := f( ... );

end if;

for x in ( select * from T where x = l_some_variable )

loop

...

The goal here was to find all of the rows in T where x was NULL if some condition was not met or where x equaled a specific value if some condition was met.

The complaint was that, in Oracle, this query would return no data when L_SOME_VARIABLE was not set to a specific value (when it was left as NULL). In Sybase or SQL Server, this was not the case—the query would find the rows where x was set to a NULL value. I see this on almost every conversion from Sybase or SQL Server to Oracle. SQL is supposed to operate under tri-valued logic, and Oracle implements NULL comparisons the way ANSI SQL requires them to be implemented (where NULL signifies a state of unknown and not a value). Under those rules, comparing x to a NULL is neither true nor false—it is, in fact, unknown. The following snippet shows what I mean:

SQL> select * from dual where null=null;

no rows selected

SQL> select * from dual where null <> null;

no rows selected

SQL> select * from dual where null is null;

This can be confusing the first time you see it. It proves that, in Oracle, NULL is neither equal to nor not equal to NULL. SQL Server, by default, does not do it that way: in SQL Server and Sybase, NULL is equal to NULL (by default; in current releases of SQL Server, the default behavior may be modified to reflect the ANSI standard). None of the databases’ processing is wrong—it is just different. And all of the databases are, in fact, ANSI compliant (ANSI compliance does not mean you support 100 percent of the standard, not by a long shot; see the next section “The Impact of Standards” for details), but they still work differently. There are ambiguities, backward compatibility issues, and so on, to be overcome. For example, SQL Server supports the ANSI method of NULL comparison, just not by default (if it did, it would break thousands of existing legacy applications built on that database).

In this case, one solution to the problem is to write the query like this instead:

select *

from t

where ( x = l_some_variable OR (x is null and l_some_variable is NULL ))

However, this leads to another problem. In SQL Server, this query would use an index on x. This might not be the case in Oracle since a B*Tree index (more on indexing techniques in the Chapter 11 on indexes) will not index an entirely NULL entry. Hence, if you need to find NULL values, B*Tree indexes are not always useful.

Note

As long as at least one column of an Oracle B*Tree index is defined as NOT NULL, all rows in the table will, in fact, appear in the index, and the predicate where x is null can and will use an index to retrieve the rows.

What we did in this case, in order to minimize impact on the code, was to assign x some value that it could never in reality assume. Here, x, by definition, was a positive number, so we chose the number –1. Thus, the query became

select * from t where nvl(x,-1) = nvl(l_some_variable,-1)

And we created a function-based index:

create index t_idx on t( nvl(x,-1) );

With minimal change, we achieved the same end result. The following are the important points to recognize from this example:

Databases are different. Experience with one will, in part, carry over to another, but you must be ready for some fundamental differences as well as some very minor differences.
Minor differences (such as treatment of NULLs) can have as big an impact as fundamental differences (such as concurrency control mechanisms).
Being aware of the database, how it works, and how its features are implemented is the only way to overcome these issues.

Developers frequently ask me (usually more than once a day) how to do something specific in the database, such as, “How do I create a temporary table in a stored procedure?” I don’t answer such questions directly. Instead, I respond with a question: “Why do you want to do that?” Many times, the answer that comes back is: “In SQL Server we created temporary tables in our stored procedures and we need to do this in Oracle.” That’s what I expected to hear. My response, then, is easy: “You don’t want to create temporary tables in a stored procedure in Oracle—you only think you do.” That would, in fact, be a very bad thing to do in Oracle. If you created the tables in a stored procedure in Oracle, you would find that

Doing DDL is a scalability inhibitor.
Doing DDL constantly is not fast.
Doing DDL commits your transaction.
You would have to use Dynamic SQL in all of your stored procedures in order to access this table—no static SQL (because the table wouldn’t exist at compile time).
Dynamic SQL in PL/SQL is not as fast or as optimized as static SQL.

The bottom line is that you don’t want to do it exactly as you did it in SQL Server (if you even need the temporary table in Oracle at all). You want to do things as they are best done in Oracle. Just as if you were going the other way from Oracle to SQL Server, you would not want to create a single table for all users to share for temporary data (that is how Oracle does it). That would limit scalability and concurrency in those other databases. All databases are not created equal; they are all very different.

This is not to say that you can’t use temporary tables in Oracle. You can, you probably will. It is just that you will use them differently in Oracle than you did in SQL Server (and vice versa).

The Impact of Standards

If all databases are SQL99 compliant, then they must be the same. At least that’s often the assumption. In this section, I’d like to dispel that myth.

SQL99 is an ANSI/ISO standard for databases. It was the successor to the SQL92 ANSI/ISO standard, which in turn superseded the SQL89 ANSI/ISO standard. It has now been superseded itself by the SQL2003, SQL2008, and SQL2011 standard updates. The standard defines a language (SQL) and behavior (transactions, isolation levels, and so on) that tell you how a database will behave. Did you know that many commercially available databases are SQL99 compliant to at least some degree? Did you also know that it means very little as far as query and application portability goes?

Starting with the SQL92 standard, the standards have four levels:

Entry level: This is the level to which most vendors have complied. It is a minor enhancement of the predecessor standard, SQL89. No database vendors have been certified higher, and, in fact, the National Institute of Standards and Technology (NIST), the agency that used to certify for SQL compliance, does not even certify anymore. I was part of the team that got Oracle 7.0 NIST-certified for SQL92 entry-level compliance in 1993. An entry level–compliant database has a feature set that is a subset of Oracle 7.0’s capabilities.
Transitional: This level is approximately halfway between the entry level and the intermediate level as far as a feature set goes.
Intermediate: This level adds many features including (this is not by any means an exhaustive list)
- Dynamic SQL
- Cascade DELETE for referential integrity
- DATE and TIME datatypes
- Domains
- Variable-length character strings
- A CASE expression
- CAST functions between datatypes
Full: This level adds provisions for (again, this list is not exhaustive)
- Connection management
- A BIT string datatype
- Deferrable integrity constraints
- Derived tables in the FROM clause
- Subqueries in CHECK constraint clauses
- Temporary tables

The entry-level standard does not include features such as outer joins, the new inner join syntax, and so on. Transitional does specify outer join syntax and inner join syntax. Intermediate adds more, and full is, of course, all of SQL92. Most books on SQL92 do not differentiate between the various levels, which leads to confusion on the subject. They demonstrate what a theoretical database implementing SQL92 full would look like. It makes it impossible to pick up a SQL92 book and apply what you see in the book to just any SQL92 database. The bottom line is that SQL92 will not go very far at the entry level, and, if you use any of the features of intermediate or higher, you risk not being able to port your application.

SQL99 defines only two levels of conformance: Core and Enhanced. It attempted to go far beyond traditional SQL and introduced object-relational constructs (arrays, collections, etc.). It covered a SQL MM (multimedia) type, object-relational types, and so on. No vendors are certifying databases to be SQL99 Core or Enhanced “compliant,” and, in fact, I know of no vendor who is even claiming their product is fully compliant with either level of conformance.

You should not be afraid to make use of vendor-specific features—after all, you are paying a lot of money for them. Every database has its own bag of tricks, and you can always find a way to perform a given operation in each database. Use what is best for your current database, and reimplement components as you go to other databases. Use good programming techniques to isolate yourself from these changes. The same techniques are employed by people writing OS-portable applications—such as the Oracle kernel developers.

Make Sure You Can Adapt

The goal is to fully use the facilities available to you, but ensure you can change the implementation on a case-by-case basis. As an analogy, Oracle is a portable application. It runs on many operating systems. On Windows, however, it runs the Windows way: using threads and other Windows-specific facilities. On UNIX/Linux, in contrast, Oracle runs as a multiprocess server, using individual processes to do what threads do on Windows—that’s the UNIX/Linux way. The “core Oracle” functionality is available on both platforms, but it is implemented in very different ways under the covers. Your database applications that must function on multiple databases will be the same.

For example, a common function of many database applications is the generation of a unique key for each row. When you insert the row, the system should automatically generate a key for you. Oracle has implemented the database object called a SEQUENCE for this. SYS_GUID() is another function that provides for unique keys as well. Informix has a SERIAL datatype. Sybase and SQL Server have an IDENTITY type. Each database has a way to do this. However, the methods are different, both in how you do it and the possible outcomes. So, to the knowledgeable developer, there are two paths that can be pursued:

Develop a totally database-independent method of generating a unique key.
Accommodate the different implementations and use different techniques when implementing keys in each database.
Note Oracle has an IDENTITY type column. Under the covers, it creates a sequence and defaults your column to that value—making it work very much like the SQL Server IDENTITY type or the MySQL AUTO_INCREMENT type.

The theoretical advantage of the first approach is that to move from database to database you need not change anything. I call it a “theoretical” advantage because the downside of this implementation is so huge that it makes this solution totally infeasible. What you’d have to do to develop a totally database-independent process is to create a table such as this:

$ sqlplus eoda/foo@PDB1

SQL> create table id_table

( id_name varchar2(30) primary key,

id_value number );

Table created.

SQL> insert into id_table values ( 'MY_KEY', 0 );

1 row created.

SQL> commit;

Commit complete.

Then, in order to get a new key, you’d have to execute the following code:

SQL> update id_table

set id_value = id_value+1

where id_name = 'MY_KEY';

1 row updated.

SQL> select id_value

from id_table

where id_name = 'MY_KEY';

ID_VALUE

----------

SQL> exit;

Looks simple enough, but the outcomes (notice plural) are as follows:

Only one user at a time may process a transaction row. You need to update that row to increment a counter, and this will cause your program to serialize on that operation. At best, one person at a time will generate a new value for this key.
In Oracle (and the behavior might be different in other databases), all but the first user to attempt to concurrently perform this operation would receive the error “ORA-08177: can’t serialize access for this transaction” in the SERIALIZABLE isolation level.

For example, using a serializable transaction (which is more common in the J2EE environment, where many tools automatically use this as the default mode of isolation, often unbeknownst to the developers), you would observe the following behavior:

$ sqlplus eoda/foo@PDB1

SQL> set transaction isolation level serializable;

Transaction set.

SQL> update id_table set id_value = id_value+1 where id_name = 'MY_KEY';

1 row updated.

SQL> select id_value

from id_table

where id_name = 'MY_KEY';

ID_VALUE

----------

Now (while leaving the prior session connected), we’ll go to another SQL*Plus session and perform the same operation, a concurrent request for a unique ID:

- - Session 2

SQL> set transaction isolation level serializable;

Transaction set.

SQL> update id_table

set id_value = id_value+1

where id_name = 'MY_KEY';

This will block at this point, as only one transaction at a time can update the row. This demonstrates the first possible outcome—we would block and wait for the row. But since we’re using SERIALIZABLE in Oracle, we’ll observe the following behavior as we commit the first session’s transaction:

-- Session 1

SQL> commit;

Commit complete.

The second session will immediately display the following error:

-- Session 2

SQL> update id_table

set id_value = id_value+1

where id_name = 'MY_KEY';

update id_table

ERROR at line 1:

ORA-08177: can't serialize access for this transaction

That error would occur regardless of the ordering of the preceding commit statement. All it takes is for your transaction to attempt to modify any record that was modified by some other session since your transaction began.

So, that database-independent piece of logic really isn’t database independent at all. It may not even perform reliably in a single database, depending on the isolation level! Sometimes we block and wait; sometimes we get an error message. To say the end user would be upset in either case (wait a long time or wait a long time to get an error) is putting it mildly.

This issue is compounded by the fact that our transaction is much larger than just outlined. The UPDATE and SELECT in the example are only two statements of potentially many other statements that make up the transaction. We have yet to insert the row into the table with this key we just generated, and do whatever other work it takes to complete this transaction. This serialization will be a huge limiting factor in scaling. Think of the ramifications if this technique was used on websites that processed orders, and this was how we generated order numbers. There would be no multiuser concurrency, so we would be forced to do everything sequentially.

The correct approach to this problem is to use the best code for each database. In Oracle, there are two approaches shown as follows (assuming the table that needs the generated primary key is T):

$ sqlplus eoda/foo@PDB1

SQL> create sequence s;

Sequence created.

SQL> create table t

( x number default s.nextval

constraint t_pk primary key,

other_data varchar2(20));

Table created.

Alternatively, you could use the IDENTITY attribute and skip the sequence generation:

SQL> create table t

( x number generated as identity

constraint t_pk primary key,

other_data varchar2(20));

Table created.

Note

An IDENTITY column must be of a numeric datatype.

But note that it did not really skip the sequence generation—the sequence was generated automatically by the database. If you didn’t have an automatically generated sequence, then you could use the following logic to have an autogenerated surrogate primary key:

SQL> create table t

( pk number primary key,

other_data varchar2(20));

Table created.

SQL> create sequence t_seq;

Sequence created.

SQL> create trigger t before insert on t

for each row

begin

:new.pk := t_seq.nextval;

end;

Trigger created.

This will have the effect of automatically—and transparently—assigning a unique key to each row inserted. A more performance-driven approach would be simply this:

Insert into t ( pk, .... ) values ( t_seq.NEXTVAL, .... );

That is, skip the overhead of the trigger altogether (this is definitely my preferred approach). You can achieve the same effect in the other databases using their types. The CREATE TABLE syntax will be different, but the net results will be the same. Here, we’ve gone out of our way to use each database’s feature to generate a nonblocking , highly concurrent unique key and have introduced no real changes to the application code—all of the logic is contained in this case in the DDL.

Layered Programming

Once you understand that each database will implement features in a different way, another example of defensive programming to allow for portability is to layer your access to the database when necessary. Let’s say you are programming using JDBC. If all you use is straight SQL SELECTs, INSERTs, UPDATEs, and DELETEs, you probably don’t need a layer of abstraction. You may very well be able to code the SQL directly in your application, as long as you limit the constructs you use to those supported by each of the databases you intend to support—and that you have verified work exactly the same (remember the NULL= NULL discussion!). This means you’ll have poorly performing SQL, though—and you’ll apparently have in your head more knowledge about more databases than most anyone I know of (after all, that’s the only way to know if something has a chance of working the same on all databases!). Another approach that is both more portable and offers better performance would be to use stored procedures to return resultsets. You will discover that every vendor’s database can return resultsets from stored procedures, but how they are returned is different. The actual stored procedure source code you must write is different for different databases.

Your two choices here are to either not use stored procedures to return resultsets or to implement different code for different databases. I would definitely follow the different-code-for-different-vendors method and use stored procedures heavily. This might seem as if it would increase the time it takes to implement on a different database. However, you’ll find it is actually easier to implement on multiple databases with this approach. Instead of having to find the perfect SQL that works on all databases (perhaps better on some than on others), you will implement the SQL that works best on that database. You can do this outside of the application itself, which gives you more flexibility in tuning the application. You can fix a poorly performing query in the database, and deploy that fix immediately, without having to patch the application. Additionally, you can take advantage of vendor extensions to SQL using this method freely. For example, Oracle supports a wide variety of SQL extensions, such as analytic functions, the SQL model clause, row pattern matching, and more. In Oracle, you are free to use these extensions to SQL since they are “outside” of the application (i.e., hidden in the database). In other databases, you would use whatever features they provide to achieve the same results, perhaps. You paid for these features so you might as well use them.

Another argument for this approach—developing specialized code for the database you will deploy on—is that finding a single developer (let alone a team of developers) who is savvy enough to understand the nuances of the differences between Oracle, SQL Server, and DB2 (let’s limit the discussion to three databases in this case) is virtually impossible. I’ve worked mostly with Oracle for the last 20 years (mostly, not exclusively). I learn something new about Oracle every single day I use it. To suggest that I could be expert in three databases simultaneously and understand what the differences between all three are and how those differences will affect the “generic code” layer I’d have to build is highly questionable. I doubt I would be able to do that accurately or efficiently. Also, consider that we are talking about individuals here; how many developers actually fully understand or use the database they currently have, let alone three of them? Searching for the unique individual who can develop bulletproof, scalable, database-independent routines is like searching for the holy grail. Building a team of developers that can do this is impossible. Finding an Oracle expert, a DB2 expert, and a SQL Server expert and telling them “We need a transaction to do X, Y, and Z”—that’s relatively easy. They are told, “Here are your inputs, these are the outputs we need, and this is what this business process entails,” and from this they can produce transactional APIs (stored procedures) that fit the bill. Each will be implemented in the manner best for that particular database, according to that database’s unique set of capabilities. These developers are free to use the full power (or lack thereof, as the case may be) of the underlying database platform.

These are the same techniques developers who implement multiplatform code use. Oracle, for example, uses this technique in the development of its own database. There is a large amount of code (though a small percentage of the database code overall) called OSD (Operating System Dependent) code that is implemented specifically for each platform. Using this layer of abstraction, Oracle is able to make use of many native OS features for performance and integration, without having to rewrite the majority of the database itself. The fact that Oracle can run as a multithreaded application on Windows and a multiprocess application on UNIX/Linux attests to this feature. The mechanisms for interprocess communication are abstracted to such a level that they can be reimplemented on an OS-by-OS basis, allowing for radically different implementations that perform as well as an application written directly, and specifically, for that platform.

In addition to SQL syntactic differences, implementation differences, and differences in performance of the same query in different databases outlined earlier, there are the issues of concurrency controls, isolation levels, query consistency, and so on. We cover these items in some detail in Chapter 7 of this book, and you’ll see how their differences may affect you. SQL92/SQL99 attempted to provide a straightforward definition of how a transaction should work and how isolation levels should be implemented, but in the end, you’ll get different results from different databases. It is all due to the implementation. In one database, an application will deadlock and block all over the place. In another database, the same exact application will run smoothly. In one database, the fact that you did block (physically serialize) was used to your advantage, but when you deploy on another database and it does not block, you get the wrong answer. Picking an application up and dropping it on another database takes a lot of hard work and effort, even if you followed the standard 100 percent.

Features and Functions

A natural extension of the argument that you shouldn’t necessarily strive for database independence is the idea that you should understand exactly what your specific database has to offer and make full use of it. This is not a section on all of the features that Oracle has to offer—that would be an extremely large book in itself. With over 10,000 pages of documentation provided by Oracle, covering every feature and function would be quite an undertaking. Rather, this section explores the benefits of gaining at least a cursory knowledge of what is provided.

As I’ve said before, I answer questions about Oracle on the Internet. I’d say that 80 percent of my answers are simply URLs to the documentation (for every question you see that I’ve published—many of which are pointers into the documentation—there are two more questions I choose not to publish, almost all of which are “read this” answers). People ask how they might go about writing some complex piece of functionality in the database (or outside of it), and I just point them to the place in the documentation that tells them how Oracle has already implemented the feature they need and how to use it. Replication comes up frequently. Here’s a typical example of what I am asked:

Is there a view that will show the literal SQL run? What I mean is that when I select from V$SQL, the SQL_TEXT looks like this: INSERT INTO TABLE1 (COL1,COL2) VALUES (:1,:2). I need to see the actual data submitted. e.g. INSERT INTO TABLE1 (COL1,COL2) VALUES ('FirstVal',12). What I am trying to get is a list of insert, update, or delete statements run against one schema and run those same SQL statements against a second schema in the same order of execution. I am hopeful to be able to write something like this:

Select SQL_FULLTEXT from V$SQL where FIRST_LOAD_TIME > SYSDATE-(1/24) AND ➥

(SQL_TEXT like ‘INSERT%’...) order by FIRST_LOAD_TIME

This record set would be sent via a web service to schema2, which would process the statements. Is this possible?

Here is someone trying to reinvent replication! They can’t get the literal SQL (and thank goodness for that!), but even if they could, this approach would never work. You can’t just take a concurrently executed set of SQL statements (what happens on a multi-CPU machine where two SQL statements are executed at exactly the same time?) and execute them serially (you may end up with different answers!). You’d need to replay them using the degree of concurrency you used on the originating system.

For example, if you and I both execute INSERT INTO A_TABLE SELECT * FROM A_TABLE; at about the same time, we’d end up with A_TABLE having three times as many rows as it did when we started. For example, if A_TABLE started with 100 rows and I did that insert, it would now have 200 rows. If you did the insert right after me (before I commit), you would not see my 200 rows and you’d insert 100 more rows into A_TABLE, which would end up with 300 rows. Now, if we change things so that a web service performs my insert (A_TABLE grows from 100 to 200 rows) and then your insert (A_TABLE grows from 200 to 400 rows)—you can see the problem here. Replication is not trivial, it is, in fact, quite difficult. Oracle (and other databases) has been doing replication for over three decades now; it takes a lot of effort to implement and maintain.

It's true you can write your own replication, and it might even be fun to do so, but at the end of the day, it’s not the smartest thing to do. The database does a lot of stuff. In general, it can do it better than we can ourselves. Replication, for example, is internalized in the kernel, written in C. It’s fast, it’s fairly easy, and it’s robust. It works across versions and across platforms. It is supported, so if you hit a problem, Oracle’s support team will be there to help. If you upgrade, replication will be supported there as well, probably with some new features. Now, consider if you were to develop your own. You’d have to provide support for all of the versions you wanted to support. Interoperability between old and new releases? That’d be your job. If it “broke,” you wouldn’t be calling support. At least, not until you could get a test case small enough to demonstrate your basic issue. When the new release of Oracle comes out, it would be up to you to migrate your replication code to that release.

Knowing What’s Out There

Not having a full understanding of what is available to you can come back to haunt you in the long run. I was working with some developers with years of experience developing database applications—on other databases. They built analysis software (trending, reporting, visualization software). It was to work on clinical data related to health care. They were not aware of SQL syntactical features like inline views, analytic functions, and scalar subqueries. Their major problem was they needed to analyze data from a single parent table to two child tables; an Entity Relation Diagram (ERD) might look like Figure 1-1.

../images/319655_4_En_1_Chapter/319655_4_En_1_Fig1_HTML.jpg — Figure 1-1
Simple ERD

The developers needed to be able to report on the parent record with aggregates from each of the child tables. The databases they worked with in the past did not support subquery factoring (WITH clause), nor did they support inline views—the ability to “query a query” instead of query a table. Not knowing these features existed, they wrote their own database of sorts in the middle tier. They would query the parent table and for each row returned run an aggregate query against each of the child tables. This resulted in their running thousands of tiny queries for each single query the end user wanted to run. Or, they would fetch the entire aggregated child tables into their middle tier into hash tables in memory—and do a hash join.

In short, they were reinventing the database, performing the functional equivalent of a nested loop join or a hash join, without the benefit of temporary tablespaces, sophisticated query optimizers, and the like. They were spending their time developing, designing, fine-tuning, and enhancing software that was trying to do the same thing the database they already bought did! Meanwhile, end users were asking for new features but not getting them, because the bulk of the development time was in this reporting “engine,” which really was a database engine in disguise.

I showed them that they could do things such as join two aggregations together in order to compare data that was stored at different levels of detail. Several approaches are possible, as illustrated in Listings 1-1 through 1-3.

select p.id, c1_sum1, c2_sum2

from p,

(select id, sum(q1) c1_sum1

from c1

group by id) c1,

(select id, sum(q2) c2_sum2

from c2

group by id) c2

where p.id = c1.id

and p.id = c2.id

Listing 1-1

Inline Views to Query from a Query

select p.id,

(select sum(q1) from c1 where c1.id = p.id) c1_sum1,

(select sum(q2) from c2 where c2.id = p.id) c2_sum2

from p

where p.name = '1234'

Listing 1-2

Scalar Subqueries That Run Another Query per Row

with c1_vw as

(select id, sum(q1) c1_sum1

from c1

group by id),

c2_vw as

(select id, sum(q2) c2_sum2

from c2

group by id),

c1_c2 as

(select c1.id, c1.c1_sum1, c2.c2_sum2

from c1_vw c1, c2_vw c2

where c1.id = c2.id )

select p.id, c1_sum1, c2_sum2

from p, c1_c2

where p.id = c1_c2.id

Listing 1-3

Subquery Factoring via the WITH Clause

In addition to what you see in these listings, we can also do great things using the analytic functions like LAG, LEAD, ROW_NUMBER, the ranking functions, and so much more. Rather than spending the rest of the day trying to figure out how to tune their middle tier database engine, we spent the day with the SQL Reference Guide projected on the screen (coupled with SQL*Plus to create ad hoc demonstrations of how things worked). The end goal was no longer tuning the middle tier; now it was turning off the middle tier as quickly as possible.

Here’s another example: I have seen people set up daemon processes in an Oracle database that read messages off of pipes (a database IPC mechanism implemented via DBMS_PIPE). These daemon processes execute the SQL contained within the pipe message and commit the work. They do this so they could execute auditing and error logging in a transaction that would not get rolled back if the bigger transaction did. Usually, if a trigger or something was used to audit an access to some data, but a statement failed later on, all of the work would be rolled back. So, by sending a message to another process, they could have a separate transaction do the work and commit it. The audit record would stay around, even if the parent transaction rolled back. In versions of Oracle before Oracle 8i, this was an appropriate (and pretty much the only) way to implement this functionality. When I told them of the database feature called autonomous transactions, they were quite upset with themselves. Autonomous transactions, implemented with a single line of code, do exactly what they were doing. On the bright side, this meant they could discard a lot of code and not have to maintain it. In addition, the system ran faster overall and was easier to understand. Still, they were upset at the amount of time they had wasted reinventing the wheel. In particular, the developer who wrote the daemon processes was quite upset at having just written a bunch of “shelfware.”

I see examples like these repeated time and time again—large complex solutions to problems that are already solved by the database itself. I’ve been guilty of this myself. I still remember the day when my Oracle sales consultant (I was the customer at the time) walked in and saw me surrounded by a ton of Oracle documentation. I looked up at him and just asked, “Is this all true?” I spent the next couple of days just digging and reading. I had fallen into the trap that I knew all about databases because I had worked with SQL Server, SQL/DS, DB2, Ingress, Sybase, Informix, SQLBase, Oracle, and others. Rather than take the time to see what each had to offer, I would just apply what I knew from the others to whatever I was working on. (Moving to Sybase/SQL Server was the biggest shock to me—it worked nothing like the others at all.) Upon actually discovering what Oracle could do (and the others, to be fair), I started taking advantage of it and was able to move faster, with less code.

Take the time to learn what is available. You miss so much by not doing that. I learn something new about Oracle pretty much every single day. It requires some keeping up with; I still read the documentation.

Solving Problems Simply

There are always two ways to solve everything: the easy way and the hard way. Time and time again, I see people choosing the hard way. It is not always done consciously. More often, it is done out of ignorance. They never expected the database to be able to do “that.” I, on the other hand, expect the database to be capable of anything and only do it the hard way (by writing it myself) when I discover it can’t do something.

For example, I am frequently asked, “How can I make sure the end user has only one session in the database?” (There are hundreds of other examples I could have used here.) This must be a requirement of many applications, but none I’ve ever worked on—I’ve not found a good reason for limiting people in this way. However, people want to do it, and when they do, they usually do it the hard way. For example, they will have a batch job run by the operating system that will look at the V$SESSION table and arbitrarily kill sessions of users who have more than one session. Alternatively, they will create their own tables and have the application insert a row when a user logs in and remove the row when they log out. This implementation invariably leads to lots of calls to the help desk because when the application crashes, the row never gets removed. I’ve seen lots of other “creative” ways to do this, but none is as easy as this:

$ sqlplus eoda/foo@PDB1

SQL> create profile one_session limit sessions_per_user 1;

Profile created.

SQL> alter user scott profile one_session;

User altered.

Now we’ll try to connect to SCOTT twice; the second attempt should fail:

SQL> connect scott/tiger@PDB1

Connected.

SQL> host sqlplus scott/tiger@PDB1

ERROR:

ORA-02391: exceeded simultaneous SESSIONS_PER_USER limit

That’s it—now any user with the ONE_SESSION profile can log on only once. When I bring up this solution, I can usually hear the smacking of a hand on the forehead followed by the statement “I never knew it could do that.” Taking the time to familiarize yourself with what the tools you have to work with are capable of doing can save you lots of time and energy in your development efforts.

The same “keep it simple” argument applies at the broader architecture level. I would urge people to think carefully before adopting very complex implementations. The more moving parts you have in your system, the more things you have that can go wrong, and tracking down exactly where that error is occurring in an overly complex architecture is not easy. It may be really “cool” to implement using umpteen tiers, but it’s not the right choice if a simple stored procedure can do it better, faster, and with less resources.

I’ve seen projects where application development has been going on for months, with no end in sight. The developers are using the latest and greatest technologies and languages, but development is not going very fast. It wasn’t that big of an application—and perhaps that was the problem. If you are building a doghouse (a small woodworking job), you wouldn’t bring in the heavy machinery. You’d use a few small power tools, but you wouldn’t have any use for the “big stuff.” On the other hand, if you were building an apartment complex, you’d have a cast of hundreds working on the project; you’d have the big machines—you’d use totally different tools to approach this problem. The same is true of application development. There is not a single “perfect architecture.” There is not a single “perfect language.” There is not one single “perfect approach.”

For example, to build my website, I used APEX (Application Express). My website is a smallish application; there was a single developer (or two) working on it. It has maybe 20 screens. PL/SQL and APEX were the correct choice for this implementation—it did not need a cast of dozens, coding in Java, making EJBs, using Hibernate, and so on. It was a simple problem, solved simply. There are a few complex, large-scale, huge applications (we buy most of those today: our HR systems, our ERP systems, and so on), but there are thousands of small applications. We need to use the proper approach and tools for the job.

I will always go with the simplest architecture that solves the problem completely over a complex one any day. The payback can be enormous. Every technology has its place. Not every problem is a nail, so we can use more than a hammer in our toolbox.

Openness

I frequently see people doing things the hard way for another reason, and again it relates to the idea that we should strive for openness and database independence at all costs. The developers wish to avoid using closed, proprietary database features—even those as simple as stored procedures or sequences—because doing so will lock them into a database system. Well, let me put forth the idea that the instant you develop a read/write application, you are already somewhat locked in. You will find subtle (and sometimes not so subtle) differences between the databases as soon as you start running queries and modifications. For example, in one database you might find that your SELECT COUNT(*) FROM T deadlocks with a simple update of two rows. In Oracle, you’ll find that the SELECT COUNT(*) never blocks on a writer of the data being counted. You’ve seen the case where a business rule appears to get enforced on one database, due to side effects of the database’s locking model, and does not get enforced in another database. You’ll find that, given the same exact transaction mix, reports come out with different answers in different databases, all because of fundamental implementation differences. You will find that it is a very rare application that can simply be picked up and moved from one database to another. Differences in the way the SQL is interpreted (for instance, the NULL=NULL example) and processed will always be there.

On one project, the developers were building a web-based product using Visual Basic, ActiveX controls, the IIS server, and the Oracle database. I was told that the development folks had expressed concern that since the business logic had been written in PL/SQL, the product had become database dependent, and was asked, “How can we correct this?”

I was a little taken aback by this question. In looking at the list of chosen technologies, I could not figure out how being database dependent was a “bad” thing:

The developers had chosen a language that locked them into a single operating system supplied by a single vendor (they could have opted for Java).
They had chosen a component technology that locked them into a single operating system and vendor (they could have opted for J2EE).
They had chosen a web server that locked them into a single vendor and single platform (why not something more open?).

Every other technology choice they had made locked them into a very specific configuration—in fact, the only technology that offered them any choice in terms of operating systems was the database.

Regardless of this (they must have had good reasons to choose the technologies they did), we still have a group of developers making a conscious decision to not use the functionality of a critical component in their architecture, and doing so in the name of openness. It is my belief that you pick your technologies carefully and then you exploit them to the fullest extent possible. You paid a lot for these technologies—isn’t it in your best interest to exploit them fully? I had to assume they were looking forward to using the full potential of the other technologies, so why was the database an exception? This was an even harder question to answer in light of the fact that it was crucial to their success.

We can put a slightly different spin on this argument if we consider it from the perspective of openness. You put all of your data into the database. The database is a very open tool. It supports data access via a large variety of open systems protocols and access mechanisms. Sounds great so far, the most open thing in the world.

Then, you put all of your application logic and, more importantly, your security outside of the database. Perhaps in your beans that access the data. Perhaps in the JSPs that access the data. Perhaps in your Visual Basic code. Perhaps in your Hibernate-generated code. The end result is that you have just closed off your database—you have made it “non-open.” No longer can people hook in existing technologies to make use of this data; they must use your access methods (or bypass security altogether). This sounds all well and good today, but what you must remember is that the whiz-bang technology of today is yesterday’s concept and tomorrow’s old, tired technology. What has persevered for over 30 years in the relational world (and probably most of the object implementations as well) is the database itself. The front ends to the data change almost yearly, and as they do, the applications that have all of the security built inside themselves, not in the database, become obstacles, roadblocks to future progress.

The Oracle database provides a feature called fine-grained access control (FGAC) . In a nutshell, this technology allows developers to embed procedures in the database that can modify queries as they are submitted to the database. This query modification is used to restrict the rows the client will receive or modify. The procedure can look at who is running the query, when they are running the query, what application is requesting the data, what terminal they are running the query from, and so on, and can constrain access to the data as appropriate. With FGAC, we can enforce security such that, for example

Any query executed outside of normal business hours by a certain class of users returns zero records.
Any data can be returned to a terminal in a secure facility, but only nonsensitive information can be returned to a remote client terminal.

Basically, FGAC allows us to locate access control in the database, right next to the data. It no longer matters if the user comes at the data from a bean, a JSP, a Visual Basic application using ODBC, or SQL*Plus—the same security protocols will be enforced. You are well situated for the next technology that comes along.

Now I ask you, which implementation is more “open?” The one that makes all access to the data possible only through calls to the Visual Basic code and ActiveX controls (replace Visual Basic with Java and ActiveX with EJB if you like—I’m not picking on a particular technology but an implementation here) or the solution that allows access from anything that can talk to the database, over protocols as diverse as SSL, HTTP, and Oracle Net (and others) or using APIs such as ODBC, JDBC, OCI, and so on? I have yet to see an ad hoc reporting tool that will “query” your Visual Basic code. I know of dozens that can do SQL, though.

The decision to strive for database independence and total openness is one that people are absolutely free to take, and many try, but I believe it is the wrong decision. No matter what database you are using, you should exploit it fully, squeezing every last bit of functionality you can out of that product. You’ll find yourself doing that in the tuning phase (which again always seems to happen right after deployment) anyway. It is amazing how quickly the database independence requirement can be dropped when you can make the application run five times faster just by exploiting the database software’s capabilities.

How Do I Make It Run Faster?

The question in the heading is one I get asked all the time. Everyone is looking for the fast = true switch, assuming “database tuning” means that you tune the database. In fact, it is my experience that more than 80 percent (frequently 100 percent) of all performance gains are to be realized at the application design and implementation level—not the database level. You can’t tune a database until you have tuned the applications that run on the database.

As time goes on, there are some switches we can throw at the database level to help lessen the impact of egregious programming blunders. For example, Oracle has a parameter, CURSOR_SHARING=FORCE. This feature implements an auto binder, if you will. It will silently take a query written as SELECT * FROM EMP WHERE EMPNO = 1234 and rewrite it for us as SELECT * FROM EMP WHERE EMPNO = :x. This can dramatically decrease the number of hard parses and decrease the library latch waits we discussed in the Architecture sections—but (there is always a but) it can have some side effects. A common side effect with cursor sharing is something like this:

SQL> select /* TAG */ substr( username, 1, 1 )

from all_users au1

where rownum = 1;

SQL> alter session set cursor_sharing=force;

Session altered.

SQL> select /* TAG */ substr( username, 1, 1 )

from all_users au2

where rownum = 1;

SUBSTR(USERNAME,1,1)

---------------------------------------------------------------------------

What happened there? Why is the column reported by SQL*Plus suddenly so large for the second query, which is arguably the same query? If we look at what the cursor sharing setting did for us, it (and something else) will become obvious:

SQL> select sql_text from v$sql where sql_text like 'select /* TAG */ %';

SQL_TEXT

---------------------------------------------------------------------------

select /* TAG */ substr( username, 1, 1 )

from all_users au1

where rownum = 1

select /* TAG */ substr( username, :"SYS_B_0", :"SYS_B_1" )

from all_users au2

where rownum = :"SYS_B_2"

The cursor sharing removed information from the query. It found every literal, including the substr constants we were using. It removed them from the query and replaced them with bind variables. The SQL engine no longer knows that the column is a substr of length 1—it is of indeterminate length. Also, you can see that where rownum = 1 is now bound as well. This seems like a good idea; however, the optimizer has just had some important information removed. It no longer knows that “this query will retrieve a single row”; it now believes “this query will return the first N rows and N could be any number at all.” This can have a negative impact on your generated query plans.

Additionally, I have shown that while CURSOR_SHARING=FORCE runs much faster than parsing and optimizing lots of unique queries (refer to the preceding section on bind variables), I have also found it to be slower than using queries where the developer did the binding. This arises not from any inefficiency in the cursor sharing code, but rather in inefficiencies in the program itself. In many cases, an application that does not use bind variables is not efficiently parsing and reusing cursors either. Since the application believes each query is unique (it built them as unique statements), it will never use a cursor more than once. The fact is that if the programmer had used bind variables in the first place, they could have parsed a query once and reused it many times. It is this overhead of parsing that decreases the overall potential performance.

Note

It is important to also point out that CURSOR_SHARING=FORCE will not fix SQL injection bugs. The binding comes after the query was rewritten by your end user; the SQL injection has already happened. CURSOR_SHARING=FORCE makes you no more secure than you were before. Only by using bind variables themselves can a developer implement a SQL injection–proof application.

Basically, it is important to keep in mind that simply turning on CURSOR_SHARING = FORCE will not necessarily fix your problems. It may very well introduce new ones. CURSOR_SHARING is, in some cases, a very useful tool, but it is not a silver bullet. A well-developed application would never need it. In the long term, using bind variables where appropriate, and constants when needed, is the correct approach.

Note

There are no silver bullets, none. If there were, they would be the default behavior and you would never hear about them.

Even if there are some switches that can be thrown at the database level, and they are truly few and far between, problems relating to concurrency issues and poorly executing queries (due to poorly written queries or poorly structured data) can’t be fixed with a switch. These situations require rewrites (and frequently a re-architecture). Moving datafiles around, adjusting parameters, and other database-level switches frequently have a minor impact on the overall performance of an application. Definitely not anywhere near the two, three, … n times increase in performance you need to achieve to make the application acceptable. How many times has your application been ten percent too slow? Ten percent too slow, no one complains about. Five times too slow, people get upset. I repeat: you will not get a five times increase in performance by moving datafiles around. You will only achieve large increments in performance by fixing the application, perhaps by making it do significantly less I/O.

Note

This is just to note how things change over time. I’ve often written that you will not get a five times increase in performance by moving datafiles around. With the advent of hardware solutions such as Oracle Exadata, you can, in fact, get a five times, ten times, fifty times, or more decrease in response time (the time it takes to return data) by simply moving datafiles around. But that is more of a “we completely changed our hardware architecture” story than a “we reorganized some of our storage.” Also, getting an application running only five or ten times faster on Exadata would be disappointing to me—I’d want it to be fifty times or more “faster”—and would require a rethinking of how the application is implemented.

Performance is something you have to design for, build to, and test for continuously throughout the development phase. It should never be something to be considered after the fact. I am amazed at how often people wait until the application has been shipped to the customer, put in place, and is actually running before they even start to tune it. I’ve seen implementations where applications are shipped with nothing more than primary keys—no other indexes whatsoever. The queries have never been tuned or stress-tested. The application has never been tried out with more than a handful of users. Tuning is considered to be part of the installation of the product. To me, that is an unacceptable approach. Your end users should be presented with a responsive, fully tuned system from day one. There will be enough “product issues” to deal with without having poor performance be the first thing users experience. Users expect a few bugs from a new application, but at least don’t make the users wait a painfully long time for those bugs to appear on screen.

The DBA-Developer Relationship

It’s certainly true that the most successful information systems are based on a symbiotic relationship between the DBA and the application developer. In this section, I just want to give a developer’s perspective on the division of work between the developer and the DBA (assuming that every serious development effort has a DBA team).

As a developer, you should not necessarily have to know how to install and configure the software. That should be the role of the DBA and perhaps the system administrator (SA). Setting up Oracle Net, getting the listener going, configuring the shared server, enabling connection pooling, installing the database, creating the database, and so on—these are functions I place in the hands of the DBA/SA.

In general, a developer should not have to know how to tune the operating system. I myself generally leave this task to the SAs for the system. As a software developer for database applications, you will need to be competent in the use of your operating system of choice, but you shouldn’t expect to have to tune it.

The single largest DBA responsibility is database restore and recovery. Note I did not say “backup.” I said “restore and recovery,” and I would say that this is the sole responsibility of the DBA. Understanding how rollback and redo work—yes, that is something a developer has to know. Knowing how to perform a tablespace point-in-time recovery is something a developer can skip over. Knowing that you can do it might come in handy, but actually having to do it—no.

Tuning at the database instance level and figuring out what the optimum PGA_AGGREGATE_TARGET should be—that’s typically the job of the DBAs (and the database is quite willing and able to assist them in determining the correct figure). There are exceptional cases where a developer might need to change some setting for a session, but at the database level, the DBA is responsible for that. A typical database supports more than just a single developer’s application. Only the DBA who supports all of the applications can make the right decision.

Allocating space and managing the files is the job of the DBA. Developers contribute their estimations for space (how much they feel they will need), but the DBA/SA takes care of the rest.

Basically, developers do not need to know how to run the database. They need to know how to run in the database. The developer and the DBA work together on different pieces of the same puzzle. The DBA will visit you, the developer, when your queries are consuming too many resources, and you will visit the DBA when you can’t figure out how to make the system go any faster (that’s when instance tuning can be done, when the application is fully tuned).

This all varies by environment, but I would like to think that there is a division. A good developer is usually a very bad DBA, and vice versa. They are two different skill sets, two different mindsets, and two different personalities in my opinion.

Summary

In this chapter, we have taken a somewhat anecdotal look at why you need to know the database. The examples I presented are not isolated—they happen every day, day in and day out. I observe a continuous cycle of such issues happening, over and over.

Let’s quickly recap the key points. If you are developing with Oracle:

You need to understand the Oracle architecture. You don’t have to know it so well that you are able to rewrite the server, but you should know it well enough that you are aware of the implications of using a particular feature.
You need to understand locking and concurrency control and that every database implements these features differently. If you don’t, your database will give “wrong” answers, and you will have large contention issues, leading to poor performance.
Do not treat the database as a black box—something you need not understand. The database is the most critical piece of most applications. Trying to ignore it would be fatal.
Do not reinvent the wheel. I’ve seen more than one development team get into trouble, not only technically but on a personal level, due to a lack of awareness as to what Oracle provides for free. This happens when someone points out that the feature they just spent the last couple of months implementing was actually a core feature of the database all along. Read the Oracle Database Concepts guide—the New Features guide—the documentation that comes free with the software!
Solve problems as simply as possible, using as much of Oracle’s built-in functionality as possible. You paid a lot for it.
Software projects come and go, as do programming languages and frameworks. We developers are expected to have systems up and running in weeks, maybe months, and then move on to the next problem. If you reinvent the wheel over and over, you will never come close to keeping up with the frantic pace of development. Just as you would never build your own hash table class in Java—since it comes with one—you should use the database functionality you have at your disposal. The first step to being able to do that, of course, is to understand what it is you have at your disposal. Read on.

And building on that last point, software projects and programming languages may come and go—but the data is here forever. We build applications that use data, and that data will be used by many applications over time. It is not about the application—it is about the data. Use techniques and implementations that permit the data to be used and reused. If you use the database as a bit bucket, making it so that all access to any data must come through your application, you have missed the point. You can’t “ad hoc query” your application. You can’t build a new application on top of your old application. But if you use the database, you’ll find adding new applications, reports, or whatever to be much easier over time.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for 1. Developing Successful Oracle Applications

Create new playlist

Sign In

Sign Up

1. Developing Successful Oracle Applications

My Approach

The Black Box Approach

How (and How Not) to Develop Database Applications

Understanding Oracle Architecture

Use a Single Connection in Oracle

Use Bind Variables

Understanding Concurrency Control

Implementing Locking

Preventing Lost Updates

Multiversioning

Flashback

Read Consistency and Nonblocking Reads

Database Independence?

The Impact of Standards

Make Sure You Can Adapt

Layered Programming

Features and Functions

Knowing What’s Out There

Solving Problems Simply

Openness

How Do I Make It Run Faster?

The DBA-Developer Relationship

Summary

Table of Contents for
1. Developing Successful Oracle Applications