This Apress imprint is published by the registered company APress Media, LLC part of Springer Nature.
The registered company address is: 1 New York Plaza, New York, NY 10004, U.S.A.
The inspiration for the material contained in this book comes from my experiences developing Oracle software and from working with fellow Oracle developers and DBAs to help them build reliable and robust applications based on the Oracle database. The book is basically a reflection of what I do every day and of the issues I see people encountering each and every day.
I covered what I felt was most relevant, namely, the Oracle database and its architecture. I could have written a similarly titled book explaining how to develop an application using a specific language and architecture—for example, one using JavaServer Pages that speaks to Enterprise JavaBeans, which in turn uses JDBC to communicate with Oracle. However, at the end of the day, you really do need to understand the topics covered in this book in order to build such an application successfully. This book deals with what I believe needs to be universally known to develop successfully with Oracle, whether you are a Visual Basic programmer using ODBC, a Java programmer using EJBs and JDBC, or a Perl programmer using DBI Perl. This book does not promote any specific application architecture; it does not compare three-tier to client/server. Rather, it covers what the database can do and what you must understand about the way it works. Since the database is at the heart of any application architecture, the book should have a broad audience.
As the title suggests, Expert Oracle Database Architecture concentrates on the database architecture and how the database itself works. I cover the Oracle database architecture in depth: the files, memory structures, and processes that comprise an Oracle database and instance. I then move on to discuss important database topics such as locking, concurrency controls, how transactions work, and redo and undo, and why it is important for you to know about these things. Lastly, I examine the physical structures in the database such as tables, indexes, and datatypes, covering techniques for making optimal use of them.
One of the problems with having plenty of development options is that it’s sometimes hard to figure out which one might be the best choice for your particular needs. Everyone wants as much flexibility as possible (as many choices as they can possibly have), but they also want things to be very cut and dried—in other words, easy. Oracle presents developers with almost unlimited choice. No one ever says, “You can’t do that in Oracle.” Rather, they say, “How many different ways would you like to do that in Oracle?” I hope that this book will help you make the correct choice.
This book is aimed at those people who appreciate the choice but would also like some guidelines and practical implementation details on Oracle features and functions. For example, Oracle has a really neat feature called parallel execution. The Oracle documentation tells you how to use this feature and what it does. Oracle documentation does not, however, tell you when you should use this feature and, perhaps even more important, when you should not use this feature. It doesn’t always tell you the implementation details of this feature, and if you’re not aware of them, this can come back to haunt you (I’m not referring to bugs, but the way the feature is supposed to work and what it was really designed to do).
In this book, I strove to not only describe how things work but also explain when and why you would consider using a particular feature or implementation. I feel it is important to understand not only the “how” behind things but also the “when” and “why” as well as the “when not” and “why not!”
The target audience for this book is anyone who develops applications with Oracle as the database back end. It is a book for professional Oracle developers who need to know how to get things done in the database. The practical nature of the book means that many sections should also be very interesting to the DBA. Most of the examples in the book use SQL*Plus to demonstrate the key features, so you won’t find out how to develop a really cool GUI—but you will find out how the Oracle database works, what its key features can do, and when they should (and should not) be used.
This book is for anyone who wants to get more out of Oracle with less work. It is for anyone who wants to see new ways to use existing features. It is for anyone who wants to see how these features can be applied in the real world (not just examples of how to use the feature, but why the feature is relevant in the first place). Another category of people who would find this book of interest is technical managers in charge of the developers who work on Oracle projects. In some respects, it is just as important that they understand why knowing the database is crucial to success. This book can provide ammunition for managers who would like to get their personnel trained in the correct technologies or ensure that personnel already know what they need to know.
Knowledge of SQL: You don’t have to be the best SQL coder ever, but a good working knowledge will help.
An understanding of PL/SQL: This isn’t a prerequisite, but it will help you to absorb the examples. This book will not, for example, teach you how to program a FOR loop or declare a record type; the Oracle documentation and numerous books cover this well. However, that’s not to say that you won’t learn a lot about PL/SQL by reading this book. You will. You’ll become very intimate with many features of PL/SQL, you’ll see new ways to do things, and you’ll become aware of packages/features that perhaps you didn’t know existed.
Exposure to some third-generation language (3GL), such as C or Java: I believe that anyone who can read and write code in a 3GL will be able to successfully read and understand the examples in this book.
Familiarity with the Oracle Database Concepts manual, which covers topics such as the following:
The structures in the database and how data is organized and stored
Distributed processing
Oracle’s memory architecture
Oracle’s process architecture
Schema objects you will be using (tables, indexes, clusters, and so on)
Built-in datatypes and user-defined datatypes
SQL stored procedures
How transactions work
The optimizer
Data integrity
Concurrency control
I will come back to these topics myself time and time again. These are the fundamentals. Without knowledge of them, you will create Oracle applications that are prone to failure. I encourage you to read through the manual and get an understanding of some of these topics.
This book has 15 chapters, and each is like a “minibook”—a virtually stand-alone component. Occasionally, I refer to examples or features in other chapters, but you could pretty much pick a chapter out of the book and read it on its own. For example, you don’t have to read Chapter 10 on database tables to understand or make use of Chapter 14 on parallelism.
An introduction to the feature or capability.
Why you might want to use the feature or capability (or not). I outline when you would consider using this feature and when you would not want to use it.
How to use this feature. The information here isn’t just a copy of the material in the SQL reference; rather, it’s presented in a step-by-step manner: here is what you need, here is what you have to do, and these are the switches you need to go through to get started. Topics covered in this section will include:
How to implement the feature
Examples, examples, examples
How to debug this feature
Caveats of using this feature
How to handle errors (proactively)
A summary to bring it all together
There will be lots of examples and lots of code, all of which is available for download from the GitHub site. The following sections present a detailed breakdown of the content of each chapter.
The best way to digest the material in this book is to thoroughly work through and understand the hands-on examples. As you work through the examples in this book, you may decide that you prefer to type in all the code by hand. Many readers choose to do this because it is a good way to get familiar with the coding techniques that are being used. Having said that, there are many complex examples in this book. Therefore, you may opt for downloading the source code and running examples without having to manually type them in.
Go to the book’s product page on Apress.com, located at www.apress.com/9781484274989.
There will be a button marked Download Source Code. Click this to be taken to the book’s page on GitHub.
Once on GitHub, download the code as a zip using the green button, or, if you have a GitHub account, you can clone the source code directly to your machine using Git.
That’s it!
Source code can be continuously updated after a book has been published. That means that if there are any corrections, you will always get the latest version. If for any reason you want to get hold of the original source code, exactly as it is in your copy of the book, you can go to https://github.com/Apress/ [repository-name-here]/releases and download release v1.0.
If you like to type in the code, you can use the source code files to check the results you should be getting—they should be your first stop if you think you might have typed an error. If you don’t like typing, then downloading the source code from the GitHub site is a must! Either way, the code files will help you with updates and debugging.
If you have any problems accessing the source code for an Apress book, email [email protected].
Accessing an Oracle database
How to set up the EODA account used for many of the examples in this book
How to set up the SCOTT/TIGER demonstration schema properly
Installing Statspack
Installing and running runstats, creating the BIG_TABLE, and other custom utilities used throughout the book
As described previously in the “Where Can I Find the Book’s Source Code?” section, all of the scripts used in this book are available for download from the GitHub site. There is a chNN folder that contains the scripts for each chapter (where NN is the number of the chapter). The ch00 folder contains the scripts listed here in the “Setting Up Your Environment” section.
Most of the examples in this book are designed to run 100 percent in the SQL*Plus environment. If you already have access to an Oracle database, then you can skip ahead to creating the EODA and SCOTT schemas in your database. You’ll also need to set up Statspack and the custom scripts. These components are used extensively throughout the book.
Installing Oracle VM VirtualBox and a pre-built database VM
Installing Oracle VM VirtualBox, cloning a Git repository, and running Vagrant to build your environment
I’ll briefly describe both of the prior techniques in the following sections.
One of the quickest and easiest free ways to gain access to a fully functional Oracle database is to download and install Oracle VM VirtualBox and use it with a pre-built database VM. You can have a working database within a few minutes of downloading and installing the required software.
First, you must download and install VirtualBox. To do this, go to this link and download and install the software:
www.virtualbox.org/wiki/Downloads
After you have downloaded and installed VirtualBox, then download a pre-built database VM and follow the instructions for importing the appliance VM into VirtualBox. Use this link to download the pre-built VM:
www.oracle.com/downloads/developer-vm/community-downloads.html
Using Oracle VM VirtualBox with a pre-built VM is by far the easiest way to gain access to a fully functional Oracle database. If you are a bit more technically savvy, then I would suggest using a Vagrant box, as described in the next section, to build an environment where you can access an Oracle database on your PC.
This approach requires that you download and install Oracle VM VirtualBox, Git, and Vagrant. You also need to download the Oracle installation media. After you’ve installed those, use Git to clone a Vagrant repository, and then use a Vagrant box to build a virtual machine on your laptop. This approach might seem a little daunting at first, so I would suggest you look up Tim Hall’s YouTube video titled “Vagrant: Oracle Database Build.” That video walks you through the entire process.
Described next are the high-level steps for building an Oracle environment. First, navigate to Oracle’s database download site and download the Oracle installation software:
www.oracle.com/database/technologies/oracle-database-software-downloads.html
Now navigate to this link and download and install Oracle VM VirtualBox on your laptop:
www.virtualbox.org/wiki/Downloads
Next, navigate to the Git download page and download and install Git on your laptop.
Next, navigate to the Vagrant download page and download and install Vagrant on your laptop:
www.vagrantup.com/docs/installation
Using a Vagrant box is an extremely powerful way to create your own VMs that contain Oracle databases. You can even easily build a RAC database environment from scratch using these techniques.
Don’t worry if you don’t have access to a container/pluggable database to run the examples. If you’re not using a multitenant database, then all of your connections in the code examples will be to the database itself (since there is no concept of a pluggable database in the older single-tenant Oracle database architecture).
The Oracle database architecture types such as single tenant and multitenant are explained in Chapter 2.
SYS: This is an Oracle-created user that has all database privileges. I’ll use this to start/stop the database, add tablespaces, modify initialization parameters, and so on.
SYSTEM: This is an Oracle-created user that has elevated database privileges. I’ll use this to create users, grant privileges, and so on.
EODA: This is a user that I created that has a variety of special database privileges granted to it. These privileges are required to demonstrate various concepts.
SCOTT: This is a user that I created using scripts provided by Oracle. Historically, this user has been used to explain simple database concepts and is the owner of the EMP and DEPT tables.
You can set up whatever user (schema) you want to run the examples in this book. I picked the username EODA simply because it’s an acronym for the title of the book.
The SCOTT/TIGER schema will sometimes already exist in your database. This schema is often used to show basic examples especially when you require a couple of tables with primary and foreign key relationships (the EMP and DEPT tables). There is nothing magic about using the SCOTT account. You could install the EMP/DEPT tables directly into your own database account if you wish.
Having said that, many of my examples in this book draw on the tables in the SCOTT schema. If you would like to be able to work along with them, you will need these tables. If you are working on a shared database, it would be advisable to install your own copy of these tables in some account other than SCOTT to avoid side effects caused by other users mucking about with the same data.
Statspack is designed to be installed when connected to the root container database as SYS (CONNECT / AS SYSDBA) or as a user granted the SYSDBA privilege. In many installations, installing Statspack will be a task that you must ask the DBA or administrators to perform.
The password you would like to use for the PERFSTAT schema that will be created
The default tablespace you would like to use for PERFSTAT
The temporary tablespace you would like to use for PERFSTAT
The script will prompt you for the needed information as it executes. In the event you make a typo or inadvertently cancel the installation, you should use spdrop.sql found in $ORACLE_HOME/rdbms/admin to remove the user and installed views prior to attempting another install of Statspack. The Statspack installation will create a file called spcpkg.lis. You should review this file for any possible errors that might have occurred. The user, views, and PL/SQL code should install cleanly, however, as long as you supplied valid tablespace names (and didn’t already have a user PERFSTAT).
Statspack is documented in the following text file: $ORACLE_HOME/rdbms/admin/spdoc.txt.
In this section, I will describe the requirements (if any) needed by various scripts used throughout this book. As well, we will investigate the code behind the scripts.
Wall clock or elapsed time: This is useful to know, but not the most important piece of information.
System statistics: This shows, side by side, how many times each approach did something (such as a parse call) and the difference between the two.
Latching: This is the key output of this report.
As we’ll see in this book, latches are a type of lightweight lock. Locks are serialization devices. Serialization devices inhibit concurrency. Applications that inhibit concurrency are less scalable, can support fewer users, and require more resources. Our goal is always to build applications that have the potential to scale, ones that can service 1 user as well as 1,000 or 10,000. The less latching we incur in our approaches, the better off we will be. I might choose an approach that takes longer to run on the wall clock but that uses ten percent of the latches. I know that the approach that uses fewer latches will scale substantially better than the approach that uses more latches.
Runstats is best used in isolation, that is, on a single-user database. We will be measuring statistics and latching (locking) activity that result from our approaches. We do not want other sessions to contribute to the system’s load or latching while this is going on. A small test database is perfect for these sorts of tests. I frequently use my desktop PC or laptop, for example.
I believe all developers should have a test bed database they control to try ideas on, without needing to ask a DBA to do something all of the time. Developers definitely should have a database on their desktop, given that the licensing for the personal developer version is simply “use it to develop and test with, do not deploy, and you can just have it.” This way, there is nothing to lose! Also, I’ve taken some informal polls at conferences and seminars. Virtually every DBA out there started as a developer! The experience and training developers could get by having their own database—being able to see how it really works—pays dividends in the long run.
The actual object names you need to be granted access to will be V_$STATNAME, V_$MYSTAT, and so on—that is, the object name to use in the grant will start with V_$ not V$. The V$ name is a synonym that points to the underlying view with a name that starts with V_$. So, V$STATNAME is a synonym that points to V_$STATNAME—a view. You need to be granted access to the view.
You can either have SELECT on V$STATNAME, V$MYSTAT, V$TIMER, and V$LATCH granted directly to you (so you can create the view yourself), or you can have someone who already has SELECT on those objects create the view for you and grant SELECT privileges on the view to you.
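The V$ to V_$ naming rule described above is easy to get wrong when typing grants by hand. The following is a minimal sketch, in Python, of a helper that builds the GRANT statements as text for a privileged user to run. The function names are hypothetical, the grantee defaults to the EODA account used in this book, and the sys. owner prefix reflects that these views belong to SYS.

```python
# Hypothetical helper (not part of the book's scripts): turn V$ synonym
# names into GRANT statements on the underlying V_$ views.

def underlying_view(synonym_name: str) -> str:
    """Map a V$ synonym name (e.g. V$STATNAME) to its V_$ view name."""
    name = synonym_name.upper()
    if not name.startswith("V$"):
        raise ValueError(f"not a V$ name: {synonym_name}")
    # Keep everything after the "V$" prefix and put "V_$" in front of it.
    return "V_$" + name[2:]


def grant_statements(synonym_names, grantee="EODA"):
    """Build one GRANT SELECT statement per V$ name (text only, nothing is executed)."""
    return [
        f"grant select on sys.{underlying_view(name)} to {grantee};"
        for name in synonym_names
    ]


if __name__ == "__main__":
    for stmt in grant_statements(["V$STATNAME", "V$MYSTAT", "V$TIMER", "V$LATCH"]):
        print(stmt)
```

The sketch only produces statement text; someone with the necessary privileges still has to run the statements in the database.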
RS_START (Runstats Start) to be called at the beginning of a Runstats test
RS_MIDDLE to be called in the middle, as you might have guessed
RS_STOP to finish off and print the report
The parameter, p_difference_threshold, is used to control the amount of data printed at the end. Runstats collects statistics and latching information for each run and then prints a report of how much of a resource each test (each approach) used and the difference between them. You can use this input parameter to see only the statistics and latches that had a difference greater than this number. By default, this is zero, and you see all of the outputs.
Next is the RS_MIDDLE routine. This procedure simply records the elapsed time for the first run of our test in G_RUN1. Then it inserts the current set of statistics and latches. If we were to subtract the values we saved previously in RS_START from these values, we could discover how many latches the first method used, how many cursors (a statistic) it used, and so on.
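The snapshot-and-subtract idea behind RS_START, RS_MIDDLE, and RS_STOP can be sketched outside the database. What follows is a simplified Python imitation, not the package itself: the function names, counter names, and numbers are made up for illustration, and the real package reads its values from the V$ views rather than from dictionaries.

```python
# Simplified, in-memory imitation of the Runstats approach. Each snapshot is
# a dict of counter name -> cumulative value, taken at start, middle, and stop.

def snapshot_diff(before, after):
    """Return how much each counter grew between two snapshots."""
    return {name: after[name] - before.get(name, 0) for name in after}


def report(start, middle, stop, difference_threshold=0):
    """Compare run 1 (start to middle) with run 2 (middle to stop).

    Returns (name, run1, run2, run2 - run1) tuples, sorted by name, keeping
    only counters whose two runs differ by more than difference_threshold.
    """
    run1 = snapshot_diff(start, middle)
    run2 = snapshot_diff(middle, stop)
    rows = []
    for name in sorted(run1):
        diff = run2.get(name, 0) - run1[name]
        if abs(diff) > difference_threshold:
            rows.append((name, run1[name], run2.get(name, 0), diff))
    return rows


if __name__ == "__main__":
    # Made-up cumulative counters captured at the three points in time.
    start = {"example latch gets": 100, "parse calls": 10}
    middle = {"example latch gets": 5100, "parse calls": 1010}
    stop = {"example latch gets": 5150, "parse calls": 1011}
    for row in report(start, middle, stop, difference_threshold=1000):
        print(row)
```

In this sketch a counter is reported only when the two runs differ by strictly more than the threshold, which mirrors the "difference greater than this number" wording used for p_difference_threshold.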
This confirms you have the RUNSTATS_PKG package installed and shows you why you should use a single SQL statement instead of a bunch of procedural code when developing applications whenever possible!
This shows our UPDATE of 1000 rows generated 97,748 bytes of redo.
P_SEGNAME: Name of the segment—the table or index name, for example.
P_OWNER: Defaults to the current user, but you can use this routine to look at some other schema.
P_TYPE: Defaults to TABLE and represents the type of object you are looking at. For example, select distinct segment_type from dba_segments lists valid segment types.
P_PARTITION: Name of the partition when you show the space for a partitioned object. SHOW_SPACE shows space for only one partition at a time.
Unformatted Blocks: The number of blocks that are allocated to the table below the high-water mark, but have not been used. Add unformatted and unused blocks together to get a total count of blocks allocated to the table but never used to hold data in an ASSM object.
FS1 Blocks–FS4 Blocks: Formatted blocks with data. The ranges of numbers after their name represent the emptiness of each block. For example, (0-25) is the count of blocks that are between 0 and 25 percent empty.
Full Blocks: The number of blocks that are so full that they are no longer candidates for future inserts.
Total Blocks, Total Bytes, Total Mbytes: The total amount of space allocated to the segment measured in database blocks, bytes, and megabytes.
Unused Blocks, Unused Bytes: Represents a portion of the amount of space never used. These blocks are allocated to the segment but are currently above the high-water mark of the segment.
Last Used Ext FileId: The file ID of the file that contains the last extent that contains data.
Last Used Ext BlockId: The block ID of the beginning of the last extent; the block ID within the last used file.
Last Used Block: The block ID offset of the last block used in the last extent.
For examples throughout this book, I use a table called BIG_TABLE. The code for creating this table is contained in the big_table.sql script. Depending on which system I use, this table has between one record and four million records and varies in size from 200MB to 800MB. In all cases, the table structure is the same.
Creates an empty table based on ALL_OBJECTS. This dictionary view is used to populate the BIG_TABLE.
Makes this table NOLOGGING. This is optional. I did it for performance. Using NOLOGGING mode for a test table is safe; you won’t use it in a production system, so features like Oracle Data Guard will not be enabled.
Populates the table by seeding it with the contents of ALL_OBJECTS and then iteratively inserting into itself, approximately doubling its size on each iteration.
Creates a primary key constraint on the table.
Gathers statistics.
I estimated baseline statistics on the table. The index associated with the primary key will have statistics computed automatically when it is created.
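To get a feel for how quickly the self-insert step grows the table, the arithmetic can be sketched in a few lines of Python. This is an illustration only: the function name and the seed row count of 70,000 are hypothetical (the number of rows in ALL_OBJECTS varies by database), and it assumes each pass exactly doubles the table, whereas the text above says only that the size approximately doubles.

```python
def doubling_passes(seed_rows: int, target_rows: int) -> int:
    """Count the self-insert passes needed to grow seed_rows to at least
    target_rows, assuming every pass exactly doubles the row count."""
    if seed_rows <= 0:
        raise ValueError("seed_rows must be positive")
    passes = 0
    rows = seed_rows
    while rows < target_rows:
        rows *= 2  # inserting the table into itself doubles its size
        passes += 1
    return passes


if __name__ == "__main__":
    # Hypothetical seed size; check the row count of ALL_OBJECTS on your system.
    for target in (1_000_000, 4_000_000):
        print(target, doubling_passes(70_000, target))
```

Because the growth is geometric, even a multimillion-row table needs only a handful of passes from a seed of tens of thousands of rows.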
SQL sees ename = ENAME and compares the ENAME column to itself (of course). We could use ename = P.ENAME, that is, qualify the reference to the PL/SQL variable with the procedure name, but this is too easy to forget, leading to errors.
I just always name my variables after the scope. That way, I can easily distinguish parameters from local variables and global variables, in addition to removing any ambiguity with respect to column names and variable names.
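The column-versus-variable ambiguity can be illustrated with a rough analogy that runs anywhere. The sketch below uses Python's built-in sqlite3 module rather than Oracle and PL/SQL, so it does not reproduce PL/SQL name resolution; it only shows that a predicate such as ename = ENAME compares the column to itself and matches every row, while a distinctly named bind (p_ename here, following a prefix-by-scope convention) matches only the intended row. The table contents and function names are made up for illustration.

```python
import sqlite3


def count_matches(conn, where_clause, params=()):
    """Count the rows in EMP that satisfy the given WHERE clause."""
    sql = "SELECT COUNT(*) FROM emp WHERE " + where_clause
    return conn.execute(sql, params).fetchone()[0]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE emp (ename TEXT)")
    conn.executemany(
        "INSERT INTO emp VALUES (?)", [("KING",), ("SCOTT",), ("SMITH",)]
    )
    # Both names resolve to the column, so the predicate is true for every
    # non-null row.
    all_rows = count_matches(conn, "ename = ENAME")
    # A distinctly named bind value cannot be confused with the column.
    one_row = count_matches(conn, "ename = :p_ename", {"p_ename": "KING"})
    conn.close()
    return all_rows, one_row


if __name__ == "__main__":
    print(demo())
```

The point of the analogy is the naming habit, not the database engine: once the variable name cannot collide with a column name, the predicate can only mean one thing.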
Apress makes every effort to make sure that there are no errors in the text or the code. However, to err is human, and as such we recognize the need to keep you informed of any mistakes as they’re discovered and corrected. Errata sheets are available for all our books at www.apress.com. If you find an error that hasn’t already been reported, please let us know. The Apress website acts as a focus for other information and support, including the code from all Apress books, sample chapters, previews of forthcoming titles, and articles on related topics.
I would like to thank many people for helping me complete this book.
First, I would like to thank you, the reader of this book. There is a high probability that if you are reading this book, you have participated in my site http://asktom.oracle.com in some fashion, perhaps by asking a question or two. It is that act—the act of asking questions and of questioning the answers—that provides me with the material for the book and the knowledge behind the material. Without the questions, I would not be as knowledgeable about the Oracle database as I am. So, it is you who ultimately makes this book possible.
I would like to thank Tony Davis for his previous work making my work read well. If you enjoy the flow of the sections, the number of section breaks, and the clarity, then that is in some part due to him. I have worked with Tony writing technical material since the year 2000 and have watched his knowledge of Oracle grow over that time. He now has the ability to not only edit the material but in many cases tech edit it as well. Many of the examples in this book are there because of him (pointing out that the casual reader was not going to “get it” without them). This book would not be what it is without him.
Without a technical review team of the caliber I had during the writing of this book and the previous editions, I would be nervous about the content. The first edition had Jonathan Lewis, Roderick Manalac, Michael Möller, and Gabe Romanescu as technical reviewers. They spent many hours poring over the material and verifying it was technically accurate as well as useful in the real world. Subsequent editions had a team of similar caliber: Melanie Caffrey, Christopher Beck, and Jason Straub. I firmly believe a technical book should be judged not only by who wrote it but also by who reviewed it. Given these seven people, I feel confident in the material.
At Oracle, I work with the best and brightest people I have ever known, and they all have contributed in one way or another. I would like to thank Ken Jacobs in particular for his support and enthusiasm over the years. Ken is unfortunately (for us) no longer with Oracle Corporation, but his impact will long be felt.
Lastly, but most important, I would like to acknowledge the unceasing support I’ve received from my family. You know you must be important to someone when you try to do something that takes a lot of “outside of work hours” and that someone lets you know about it. Without the continual support of my wife, Melanie (who also was a technical reviewer on the book), son Alan, and daughter Megan, I don’t see how I could have finished this book.
—Thomas Kyte
I’d like to thank Tom for inviting me to work with him on this book; this is a great technical honor. I’d also like to acknowledge Jonathan Gennick; his guidance (over many years and books) laid the foundation for me being able to work on a book of this caliber. And I’d like to thank Heidi, Lisa, Evan, and Brandi; without their support, I could not have successfully participated.
—Darl Kuhn