CHAPTER 13

Data Pump

Data Pump was introduced in Oracle Database 10g. It replaces the older exp/imp utilities. Data Pump enables you to quickly move data and/or metadata from one environment to another. You can use Data Pump in a variety of ways:

  • Point-in-time logical backups of the entire database or subsets of data
  • Replicating entire databases or subsets of data for testing or development
  • Quickly generating Data Definition Language (DDL) required to re-create objects

Sometimes DBAs hold on to the old exp/imp utilities because they're familiar with the syntax and these utilities get the job done quickly. Even if those legacy utilities are easy to use, you should consider using Data Pump going forward. Data Pump provides substantial functionality beyond the old exp/imp utilities:

  • Performance with large data sets, allowing you to efficiently export and import gigabytes of data
  • Interactive command-line utility, which gives you the ability to disconnect and then later re-attach to active Data Pump jobs
  • Ability to export and import large amounts of data from a remote database directly into a local database without creating a dump file
  • Ability to make on-the-fly changes to schemas, tablespaces, datafiles, and storage settings from export to import
  • Sophisticated filtering of objects and data
  • Security controlled via database-directory objects
  • Advanced features such as compression and encryption

This chapter begins with a discussion on the Data Pump architecture. Subsequent topics include basic export and import tasks, moving data across networks, filtering data, and running Data Pump in legacy mode.

Data Pump Architecture

Data Pump consists of the following components:

  • expdp (Data Pump export utility)
  • impdp (Data Pump import utility)
  • DBMS_DATAPUMP PL/SQL package (Data Pump API)
  • DBMS_METADATA PL/SQL package (Data Pump Metadata API)

The expdp and impdp utilities use the DBMS_DATAPUMP and DBMS_METADATA built-in PL/SQL packages when exporting and importing data and metadata. The DBMS_DATAPUMP package moves entire databases or subsets of data between database environments. The DBMS_METADATA package exports and imports information about database objects.

Note  You can call the DBMS_DATAPUMP and DBMS_METADATA packages independently (outside of expdp and impdp) from SQL*Plus. I rarely call these packages directly from SQL*Plus, but you may have a specific scenario where it's desirable to interact with them directly. See the Oracle Database PL/SQL Packages and Types Reference guide (available on OTN) for more details.
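
For example, here's a minimal sketch of calling DBMS_METADATA directly from SQL*Plus to generate the DDL for a table (this assumes a table named INV exists in your schema):

SQL> set long 10000
SQL> select dbms_metadata.get_ddl('TABLE','INV') from dual;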

When you start a Data Pump export or import job, a master operating-system process is initiated on the database server. This master process name has the format ora_dmNN_<SID>. On Linux/Unix systems, you can view this process from the operating-system prompt using the ps command:

$ ps -ef | grep ora_dm                                                   
  oracle 14950   717   0 10:59:06 ?           0:10 ora_dm00_STAGE

Depending on the degree of parallelism and the work specified, a number of worker processes are also started. The master process coordinates the work performed by the worker processes. The worker process names have the format ora_dwNN_<SID>.
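
You can check for worker processes with the same technique (whether any appear depends on the job's degree of parallelism and current activity):

$ ps -ef | grep ora_dw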

Also, when a user starts an export or import job, a database status table (also referred to as the master table) is created, owned by the user who starts the job. This table exists for the duration of the Data Pump job. The name of the status table depends on what type of job you're running. The table is named with the format SYS_<OPERATION>_<JOB_MODE>_NN, where OPERATION is either EXPORT or IMPORT. JOB_MODE can be FULL, SCHEMA, TABLE, TABLESPACE, and so on.

For example, if you're exporting a schema, a table is created in your account with the name SYS_EXPORT_SCHEMA_NN, where NN is a number that makes the table name unique in the user's schema. This status table contains information such as the objects exported/imported, start time, elapsed time, rows, error count, and so on. The status table has over 80 columns.

The status table is dropped by Data Pump upon successful completion of an export or import job. If you use the KILL_JOB interactive command, the master table is also dropped. If you stop a job with the STOP_JOB interactive command, the table isn't removed and is used in the event you restart the job.

If your job terminates abnormally, the master table is retained. You can delete the status table if you don't plan to restart the job.
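
For example, here's one way to locate a leftover status table in your schema and remove it (the table name shown in the DROP statement is just an example; yours will vary):

SQL> select table_name from user_tables where table_name like 'SYS_EXPORT%' or table_name like 'SYS_IMPORT%';
SQL> drop table sys_export_schema_01 purge;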

When Data Pump runs, it uses a database-directory object to determine where to write and read dump files and log files. Usually, you specify which directory object you want Data Pump to use. If you don't specify a directory object, a default is used.

A Data Pump export creates an export file and a log file. The export file contains the objects being exported. The log file contains a record of the job activities.

Figure 13–1 shows the architectural components related to a Data Pump export job. Here's how you initiate the job from the command line:

$ expdp user/pwd dumpfile=exp.dmp logfile=exp.log directory=dp_dir

In this example, a database-directory object named DP_DIR is defined to reference the /oradump operating-system directory. The output files are defined via the command line to be exp.dmp and exp.log.

Figure 13–1. Data Pump export job components

Figure 13–2 displays the architectural components of a Data Pump import job. Here's how you initiate the job from the command line:

$ impdp user/pwd dumpfile=exp.dmp logfile=imp.log directory=dp_dir

In this example, the Data Pump import reads from a dump file named exp.dmp, which is located in the operating-system directory referenced by the database-directory object named DP_DIR. The import job reads the dump file and populates database objects.

Figure 13–2. Data Pump import job components

For each Data Pump job, you must ensure that you have access to a directory object. The basics of exporting and importing are described in the next few sections.

Exporting Data

A small amount of setup is required when you run a Data Pump export job. Here are the steps:

  1. Create a database-directory object that points to an operating-system directory that you want to write/read Data Pump files to/from.
  2. Grant read, write on the directory to the database user running the export.
  3. From the operating-system prompt, run the expdp utility.

Creating a Database Directory

Before you run a Data Pump job, you must create a database-directory object that corresponds to the physical location on disk where you want the dump files and log files created. Use the CREATE DIRECTORY command to accomplish this. This example creates a directory named dp_dir and specifies that it map to the /oradump physical location on disk:

SQL> create directory dp_dir as '/oradump';

To view the details of the newly created directory, issue this query:

SQL> select owner, directory_name, directory_path from dba_directories;

When you install Oracle, a default directory object named DATA_PUMP_DIR is created. If you don't specify the DIRECTORY parameter when exporting or importing, Oracle attempts to use this default database-directory object. The location associated with DATA_PUMP_DIR varies depending on the version of Oracle: on some systems, it may be ORACLE_HOME/rdbms/log; on other systems, it may point to ORACLE_BASE/admin/ORACLE_SID/dpdump. Inspect DBA_DIRECTORIES to verify the default location for your system.
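
For example, you can verify where DATA_PUMP_DIR points on your system with a query like this:

SQL> select directory_path from dba_directories where directory_name = 'DATA_PUMP_DIR';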

Granting Access to the Directory

You need to grant permissions on the database-directory object to a user that wants to use Data Pump. Use the GRANT statement to allocate the appropriate privileges. If you want a user to be able to read from and write to the directory, you must grant security access as follows:

SQL> grant read, write on directory dp_dir to darl;

All directory objects are owned by the SYS user. If you're using a user account that has the DBA role granted to it, then you have the read and write privileges on the directory object. I usually perform Data Pump jobs with a user that has DBA granted to it (so I don't need to bother with granting access).
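
If you need to verify which users have been granted access to a directory object, the grants are recorded in DBA_TAB_PRIVS (the directory name DP_DIR here refers to the earlier example):

SQL> select grantee, privilege from dba_tab_privs where table_name = 'DP_DIR';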

Taking an Export

When the directory object and grants are in place, you can use Data Pump to export information from a database. DBAs typically use exports for point-in-time backups of data and metadata. You can use these exports to either restore database objects or move data to different database environments. Suppose you recently created a table and populated it with data:

SQL> create table inv(inv_id number);
SQL> insert into inv values (123);

Now, you want to export the table. This example uses the previously created directory named DP_DIR. Data Pump uses the directory path specified by the directory object as the location on disk to write the dump file and log file:

$ expdp darl/foo directory=dp_dir tables=inv dumpfile=exp.dmp logfile=exp.log

The expdp job creates a file named exp.dmp in the /oradump directory that contains the information required to re-create the INV table and populate it with data as of the time the export was taken. In addition, a log file named exp.log is created in the /oradump directory that contains all the logging information associated with this export job.

If you don't specify a dump-file name, Data Pump creates a file named expdat.dmp. If a file named expdat.dmp already exists in the directory, then Data Pump throws an error. If you don't specify a log-file name, then Data Pump creates one named export.log. If a file already exists (named export.log), then Data Pump overwrites it.

Importing Data

One of the key reasons to export data is so that you can re-create database objects in other environments. Data Pump import uses an export dump file as its input and re-creates database objects contained in the export file. The procedure for importing is similar to exporting:

  1. Create a database-directory object that points to an operating-system directory that you want to write/read Data Pump files to/from.
  2. Grant read, write on the directory to the database user running the export or import.
  3. From the operating system prompt, run the impdp command.

Suppose you accidentally drop the INV table that was previously created:

SQL> drop table inv purge;

You now want to re-create the INV table from the previous export. This example uses the same directory object that was created previously:

$ impdp darl/foo directory=dp_dir dumpfile=exp.dmp logfile=inv.log

You should now have the INV table re-created and populated with data as it was at the time of the export.

SECURITY ISSUES WITH THE OLD EXP UTILITY

Interactive Command Mode

Data Pump provides an interactive command mode that allows you to monitor the status of a Data Pump job and modify a number of job characteristics on the fly. The interactive command mode is most useful for long-running Data Pump operations. In this mode, you can also stop, restart, or terminate a currently running job. Each of these activities is discussed in the following subsections.

Entering Interactive Command Mode

There are two ways to get to the interactive command-mode prompt:

  • Press Ctrl+C in a Data Pump job that you started via expdp or impdp.
  • Use the ATTACH parameter to attach to a currently running job.

When you run a Data Pump job from the command line, you're placed in command-line mode. You should see output displayed to your terminal as the job progresses. If you want to exit command-line mode, press Ctrl+C. This places you in interactive command mode. For an export job, the prompt is

Export>

Type in the HELP command to view the export interactive commands available (see Table 13–1 for a description):

Export> help

Type EXIT to leave interactive command mode:

Export> exit

You should now be at the operating-system prompt.

Table 13–1. Export Interactive Commands

Command                  Description
ADD_FILE                 Adds files to the export dump set.
CONTINUE_CLIENT          Continues with interactive client mode.
EXIT_CLIENT              Exits the client session and returns to the operating-system prompt. Leaves the current job running.
FILESIZE                 Defines file size for any subsequently created dump files.
HELP                     Displays interactive export commands.
KILL_JOB                 Terminates the current job.
PARALLEL                 Increases or decreases the degree of parallelism.
START_JOB                Restarts the attached job.
STATUS                   Displays the status of the currently attached job.
STOP_JOB [=IMMEDIATE]    Stops a job from processing (you can later restart it). Using the IMMEDIATE parameter quickly stops the job, but there may be some incomplete tasks.

You can press Ctrl+C for either an export or import job. For an import job, the interactive command-mode prompt is

Import>

To view all commands available, type HELP:

Import> help

The interactive command-mode import commands are summarized in Table 13–2.

Table 13–2. Import Interactive Commands

Command                  Description
CONTINUE_CLIENT          Continues with interactive logging mode.
EXIT_CLIENT              Exits the client session and returns to the operating-system prompt. Leaves the current job running.
HELP                     Displays the available interactive commands.
KILL_JOB                 Terminates the job it's currently connected to in the client.
PARALLEL                 Increases or decreases the degree of parallelism.
START_JOB                Restarts a previously stopped job. START_JOB=SKIP_CURRENT restarts the job and skips any operations that were active when the job was stopped.
STATUS                   Specifies the frequency at which the job status is monitored. Default mode is 0; the client reports job status changes whenever available in this mode.
STOP_JOB [=IMMEDIATE]    Stops a job from processing (you can later restart it). Using the IMMEDIATE parameter quickly stops the job, but there may be some incomplete tasks.

Type EXIT to leave interactive command mode:

Import> exit

You should now be at the operating-system prompt.

Attaching to a Running Job

One powerful feature of Data Pump is that you can attach to a currently running job and view its progress and status. If you have DBA privileges, you can even attach to a job if you aren't the owner. You can attach to either an import or an export job via the ATTACH parameter.

Before you attach to a job, you must first determine the Data Pump job name (and owner name if you're not the owner of the job). Run the following SQL query to display currently running jobs:

SQL> select owner_name, operation, job_name, state from dba_datapump_jobs;

Here's some sample output:

OWNER_NAME      OPERATION       JOB_NAME             STATE
--------------- --------------- -------------------- ---------------
DARL            IMPORT          SYS_IMPORT_SCHEMA_02 EXECUTING
DARL            IMPORT          SYS_IMPORT_SCHEMA_01 NOT RUNNING
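
If you want to see which database sessions are servicing a running job, you can join DBA_DATAPUMP_SESSIONS to V$SESSION; here's a sketch:

SELECT s.sid, s.serial#, d.owner_name, d.job_name
FROM v$session s, dba_datapump_sessions d
WHERE s.saddr = d.saddr;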

In this example, you're the owner of the job, so you use the ATTACH parameter without prepending the owner name to it. This is an import job, so you use the impdp command to attach to the job name SYS_IMPORT_SCHEMA_02:

$ impdp darl/engdev attach=sys_import_schema_02  

If you aren't the owner of the job, you attach to the job by specifying the owner name and the job name:

$ impdp system/foobar attach=darl.sys_import_schema_02

You should now see the Data Pump command-line prompt:

Import>

Type STATUS to view the status of the currently attached job:

Import> status

Stopping and Restarting a Job

If you have a currently running Data Pump job that you want to temporarily stop, you can do so by first attaching to the interactive command mode. You may want to stop a job to resolve space issues or performance issues and then, after resolving the issues, restart the job. This example attaches to an import job:

$ impdp darl/foo attach=sys_import_table_01

Now, stop the job using the STOP_JOB parameter:

Import> stop_job

You should see this output:

Are you sure you wish to stop this job ([yes]/no):

Type YES to proceed with stopping the job. You can also specify that the job be stopped immediately:

Import> stop_job=immediate

When you stop a job with the IMMEDIATE option, there may be some incomplete tasks associated with the job. To restart a job, attach to interactive command mode and issue the START_JOB command:

Import> start_job

If you want to resume logging job output to your terminal, issue the CONTINUE_CLIENT command:

Import> continue_client

Terminating a Data Pump Job

You can instruct Data Pump to permanently kill an export or import job. First, attach to the job in interactive command mode, and then issue the KILL_JOB command:

Import> kill_job

You should be prompted with the following:

Are you sure you wish to stop this job ([yes]/no):

Type YES to permanently kill the job. Data Pump unceremoniously kills the job and drops the associated status table owned by the user running the export or import.

Tips for Getting Started

This section describes some common Data Pump features that I regularly use. These techniques can assist you in minimizing command-line errors and help verify what was exported or what objects and metadata are imported.

Use a Parameter File

Instead of typing commands on the command line, in many situations it's preferable to store the commands in a file and then reference the file when executing Data Pump export or import. Using parameter files makes tasks more repeatable and less prone to errors. You can place the commands in a file once and then reference that file multiple times.

Additionally, some of the Data Pump commands (like FLASHBACK_TIME) require the use of quotation marks; in these situations, it's sometimes hard to predict how the operating system will interpret the quotation marks. Whenever a command requires quotation marks, it's highly preferable to use a parameter file.

To use a parameter file, first create a text file that contains the commands you want to use to control the behavior of your job. This example uses the Linux/Unix vi command to create a text file named imp.par:

$ vi imp.par

Now, place the following commands in the imp.par file:

userid=darl/foo
directory=dp
dumpfile=invp.dmp
logfile=invp.log
tables=f_sales

Next, reference the parameter file via the PARFILE command-line option:

$ impdp parfile=imp.par

Data Pump import processes the parameters in the file as if they were typed on the command line. If you find yourself repeatedly typing the same commands, then consider using a parameter file to increase your efficiency.
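
The same approach works for exports. For example, you might place the following lines in a file named exp.par (the file and log names here are just examples):

userid=darl/foo
directory=dp
dumpfile=invp.dmp
logfile=exp_invp.log
tables=f_sales

Then run the export by referencing the file:

$ expdp parfile=exp.par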

Estimating the Size of Export Jobs

If you're about to export a large amount of data, you can estimate the size of the file that Data Pump creates before you run the export. You may want to do this because you're concerned about the amount of space an export job needs.

To estimate the size, use the ESTIMATE_ONLY parameter. This example estimates the size of the export file for an entire database:

$ expdp dbauser/foo estimate_only=y full=y nologfile=y

Here's a snippet of the output:

Estimate in progress using BLOCKS method...
Processing object type DATABASE_EXPORT/SCHEMA/TABLE/TABLE_DATA
.  estimated "REP_MV"."REG_QUEUE_REP"           9.606 GB
.  estimated "REP_MV"."CWP_USER_PROFILE"        2.589 GB
.  estimated "REP_MV"."PRODUCT_INSTANCES_MV"    1.620 GB
.  estimated "REP_MV"."ROLES_MV"                1.550 GB
.  estimated "STAR2"."F_DOWNLOADS":"DOWN_P_5"   1.450 GB
.  estimated "STAR2"."F_DOWNLOADS":"DOWN_P_11"  1.414 GB

Similarly, you can specify a schema name to get an estimate of the size required to export a user:

$ expdp dbauser/foo estimate_only=y schemas=star2 nologfile=y

Here's an example of estimating the size required for two tables:

$ expdp dbauser/foo estimate_only=y tables=star2.f_downloads,star2.f_installations
nologfile=y

Listing the Contents of Dump Files

Data Pump has a very robust method of creating a file that contains all the SQL that's executed when an import job runs. Data Pump uses the DBMS_METADATA package to create the DDL that you can use to re-create objects in the Data Pump dump file.

Use the SQLFILE option of Data Pump import to list the contents of a Data Pump export file. This example creates a file named expfull.sql that contains the SQL statements that the import process calls (the file is placed in the directory defined by the DPUMP_DIR2 directory object):

$ impdp hr/hr DIRECTORY=dpump_dir1 DUMPFILE=expfull.dmp
SQLFILE=dpump_dir2:expfull.sql

In the previous command, if you don't specify a separate directory (like dpump_dir2 in this example), then the SQL file is written to the location specified in the DIRECTORY option.

Tip  You must run the previous command as a user with DBA privileges or as the schema that performed the Data Pump export. Otherwise, you get an empty SQL file without the expected SQL statements in it.

When you use the SQLFILE option with an import, the impdp process doesn't import any data. It only creates a file that contains the SQL commands that would be run by the import process. It's sometimes handy to generate a SQL file for the following reasons:

  • To preview and verify the SQL statements before running the import
  • To manually run the SQL to pre-create database objects
  • To create a DDL file that you can later inspect to see if there are differences in users from exports at different points in time

USING EXP/IMP TO RE-CREATE DDL

Transferring Data

One major way you can use Data Pump is to copy data from one database to another. Often, the source and destination databases may be located in data centers thousands of miles apart. Data Pump offers several powerful features for efficiently copying data:

  • Network link
  • Copying datafiles (transportable tablespaces)
  • External tables (see Chapter 14)

Using a network link allows you to take an export and import it into the destination database without having to create a dump file. This is a very efficient way of moving data.

Oracle also provides the transportable tablespace feature, which lets you copy the datafiles from a source database to the destination and then use Data Pump to transfer the associated metadata. These two techniques are described in this section.

Note  See Chapter 14 for a discussion of using external tables to transfer data.

Exporting and Importing Directly Across the Network

Sometimes you need to create a testing database and load it with production data. In scenarios like this, the production box is usually located remotely from the development box. Data Pump provides you the ability to take an export and directly import it into your target database without creating any intermediate dump files. This is a fast and efficient way to create new environments from existing environments.

An example will help illustrate how this works. For this example, the production database users are STAR2, CIA_APP, and CIA_SEL. You want to move these users into a testing database and name them STAR_JUL, CIA_APP_JUL, and CIA_SEL_JUL.

This task requires the following high-level steps:

  1. Create users in the destination database to be imported into. Here's a sample script that creates the users in the testing database:
    define star_user=star_jul
    define star_user_pwd=star_jul_pwd
    define cia_app_user=cia_app_jul
    define cia_app_user_pwd=cia_app_jul_pwd
    define cia_sel_user=cia_sel_jul
    define cia_sel_user_pwd=cia_sel_jul_pwd
    --
    create user &&star_user identified by &&star_user_pwd;
    grant connect,resource to &&star_user;
    alter user &&star_user default tablespace dim_data;
    --
    create user &&cia_app_user identified by &&cia_app_user_pwd;
    grant connect,resource to &&cia_app_user;
    alter user &&cia_app_user default tablespace cia_data;
    --
    create user &&cia_sel_user identified by &&cia_sel_user_pwd;
    grant connect,resource to &&cia_sel_user;
    alter user &&cia_sel_user default tablespace cia_data;
  2. Create a database link in your testing database that points to your production database. The remote user referenced in the CREATE DATABASE LINK statement has the DBA role granted to it in the production database. Here's a sample CREATE DATABASE LINK script:
    create database link dk
    connect to darl identified by foobar
    using 'dwdb1:1522/dwrep1';
  3. Create a directory that points to the location where you want your log file to go:
    SQL> create or replace directory engdev as '/orahome/oracle/ddl/engdev';
  4. Run the import command on the testing box. This command references the remote database via the NETWORK_LINK parameter. This command also instructs Data Pump to map the production database user names to the newly created users in the testing database.
$ impdp darl/engdev directory=engdev network_link=dk
schemas='STAR2,CIA_APP,CIA_SEL'
remap_schema=STAR2:STAR_JUL,CIA_APP:CIA_APP_JUL,CIA_SEL:CIA_SEL_JUL
parallel=4

This technique allows you to move large amounts of data between disparate databases without having to create or copy any dump files or datafiles. This is a very powerful Data Pump feature that lets you quickly and efficiently transfer data.
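
Before kicking off a network-link import like this, it's worth a quick sanity check that the database link resolves and connects (using the dk link created in step 2):

SQL> select * from dual@dk;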

If you don't have Oracle Net connectivity between the two databases, then the steps to accomplish the same task are as follows:

  1. Export the production database.
  2. Copy the dump file to the testing database.
  3. Import the dump file into the testing database.

Copying Datafile(s)

Oracle provides a mechanism for copying datafiles from one database to another in conjunction with using Data Pump to transport the associated metadata. This is known as the transportable tablespace feature. The amount of time this takes depends on how long it takes you to copy the datafiles to the destination server. This technique is appropriate for moving data in decision-support systems and data-warehousing environments.

Follow these steps to transport tablespaces:

  1. Ensure that the tablespace is self-contained. These are some common violations of the self-contained rule:
    • An index in one of the tablespaces being transported points to a table in a tablespace that isn't in the set of tablespaces being transported.
    • A foreign-key constraint is defined on a table in one tablespace that references a primary-key constraint on a table in a tablespace that isn't in the set of tablespaces being transported.

    Run the following check to see if the set of tablespaces being transported violates any of the self-contained rules:

    SQL> exec dbms_tts.transport_set_check('INV_DATA,INV_INDEX', TRUE);

    Now, see if Oracle detected any violations:

    SQL> select * from transport_set_violations;

    If you don't have any violations, you should see:

    no rows selected
  2. Make the tablespaces being transported read-only:
    SQL> alter tablespace inv_data read only;
    SQL> alter tablespace inv_index read only;
  3. Use Data Pump to export the metadata for the tablespaces being transported:
    $ expdp darl/foo directory=dp dumpfile=trans.dmp
    transport_tablespaces=INV_DATA,INV_INDEX
  4. Copy the Data Pump export dump file to the destination server.
  5. Copy the datafile(s) to the destination database. Place the files in the directory where you want them in the destination database server. The filename and directory path must match the import command used in the next step.
  6. Import the metadata into the destination database. Use the following parameter file to import the metadata for the datafiles being transported:
    userid=darl/foo
    directory=dp
    dumpfile=trans.dmp
    transport_datafiles=/ora01/dbfile/O11R2/inv_data01.dbf,
    /ora01/dbfile/O11R2/inv_index01.dbf

If everything goes well, you should see some output indicating success:

Processing object type TRANSPORTABLE_EXPORT/CONSTRAINT/REF_CONSTRAINT
Processing object type TRANSPORTABLE_EXPORT/POST_INSTANCE/PLUGTS_BLK
Job "DARL"."SYS_IMPORT_TRANSPORTABLE_01" successfully completed at 11:24:01

If the datafiles being transported have a different block size than the destination database, then you must modify your initialization file (or use an ALTER SYSTEM command) to configure a buffer cache for the source database's block size. For example, to add a 16KB buffer cache, place this in the initialization file:

db_16k_cache_size=200M

The transportable tablespace mechanism allows you to quickly move datafiles between databases, even if the databases use different block sizes or have different endian formats. This section doesn't discuss all the details involved with transportable tablespaces; the focus in this chapter is showing how to use Data Pump to transport data. For complete details on transportable tablespaces, refer to the Oracle Database Administrator's Guide available on OTN.
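
You can check the endian format of a platform by running a query like this on each database:

SELECT d.platform_name, tp.endian_format
FROM v$transportable_platform tp, v$database d
WHERE tp.platform_name = d.platform_name;

If the endian formats differ, the datafiles must be converted (for example, with RMAN's CONVERT command) before they can be plugged in to the destination database.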

Note  To generate transportable tablespaces, you must use Oracle Enterprise Edition. You can use other editions of Oracle to import transportable tablespaces.

Exporting and Importing Tablespaces and Datafiles

Data Pump contains some flexible features for exporting and importing tablespaces and datafiles. This section shows Data Pump techniques that you can use when working with these important database objects.

Exporting Tablespace Metadata

Sometimes you want to replicate environments—say, a production environment into a testing environment. One of the first tasks is to replicate the tablespaces. You can use Data Pump to pull out just the DDL required to re-create the tablespaces for an environment:

$ expdp darl/foo directory=dp dumpfile=phredstg.dmp content=metadata_only full=y
include=tablespace

The FULL parameter instructs Data Pump to export everything in the database. When you also specify INCLUDE, Data Pump exports only the listed object types. In this scenario, the dump file contains only tablespace information.

Now you can use the SQLFILE parameter to view the DDL associated with the tablespaces that were exported:

$ impdp darl/foo directory=dp dumpfile=phredstg.dmp sqlfile=tbsp.sql

Recall that when you use the SQLFILE parameter, nothing is imported. The previous command only creates a file named tbsp.sql that contains SQL statements. You can modify the DDL and run it in the destination database environment; or, if nothing needs to change, you can directly use the dump file by importing it into the destination database.

Specifying Different Datafile Paths and Names

When you're exporting and then importing, sometimes you import tablespaces into a database where the directory structures are different from those of the original database. On the import step, the REMAP_DATAFILE parameter allows you to seamlessly change the underlying datafile names. Here's a parameter file that contains the remapping statements:

userid=darl/foo
directory=dp
dumpfile=phredstg.dmp
full=y
include=tablespace:"like 'TBSP%'"
remap_datafile="'/ora01/dbfile/O11R2/tbsp101.dbf':'/ora02/O11R2/tb1.dbf'"
remap_datafile="'/ora01/dbfile/O11R2/tbsp201.dbf':'/ora02/O11R2/tb2.dbf'"
remap_datafile="'/ora01/dbfile/O11R2/tbsp301.dbf':'/ora02/O11R2/tb3.dbf'"

When Data Pump creates the tablespaces, any datafile path that matches the text to the left of the colon (:) is replaced with the text to the right of the colon.
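
Assuming you save those lines in a file (named, say, remap.par), you invoke the import just as with any other parameter file:

$ impdp parfile=remap.par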

Changing Segment and Storage Attributes

When importing, you can alter the storage attributes of a table by using the TRANSFORM parameter. The general syntax for this parameter is

TRANSFORM=transform_name:value[:object_type]

When you use SEGMENT_ATTRIBUTES:n for the transformation name, you can remove the following segment attributes during an import:

  • Physical attributes
  • Storage attributes
  • Tablespaces
  • Logging

Here's an example that removes the segment attributes:

$ impdp darl/foo directory=dp dumpfile=inv.dmp transform=segment_attributes:n

You can remove just the storage clause by using STORAGE:n:

$ impdp darl/foo directory=dp dumpfile=inv.dmp transform=storage:n

Importing into a Different Tablespace from the Original

Sometimes you're exporting out of one database and want to import objects into another database, but you want to change the tablespaces for the tables and indexes—in other words, create the objects in the destination database but in tablespaces different than the source database configuration.

This example remaps the user as well as the tablespaces. The original user and tablespaces are HEERA and TBSP1. This command imports the TICKET table into the CHAYA user and the V_DATA tablespace:

$ impdp darl/foo directory=dp dumpfile=rm.dmp remap_schema=HEERA:CHAYA
remap_tablespace=TBSP1:V_DATA tables=heera.ticket

The REMAP_TABLESPACE feature doesn't re-create tablespaces. It only instructs Data Pump to place objects in different tablespaces (from where they were originally exported). When importing, if the tablespace that you're placing the object in doesn't exist, Data Pump throws an error.

Changing the Size of Datafiles

You can change the size of the datafiles when importing by using the TRANSFORM parameter with the PCTSPACE option. For example, if you want the tablespaces to be created at 20 percent of the original size, specify the following:

userid=darl/foo
directory=dp
dumpfile=phredstg.dmp
full=y
include=tablespace:"like 'TBSP%'"
transform=pctspace:20

The tablespaces are created with datafiles 20 percent of their original size. The extent allocation sizes are also 20 percent of their original definition. This feature is useful when you export production data and then import it into a smaller database. In these scenarios, you may also be filtering out some of the production data via the SAMPLE or QUERY parameters (discussed in the next section).

Filtering Data and Objects

Data Pump has a vast array of mechanisms to filter data and metadata. You can influence what is excluded or included in a Data Pump export or import in the following ways:

  • Use the QUERY parameter to export or import subsets of data.
  • Use the SAMPLE parameter to export a percentage of the rows in a table.
  • Use the CONTENT parameter to exclude or include data and metadata.
  • Use the EXCLUDE parameter to specifically name items to be excluded.
  • Use the INCLUDE parameter to name the items to be included (thereby excluding other non-dependent items not included in the list).
  • Use parameters like SCHEMAS to specify that you only want a subset of the database's objects (those that belong to the specified user or users).

Examples of each of these techniques are described in the following subsections.

Note  You can't use EXCLUDE and INCLUDE at the same time. These parameters are mutually exclusive.

Specifying a Query

You can use the QUERY parameter to instruct Data Pump to write to a dump file only rows that meet a certain criterion. You may want to do this if you're re-creating a test environment and only need subsets of the data. Keep in mind that this technique is unaware of any foreign-key constraints that may be in place, so you can't blindly restrict the data sets without considering parent/child relationships.

The QUERY parameter has this general syntax:

QUERY = [schema.][table_name:] query_clause

The query clause can be any valid SQL clause. The query must be enclosed by either double quotes or single quotes. I recommend using double quotes because you may need single quotes embedded in the query to handle VARCHAR data. Also, you should use a parameter file so that there is no confusion about how the operating system interprets the quotation marks.

This example uses a parameter file and limits the rows exported for two tables. Here's the parameter file used when exporting:

userid=darl/foo
directory=dp
dumpfile=inv.dmp
tables=inv,reg
query=inv:"WHERE inv_desc='Book'"
query=reg:"WHERE reg_id <=20"

Say you place the previous lines of code in a file named inv.par. The export job references the parameter file as shown:

$ expdp parfile=inv.par

The resulting dump file only contains rows filtered by the QUERY parameters. Again, be mindful of any parent/child relationships, and ensure that what gets exported won't violate any constraints on the import.

You can also specify a query when importing data. Here's a parameter file that limits the rows imported into the INV table based on the INV_ID column:

userid=darl/foo
directory=dp
dumpfile=inv.dmp
tables=inv,reg
query=inv:"WHERE inv_id > 10"

The previous text is placed in a file named inv2.par and is referenced during the import as follows:

$ impdp parfile=inv2.par

Only the rows in the INV table that are filtered via the query are imported.

Exporting a Percentage of the Data

When exporting, the SAMPLE parameter instructs Data Pump to retrieve a certain percentage of rows based on a number you provide. Data Pump doesn't keep track of parent/child relationships when exporting. Therefore, this approach doesn't work well when you have tables linked via foreign-key constraints and you're trying to randomly select a percentage of rows.

Here's the general syntax for this parameter:

SAMPLE=[[schema_name.]table_name:]sample_percent

For example, if you want to export 10 percent of the data in a table, do so as follows:

$ expdp darl/foo directory=dp tables=inv sample=10 dumpfile=inv.dmp

This next example exports two tables, but only 30 percent of the REG table's data:

$ expdp darl/foo directory=dp tables=inv,reg sample=reg:30 dumpfile=inv.dmp

Note  The SAMPLE parameter is only valid for exports.

Excluding Objects from the Export File

For export, the EXCLUDE parameter instructs Data Pump not to export specified objects (whereas the INCLUDE parameter instructs Data Pump to only include specific objects in the export file). The EXCLUDE parameter has this general syntax:

EXCLUDE=object_type[:name_clause] [, ...]

The OBJECT_TYPE refers to a database object like TABLE or INDEX. To see which object types can be filtered, view the OBJECT_PATH column of DATABASE_EXPORT_OBJECTS, SCHEMA_EXPORT_OBJECTS, or TABLE_EXPORT_OBJECTS. For example, if you want to view what schema-level objects can be filtered, run this query:

SELECT
 object_path
FROM schema_export_objects
WHERE object_path NOT LIKE '%/%';

Here's a small snippet of the output:

OBJECT_PATH
------------------
STATISTICS
SYNONYM
SYSTEM_GRANT
TABLE
TABLESPACE_QUOTA
TRIGGER

The EXCLUDE parameter instructs Data Pump export to filter out specific objects from the export. For example, say you're exporting a table but want to exclude the indexes and grants:

$ expdp darl/foo directory=dp dumpfile=inv.dmp tables=inv exclude=index,grant

You can filter at a more granular level by using NAME_CLAUSE. The NAME_CLAUSE option of EXCLUDE allows you to specify a SQL filter. To exclude indexes whose names start with the string INV, use the following:

exclude=index:"LIKE 'INV%'"

The previous line requires quotation marks; in these scenarios, I recommend using a parameter file, because it's sometimes hard to predict how the operating system will interpret quotation marks on the command line. Here's a parameter file that contains an EXCLUDE clause:

userid=darl/foo
directory=dp
dumpfile=inv.dmp
tables=inv
exclude=index:"LIKE 'INV%'"

A few aspects of the EXCLUDE clause may seem counterintuitive. For example, consider the following export parameter file:

userid=darl/foo
directory=dp
dumpfile=sch.dmp
exclude=schema:"='HEERA'"

If you attempt to exclude a user in this manner, an error is thrown. This is because the default export mode is schema level, so the schema being exported is implicitly included, and Data Pump can't include and exclude the same schema at once. If you want to exclude a user from an export file, specify the FULL mode and exclude the user:

userid=darl/foo
directory=dp
dumpfile=sch.dmp
exclude=schema:"='HEERA'"
full=y

Excluding Statistics

If you want to exclude statistics from your export job, you can do so using the EXCLUDE parameter. Here's an example:

$ expdp darl/foo directory=dp dumpfile=invp.dmp tables=f_sales exclude=statistics

By default, when you export a table object, any statistics are also exported. You can prevent statistics from being imported via the EXCLUDE parameter:

$ impdp darl/foo directory=dp dumpfile=invp.dmp tables=f_sales exclude=statistics

When importing, if you attempt to exclude statistics from a dump file that didn't originally include the statistics, then you receive this error:

ORA-39168: Object path STATISTICS was not found.

You also receive this error if the objects in the exported dump file never had statistics generated for them.

Including Only Specific Objects in an Export File

Use the INCLUDE parameter to include only certain database objects in the export file. The following example only exports the procedures and functions that a user owns:

$ expdp darl/foo dumpfile=proc.dmp directory=datapump include=procedure,function

The proc.dmp file that is created only contains the DDL required to re-create any procedures and functions the user owns.

When using INCLUDE, you can also specify that only specific PL/SQL objects should be exported:

$ expdp darl/foo directory=datapump dumpfile=ss.dmp
include=function:"='IS_DATE'"

When you're exporting only specific PL/SQL objects, I recommend using a parameter file because of the issues with having to escape quote marks on the operating-system command line. When you use a parameter file, you don't have to escape the quote marks. The following shows the contents of a parameter file that exports specific objects:

directory=datapump
dumpfile=pl.dmp
include=function:"='ISDATE'",procedure:"='DEPTREE_FILL'"

If you specify an object that doesn't exist, Data Pump throws an error but continues with the export operation:

ORA-39168: Object path FUNCTION was not found.

Exporting Table, Index, Constraint, and Trigger DDL

Suppose you want to export the DDL associated with tables, indexes, constraints, and triggers in your database. To do this, use the FULL export mode and only include tables:

$ expdp darl/foo directory=dp dumpfile=phredstg.dmp content=metadata_only full=y
include=table

When you export an object, Data Pump also exports any dependent objects. So, when you export a table, you also get indexes, constraints, and triggers associated with the table.

Excluding Objects from Import

In general, you can use the same techniques used to filter objects in exports to exclude objects from being imported. Use the EXCLUDE parameter to exclude objects from an import. For example, to exclude triggers and procedures from being imported:

$ impdp darl/foo dumpfile=d.dmp directory=dp exclude=TRIGGER,PROCEDURE

You can further refine what is excluded by adding a SQL clause. For example, say you don't want to import triggers whose names begin with the letter B. Here's what the parameter file looks like:

userid=darl/foo
directory=dp
dumpfile=h.dmp
schemas=HEERA
exclude=trigger:"like 'B%'"

Including Objects in Import

You can use the INCLUDE parameter to reduce what is imported. Suppose you have a schema from which you want to import tables that begin with the letter A. Here's the parameter file:

userid=darl/foo
directory=dp
dumpfile=h.dmp
schemas=HEERA
include=table:"like 'A%'"

If you place the previous text in a file named h.par, then the parameter file can be invoked as follows:

$ impdp parfile=h.par

In this example, the HEERA schema must already exist. Only tables that start with the letter A are imported.

Common Data Pump Tasks

This section contains common features you can use with Data Pump. Many of these features are standard parts of Data Pump, such as creating a consistent export or taking action when imported objects already exist in the database. Other features, such as compression and encryption, require the Enterprise Edition of Oracle and/or an extra license. I'll point out these requirements (if relevant) for the Data Pump element being described.

Creating a Consistent Export

A consistent export means that all data in the export file is consistent as of a point in time or a system change number (SCN). When you're exporting an active database with many parent-child tables, you should ensure that you get a consistent snapshot of the data. You create a consistent export by using either the FLASHBACK_SCN or FLASHBACK_TIME parameter.

This example uses the FLASHBACK_SCN parameter to take an export. To determine the current SCN of your database, issue this query:

SQL> select current_scn from v$database;

Here's some typical output:

     CURRENT_SCN
----------------
   8400741902387

Next, export using the FLASHBACK_SCN parameter:

$ expdp darl/foo directory=dp flashback_scn=8400741902387

The previous export command ensures that all data exported is consistent with any committed transactions in the database as of SCN 8400741902387.

When you use the FLASHBACK_SCN parameter, Data Pump ensures that the data in the export file is consistent as of the specified SCN. This means any transactions committed after the specified SCN aren't included in the export file.
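
If you want the SCN that corresponds to a recent point in time rather than the current SCN, one option is the TIMESTAMP_TO_SCN function (the mapping is approximate and only works for relatively recent times); for example:

SQL> select timestamp_to_scn(systimestamp - interval '10' minute) from dual;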

Note  If you use the NETWORK_LINK parameter in conjunction with FLASHBACK_SCN, then the export is taken with the SCN consistent with the database referenced in the database link.

You can also use FLASHBACK_TIME to specify that the export file should be created with consistent committed transactions as of a specified time. When using FLASHBACK_TIME, Oracle determines the SCN that most closely matches the time specified and uses that to produce an export consistent with that SCN. The syntax for using FLASHBACK_TIME is as follows:

FLASHBACK_TIME="TO_TIMESTAMP(<value>)"

For some operating systems, double quotes directly on the command line must be escaped with a backslash character (\), because the operating system treats the double quotes as special characters. Due to issues regarding how operating systems treat quote marks, it's much more straightforward to use a parameter file. Here are the contents of a parameter file that uses FLASHBACK_TIME:

directory=datapump
content=metadata_only
dumpfile=exp.dmp
flashback_time="to_timestamp('26-oct-2009 07:03:00','dd-mon-yyyy hh24:mi:ss')"

Depending on your operating system, the command-line version of the previous example must be specified as follows:

flashback_time=\"to_timestamp('26-oct-2009 07:03:00',
'dd-mon-yyyy hh24:mi:ss')\"

This line of code should be specified on one line. It's wrapped on two lines in this book to fit on the page.

You can't specify both FLASHBACK_SCN and FLASHBACK_TIME when taking an export; these two parameters are mutually exclusive. If you attempt to use both parameters at the same time, Data Pump throws the following error message and halts the export job:

ORA-39050: parameter FLASHBACK_TIME is incompatible with parameter FLASHBACK_SCN

Note  The FLASHBACK_SCN and FLASHBACK_TIME parameters are only applicable to the Oracle Flashback Query functionality. These parameters aren't applicable to Flashback Database or Flashback Drop.

Importing When Objects Already Exist

When exporting and importing data, you often import into a schema that already has the objects created (tables, indexes, and so on). In this situation, you should import the data but instruct Data Pump not to try to create objects that already exist.

You can achieve this with the TABLE_EXISTS_ACTION and CONTENT parameters. The next example instructs Data Pump to append data to any tables that already exist via the TABLE_EXISTS_ACTION=APPEND option. It also uses the CONTENT=DATA_ONLY option, which instructs Data Pump not to run any DDL to create objects (only to load data):

$ impdp darl/foo directory=dk dumpfile=inv.dmp table_exists_action=append
content=data_only

Existing objects aren't modified in any way, and any new data that exists in the dump file is inserted into any tables.

You may wonder what happens if you use the TABLE_EXISTS_ACTION option but don't combine it with CONTENT=DATA_ONLY:

$ impdp darl/foo directory=dk dumpfile=inv.dmp table_exists_action=append

The only difference is that Data Pump attempts to run the DDL commands to create objects that already exist. This doesn't stop the job from running, but you see a message in the output indicating that the object already exists. Here's a snippet of the output for the previous command:

ORA-39152: Table "INV_MGMT"."INV" exists. Data will be appended to existing table
but all dependent metadata will be skipped due to table_exists_action of append

The default for the TABLE_EXISTS_ACTION parameter is SKIP, unless you also specify the parameter of CONTENT=DATA_ONLY. If you use CONTENT=DATA_ONLY, then the default for TABLE_EXISTS_ACTION is APPEND.

The TABLE_EXISTS_ACTION parameter takes the following options:

  • SKIP (default if not combined with CONTENT=DATA_ONLY)
  • APPEND (default if combined with CONTENT=DATA_ONLY)
  • REPLACE
  • TRUNCATE

The SKIP option tells Data Pump not to process the object if it exists. The APPEND option instructs Data Pump not to delete existing data, but rather to add data to the table without modifying any existing data. The REPLACE option instructs Data Pump to drop and re-create objects; this option isn't valid when the CONTENT parameter is used with the DATA_ONLY option. The TRUNCATE option instructs Data Pump to delete rows from tables via a TRUNCATE statement and then load the data.
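
For example, here's a sketch of refreshing existing tables by truncating them and then loading only the data from the dump file:

$ impdp darl/foo directory=dk dumpfile=inv.dmp table_exists_action=truncate
content=data_only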

The CONTENT parameter takes the following options:

  • ALL (default)
  • DATA_ONLY
  • METADATA_ONLY

The ALL option instructs Data Pump to load both data and metadata contained in the dump file; this is the default behavior. The DATA_ONLY option instructs Data Pump to load only table data into existing tables; no database objects are created. The METADATA_ONLY option only creates objects; no data is loaded.

Note  With the old imp utility, you can instruct the import process to ignore existing objects via the ignore=y option.

Renaming a Table

Starting with Oracle Database 11g, you have the option of renaming a table during import operations. There are many reasons you may want to rename a table when importing it. For example, you may have a table in the target schema that has the same name as the table you want to import. You can rename a table when importing by using the REMAP_TABLE parameter. This example imports the INV table owned by the HEERA user and renames it to INVEN (still owned by HEERA):

$ impdp darl/foo directory=dk dumpfile=inv.dmp tables=heera.inv
remap_table=heera.inv:inven

Here's the general syntax for renaming a table:

REMAP_TABLE=[schema.]old_tablename[.partition]:new_tablename

or

REMAP_TABLE=[schema.]old_tablename[:partition]:new_tablename

Notice that this syntax doesn't allow you to rename a table into a different schema. If you're not careful, you may attempt to do the following (thinking that you're moving a table and renaming it in one operation):

$ impdp darl/foo directory=dk dumpfile=inv.dmp tables=heera.inv
remap_table=heera.inv:scott.inven

You end up with a table in the HEERA schema named SCOTT.INVEN. That can be confusing. If you want to import a table into a different schema and rename it at the same time, use REMAP_SCHEMA along with the REMAP_TABLE parameter:

$ impdp darl/foo directory=dk dumpfile=inv.dmp remap_schema=heera:darl
tables=heera.inv remap_table=heera.inv:invent

Note  The process of renaming a table wasn't entirely bug-free in Oracle Database 11g release 1. It's been corrected in Oracle Database 11g release 2. See My Oracle Support Note 886762.1 for more details.

Remapping Data

Starting with Oracle Database 11g, when either exporting or importing, you can apply a PL/SQL function to alter a column value. For example, you may have an auditor who needs to look at the data, and one requirement is that you apply a simple obfuscation function to sensitive columns. The data doesn't need to be encrypted; it just needs to be changed enough that the auditor can't readily determine the value of the LAST_NAME column in the CUSTOMERS table.

This example first creates a simple package that is used to obfuscate the data:

create or replace package obfus is
  function obf(clear_string varchar2) return varchar2;
  function unobf(obs_string varchar2) return varchar2;
end obfus;
/
--
create or replace package body obfus is
  fromstr varchar2(62) := '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ' ||
             'abcdefghijklmnopqrstuvwxyz';
  tostr varchar2(62)   := 'defghijklmnopqrstuvwxyzabc3456789012' ||
             'KLMNOPQRSTUVWXYZABCDEFGHIJ';
--
function obf(clear_string varchar2) return varchar2 is
begin
  return translate(clear_string, fromstr, tostr);
end obf;
--
function unobf(obs_string varchar2) return varchar2 is
begin
  return translate(obs_string, tostr, fromstr);
end unobf;
end obfus;
/

Now, when you import the data into the database, you apply the obfuscation function to the LAST_NAME column of the CUSTOMERS table:

$ impdp darl/foo directory=dp dumpfile=cust.dmp tables=customers  
remap_data=customers.last_name:obfus.obf

Selecting LAST_NAME from CUSTOMERS shows that it has been imported in an obfuscated manner:

SQL> select last_name from customers;
LAST_NAME
------------------
yYZEJ
tOXXSMU
xERX

You can manually apply the package's UNOBF function to see the real values of the column:

SQL> select obfus.unobf(last_name) from customers;
OBFUS.UNOBF(LAST_NAME)
-------------------------
Lopuz
Gennick
Kuhn
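
Because REMAP_DATA is also valid on export, you could instead obfuscate the data as it's written to the dump file; here's a sketch that uses the same package on the export side:

$ expdp darl/foo directory=dp dumpfile=cust.dmp tables=customers
remap_data=customers.last_name:obfus.obf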

Cloning a User

Suppose you need to move a user's objects and data to a new database. As part of the migration, you want to rename the user. First, create an export file that contains the user you want to clone. In this example, the user name is INV:

$ expdp darl/foo directory=dp schemas=inv dumpfile=inv.dmp

Now you can use Data Pump import to clone the user. You can do this in the database from which the user was exported or in a different database. Use the REMAP_SCHEMA parameter to create a copy of a user. In this example, the INV user is cloned to the INV_DW user:

$ impdp darl/foo directory=dp remap_schema=inv:inv_dw dumpfile=inv.dmp

This command copies all structures and data in the INV user to the INV_DW user. The resulting INV_DW user is identical in terms of objects to the INV user. The duplicated schema also contains the same password as the schema from which it was copied.

If you just want to duplicate the metadata from one schema to another, use the CONTENT parameter with the METADATA_ONLY option:

$ impdp darl/foo remap_schema=inv:inv_dw content=metadata_only dumpfile=inv.dmp

The REMAP_SCHEMA parameter provides an efficient way to duplicate a schema with or without the data. During a schema duplication operation, if you want to change the tablespace in which the objects reside, also use the REMAP_TABLESPACE parameter. This allows you to duplicate a schema and also place the objects in a different tablespace than the source objects' tablespaces.

You can also duplicate a user from one database to another without first creating a dump file. To do this, use the NETWORK_LINK parameter. See the prior section in this chapter on Exporting and Importing Directly Across the Network for details on copying data directly from one database to another.

Suppressing a Log File

By default, Data Pump creates a log file when generating an export or an import. If you know that you don't want a log file generated, you can suppress it by specifying the NOLOGFILE parameter. Here's an example:

$ expdp heera/foo directory=dk tables=inv nologfile=y

If you choose to not create a log file, Data Pump still displays status messages on the output device. In general, I recommend that you create a log file with every Data Pump operation. This gives you an audit trail of your actions.

Using Parallelism

Use the PARALLEL parameter to parallelize a Data Pump job. For example, if you know you have four CPUs on a box, and you want to set the degree of parallelism to 4, use PARALLEL as follows:

$ expdp darl/foo parallel=4 dumpfile=exp.dmp directory=datapump full=y

To take full advantage of the parallel feature, ensure that you specify multiple files when exporting. The following example creates one file for each thread of parallelism:

$ expdp darl/foo parallel=4 dumpfile=exp1.dmp,exp2.dmp,exp3.dmp,exp4.dmp

You can also use the %U substitution variable to instruct Data Pump to automatically create dump files to match the degree of parallelism. The %U variable starts at the value 01 and increments as additional dump files are allocated. This example uses the %U variable:

$ expdp darl/foo parallel=4 dumpfile=exp%U.dmp

Now, say you need to import from the dump files created from an export. You can either individually specify the dump files or, if the dump files were created with the %U variable, use that on import:

$ impdp darl/foo parallel=4 dumpfile=exp%U.dmp

When you use the %U substitution variable, the import process in this example starts by looking for a file named exp01.dmp, then exp02.dmp, and so on.

images Tip Oracle recommends that the degree of parallelism not be set to more than two times the number of CPUs available on the server.

You can also modify the degree of parallelism while the job is running. First, attach in the interactive command mode to the job for which you want to modify the degree of parallelism. Then, use the PARALLEL option. In this example, the job attached to is SYS_IMPORT_TABLE_01:

$ impdp darl/foo attach=sys_import_table_01
Import> parallel=6

You can check the degree of parallelism via the STATUS command:

Import> status

Here's some sample output:

Job: SYS_IMPORT_TABLE_01                                                         
  Operation: IMPORT                                                              
  Mode: TABLE                                                                    
  State: EXECUTING                                                               
  Bytes Processed: 0                                                             
  Current Parallelism: 6

images Note The PARALLEL feature is only available in the Enterprise Edition of the Oracle Database.

Specifying Additional Dump Files

If you run out of space in the primary dump-file location, then you can specify additional dump-file locations on the fly. Use the ADD_FILE command from the interactive command prompt. Here's the basic syntax for adding additional files:

ADD_FILE=[directory_object:]file_name [,...]

This example adds another output file to an already existing Data Pump export job:

Export> add_file=alt2.dmp

You can also specify a separate database-directory object:

Export> add_file=alt_dir:alt3.dmp

Reusing Output File Names

By default, Data Pump doesn't overwrite an existing dump file. For example, the first time you run this job, it will run fine because there is no dump file named inv.dmp in the directory being used:

$ expdp heera/foo directory=dk dumpfile=inv.dmp

If you attempt to run the previous command again with the same directory and the same dump-file name, this error is thrown:

ORA-39000: bad dump file specification
ORA-31641: unable to create dump file "/home/oracle/dp/inv.dmp"
ORA-27038: created file already exists

You can either specify a new dump-file name for the export job or use the REUSE_DUMPFILES parameter to direct Data Pump to overwrite an existing dump file. For example:

$ expdp heera/foo directory=dk dumpfile=inv.dmp reuse_dumpfiles=y

You can now run the Data Pump export even if a dump file with the same name already exists in the output directory. When you set REUSE_DUMPFILES to a value of y, if Data Pump finds a dump file with the same name, it overwrites the file.

images Note The default value for REUSE_DUMPFILES is n. The REUSE_DUMPFILES parameter is available only in Oracle Database 11g or higher.

Creating a Daily DDL File

Sometimes, in database environments, changes occur to database objects in unexpected ways. You may have a developer who somehow obtains the production user passwords and decides to make a change on the fly without telling anybody. Or a DBA may decide not to follow the standard release process and make a change to an object while troubleshooting an issue. These scenarios can be frustrating for production-support DBAs. Whenever there's an issue, the first question raised is, “What changed?”

When you use Data Pump, it's fairly simple to create a file that contains all the DDL to re-create every object in your database. You can instruct Data Pump to export or import just the metadata via the CONTENT=METADATA_ONLY option.

For instance, in a production environment you can set up a daily job to capture this DDL. If there's ever a question about what changed and when, you can compare the DDL in the daily dump files to pinpoint the change.

Listed next is a simple shell script that first exports the metadata content from the database and then uses Data Pump import to create a DDL file from that export:

#!/bin/bash
# source OS variables
. /var/opt/oracle/oraset DWREP
#
DAY=$(date +%Y_%m_%d)
SID=DWREP
#---------------------------------------------------
# First create export dump file with metadata only
expdp darl/foo dumpfile=${SID}.${DAY}.dmp content=metadata_only \
directory=dwrep_dp full=y logfile=${SID}.${DAY}.log
#---------------------------------------------------
# Now create a DDL file from the export dump file.
impdp darl/foo directory=dwrep_dp dumpfile=${SID}.${DAY}.dmp \
sqlfile=${SID}.${DAY}.sql logfile=${SID}.${DAY}.sql.log
#
exit 0

This code listing depends on a database-directory object being created that points to where you want the daily dump file to be written. You may also want to set up another job that periodically deletes any files older than a certain amount of time.
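
A cleanup job can be as simple as a find command scheduled in cron. The following is only a sketch; the directory path and the seven-day retention period are assumptions you'd adjust for your environment:

# remove dump and log files older than seven days (path and retention are examples)
$ find /ora01/dwrep_dp -type f -name "DWREP*" -mtime +7 -exec rm {} \;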

Compressing Output

When you use Data Pump to create large files, you should consider compressing the output. As of Oracle Database 11g, the COMPRESSION parameter can be one of the following values: ALL, DATA_ONLY, METADATA_ONLY, or NONE. If you specify ALL, then both data and metadata are compressed in the output. This example exports one table and compresses both the data and metadata in the output file:

$ expdp dbauser/foo tables=locations directory=datapump \
dumpfile=compress.dmp compression=all

If you're using Oracle Database 10g, then the COMPRESSION parameter only has the METADATA_ONLY and NONE values.
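
For example, to compress just the metadata, a run might look like this (same user, table, and directory object as the previous example; the dump-file name compress_meta.dmp is arbitrary):

$ expdp dbauser/foo tables=locations directory=datapump \
dumpfile=compress_meta.dmp compression=metadata_only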

images Note The ALL and DATA_ONLY options of the COMPRESSION parameter require a license for the Oracle Advanced Compression option.

Encrypting Data

One potential security issue with Data Pump dump files is that anybody with operating-system access to the output file can search for strings in the file. On Linux/Unix systems, you can do this with the strings command:

$ strings inv.dmp | grep -i secret

Here's the output for this particular dump file:

Secret Data<
top secret data<
corporate secret data<

The strings command lets you view the contents of the dump file because the data is stored in plain text and isn't encrypted. If you require that the data be secured, you can use Data Pump's encryption features.

Data Pump lets you easily encrypt the output of a dump file. This example uses the ENCRYPTION parameter to secure all data and metadata in the output:

$ expdp darl/foo encryption=all directory=dp dumpfile=inv.dmp

For this command to work, your database must have an encryption wallet in place and open. See the Oracle Advanced Security Guide (available on OTN) for more details about how to create and open a wallet.

images Note The Data Pump ENCRYPTION parameter requires that you use the Enterprise Edition of Oracle Database 11g or higher and also requires a license for the Oracle Advanced Security option.

The ENCRYPTION parameter takes the following options:

  • ALL
  • DATA_ONLY
  • ENCRYPTED_COLUMNS_ONLY
  • METADATA_ONLY
  • NONE

The ALL option enables encryption for both data and metadata. The DATA_ONLY option encrypts just the data. The ENCRYPTED_COLUMNS_ONLY option specifies that only columns encrypted in the database are written to the dump file in an encrypted format. The METADATA_ONLY option encrypts just metadata in the export file.
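
For instance, if you only need the table data encrypted, a minimal sketch looks like this (again assuming an encryption wallet is in place and open, as described earlier):

$ expdp darl/foo encryption=data_only directory=dp dumpfile=inv.dmp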

Monitoring Data Pump Jobs

When you have long-running Data Pump jobs, you should periodically check the status of the job to make sure it hasn't failed or become suspended. There are several ways to monitor the status of Data Pump jobs:

  • Screen output
  • Data Pump log file
  • Database alert log
  • Querying the status table
  • Querying data-dictionary views
  • Interactive command-mode status
  • Operating-system utilities, such as process status (ps)

The most obvious way to monitor a job is to view the status that Data Pump displays on the screen as the job is running. If you've disconnected from the command mode, then the status is no longer displayed on your screen. In this situation, you must use another technique to monitor a Data Pump job.

Data Pump Log File

By default, Data Pump generates a log file for every job. When you start a Data Pump job, it's a good practice to specify a log file name that is specific to that job:

$ impdp darl/foo directory=dp dumpfile=archive.dmp logfile=archive.log

This job creates a log file named archive.log in the directory referenced by the DP database-directory object. If you don't explicitly name a log file, Data Pump import creates one named import.log, and Data Pump export creates one named export.log.

images Note The log file contains the same information you see displayed interactively on your screen when running a Data Pump job.

Data-Dictionary Views

A quick way to determine whether a Data Pump job is running is to check the DBA_DATAPUMP_JOBS view for any job with a STATE of EXECUTING:

select
 job_name
,operation
,job_mode
,state
from dba_datapump_jobs;

Here's some sample output:

JOB_NAME                  OPERATION            JOB_MODE   STATE
------------------------- -------------------- ---------- ---------------
SYS_IMPORT_TABLE_04       IMPORT               TABLE      EXECUTING
SYS_IMPORT_FULL_02        IMPORT               FULL       NOT RUNNING

You can also view Data Pump session information by joining the DBA_DATAPUMP_SESSIONS view to V$SESSION:

select
 sid
,serial#
,username
,process
,program
from v$session s,
     dba_datapump_sessions d
where s.saddr = d.saddr;

Here's some sample output showing that several Data Pump sessions are in use:

      SID    SERIAL# USERNAME             PROCESS         PROGRAM
---------- ---------- -------------------- --------------- ----------------------
      1049       6451 STAGING              11306           oracle@xengdb (DM00)
      1058      33126 STAGING              11338           oracle@xengdb (DW01)
      1048      50508 STAGING              11396           oracle@xengdb (DW02)

Database Alert Log

If a job is taking much longer than you expected, look in the database alert log for any messages similar to this:

statement in resumable session 'SYS_IMPORT_SCHEMA_02.1' was suspended due to
ORA-01652: unable to extend temp segment by 64 in tablespace REG_TBSP_3

This indicates that a Data Pump import job is suspended and is waiting for space to be added to the REG_TBSP_3 tablespace. After you add space to the tablespace, the Data Pump job automatically resumes processing. By default, a Data Pump job waits two hours for space to be added.
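
While a statement is suspended, you can also query the DBA_RESUMABLE data-dictionary view to see the suspended operation and its associated error. Here's a minimal sketch:

select
 name
,status
,error_msg
from dba_resumable;  -- suspended Data Pump statements show a status of SUSPENDED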

images Note In addition to writing to the alert log, for each Data Pump job, Oracle creates a trace file in the ADR_HOME/trace directory. This file contains information such as the session ID and when the job started. The trace file is named with the following format: <SID>_dm00_<process_ID>.trc.

Status Table

Every time you start a Data Pump job, a status table is automatically created in the account of the user running the job. The table name depends on the type of job you're running. The table is named with the format SYS_<OPERATION>_<JOB_MODE>_NN, where OPERATION is either EXPORT or IMPORT. JOB_MODE can be FULL, SCHEMA, TABLE, TABLESPACE, and so on.

Here's an example of querying the status table for particulars about a currently running job:

select
 name
,object_name
,total_bytes/1024/1024 t_m_bytes
,job_mode
,state
,to_char(last_update, 'dd-mon-yy hh24:mi')
from SYS_IMPORT_TABLE_04
where state='EXECUTING';

Interactive Command-Mode Status

A quick way to verify that Data Pump is running a job is to attach in interactive command mode and issue a STATUS command. For example:

$ impdp staging/staging_xst attach=SYS_IMPORT_TABLE_04
Import> status

Here's some sample output:

Job: SYS_IMPORT_TABLE_04
  Operation: IMPORT
  Mode: TABLE
  State: EXECUTING
  Bytes Processed: 0
  Current Parallelism: 4

You should see a state of EXECUTING, which indicates that the job is actively running. Other items to inspect in the output are the number of objects processed and bytes processed. Those numbers should increase as the job progresses.

Operating-System Utilities

You can use the process status (ps) operating-system utility to display jobs running on the server. For example, you can search for master and worker processes as follows:

$ ps -ef | egrep 'ora_dm|ora_dw' | grep -v egrep

Here's some sample output:

  oracle 29871   717   5 08:26:39 ?          11:42 ora_dw01_STAGE
  oracle 29848   717   0 08:26:33 ?           0:08 ora_dm00_STAGE
  oracle 29979   717   0 08:27:09 ?           0:04 ora_dw02_STAGE

If you run this command multiple times, you should see the processing time (seventh column) increase for one or more of the jobs running. This is a good indicator that Data Pump is still executing and doing work.

Data Pump Legacy Mode

This feature is covered last in this chapter, but it's quite useful, especially if you're an old-school DBA. As of Oracle Database 11g release 2, Data Pump allows you to use the old exp and imp utility parameters. This is known as legacy mode. You don't have to do anything special to enable it: as soon as Data Pump detects a legacy parameter, it processes the parameter as if it came from the old exp/imp utilities. You can even mix legacy parameters with newer Data Pump parameters. For example:

$ expdp darl/foo consistent=y tables=inv directory=dk

In the output, Data Pump indicates that it has encountered legacy parameters and shows how it translated each one into Data Pump syntax. For the previous command, here's the output showing what the consistent=y parameter was translated into:

Legacy Mode Active due to the following parameters:
Legacy Mode Parameter: "consistent=TRUE" Location: Command Line, Replaced with:
"flashback_time=TO_TIMESTAMP('2010-07-01 08:10:20', 'YYYY-MM-DD HH24:MI:SS')"

This feature can be extremely handy, especially if you're familiar with the legacy syntax and wonder how it maps to Data Pump syntax.

I recommend that you try to use the newer Data Pump syntax whenever possible. However, you may run into situations where you have legacy exp/imp jobs and want to continue running the scripts as they are without any modifications.

images Note When Data Pump runs in legacy mode, it doesn't create an old exp/imp formatted file. Data Pump always creates a Data Pump file and can only read Data Pump files.

Data Pump Mapping to the exp Utility

If you're used to the old exp/imp parameters, you may initially be confused by some of the syntax semantics. However, after you use Data Pump, you'll find the newer syntax fairly easy to remember and use. Table 13–3 describes how the legacy export parameters map to Data Pump export.

In many instances, there isn't a one-to-one mapping. Often, Data Pump automatically provides features that used to require a parameter in the legacy utilities. For example, whereas you used to have to specify DIRECT=Y to get a direct path export, Data Pump automatically uses direct path whenever possible.
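
As a quick illustration, here's a legacy exp command followed by a roughly equivalent expdp command; the DP directory object is a placeholder and must already exist:

$ exp darl/foo owner=inv file=inv.dmp
$ expdp darl/foo schemas=inv directory=dp dumpfile=inv.dmp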

Table 13–3. Mapping of Old Export Parameters to Data Pump

Original exp Parameter    Similar Data Pump expdp Parameter
----------------------    ---------------------------------
BUFFER                    N/A
COMPRESS                  TRANSFORM
CONSISTENT                FLASHBACK_SCN or FLASHBACK_TIME
CONSTRAINTS               EXCLUDE=CONSTRAINTS
DIRECT                    N/A; Data Pump automatically uses direct path when possible
FEEDBACK                  STATUS in client output
FILE                      Database-directory object and DUMPFILE
GRANTS                    EXCLUDE=GRANT
INDEXES                   INCLUDE=INDEXES
LOG                       Database-directory object and LOGFILE
OBJECT_CONSISTENT         N/A
OWNER                     SCHEMAS
RECORDLENGTH              N/A
RESUMABLE                 N/A; Data Pump automatically provides this functionality
RESUMABLE_NAME            N/A
RESUMABLE_TIMEOUT         N/A
ROWS                      CONTENT=ALL
STATISTICS                N/A; Data Pump export always exports statistics for tables
TABLESPACES               TRANSPORT_TABLESPACES
TRANSPORT_TABLESPACE      TRANSPORT_TABLESPACES
TRIGGERS                  EXCLUDE=TRIGGER
TTS_FULL_CHECK            TRANSPORT_FULL_CHECK
VOLSIZE                   N/A; Data Pump doesn't support tape devices

Data Pump Mapping to the imp Utility

As with Data Pump export, Data Pump import often doesn't have a one-to-one mapping with the legacy utility parameters. Data Pump import automatically provides many features of the old imp utility. For example, COMMIT=Y isn't required, because Data Pump import automatically commits after each table is imported. Table 13–4 describes how legacy import parameters map to Data Pump import.
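
Similarly, here's a legacy imp command and a roughly equivalent impdp command (again, the DP directory object is a placeholder and must already exist):

$ imp darl/foo file=inv.dmp tables=inv ignore=y
$ impdp darl/foo directory=dp dumpfile=inv.dmp tables=inv table_exists_action=append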

Table 13–4. Mapping of Old Import Parameters to Data Pump

Original imp Parameter    Similar Data Pump impdp Parameter
----------------------    ---------------------------------
BUFFER                    N/A
CHARSET                   N/A
COMMIT                    N/A; Data Pump import automatically commits after each table is imported
COMPILE                   N/A; Data Pump import compiles procedures after they're created
CONSTRAINTS               EXCLUDE=CONSTRAINT
DATAFILES                 TRANSPORT_DATAFILES
DESTROY                   REUSE_DATAFILES=y
FEEDBACK                  STATUS in client output
FILE                      Database-directory object and DUMPFILE
FILESIZE                  N/A
FROMUSER                  REMAP_SCHEMA
GRANTS                    EXCLUDE=OBJECT_GRANT
IGNORE                    TABLE_EXISTS_ACTION with APPEND, REPLACE, SKIP, or TRUNCATE
INDEXES                   EXCLUDE=INDEXES
INDEXFILE                 SQLFILE
LOG                       Database-directory object and LOGFILE
RECORDLENGTH              N/A
RESUMABLE                 N/A; this functionality is automatically provided
RESUMABLE_NAME            N/A
RESUMABLE_TIMEOUT         N/A
ROWS=N                    CONTENT, with METADATA_ONLY or ALL
SHOW                      SQLFILE
STATISTICS                N/A
STREAMS_CONFIGURATION     N/A
STREAMS_INSTANTIATION     N/A
TABLESPACES               TRANSPORT_TABLESPACES
TOID_NOVALIDATE           N/A
TOUSER                    REMAP_SCHEMA
TRANSPORT_TABLESPACE      TRANSPORT_TABLESPACES
TTS_OWNERS                N/A
VOLSIZE                   N/A; Data Pump doesn't support tape devices

Summary

Data Pump is an extremely powerful and feature-rich tool. If you haven't used Data Pump much, then I implore you to take some time to read this chapter and work through the examples. This tool greatly simplifies tasks like moving users and data from one environment to another. You can export and import subsets of users, filter and remap data via SQL and PL/SQL, rename users and tablespaces, compress, encrypt, and parallelize, all with one command. It really is that powerful.

DBAs sometimes stick with the old exp/imp utilities because that's what they're familiar with (I'm occasionally guilty of this). If you're running Oracle Database 11g release 2, you can use the old exp/imp parameters and options directly from the command line. Data Pump translates these parameters on the fly to Data Pump–specific syntax. This feature nicely facilitates the migration from the old utilities to the new. For reference, I've also provided a mapping of the old exp/imp syntax and how it relates to Data Pump commands.

Although Data Pump is an excellent tool for moving database objects and data from one environment to another, sometimes you need to transfer large quantities of data to and from operating-system flat files. You use external tables to achieve this task. This is the topic of the next chapter in this book.
