Chapter 12. interMedia Application Performance

Identify Performance Needs and Goals

One of the first steps in planning the performance of a system is to identify the performance needs. It is also desirable to identify specific performance goals; it is difficult to reach goals that have not been defined.

Performance does not typically fall into one category, like response time. Performance may be an issue in many other areas of your media database application. Performance goals may include the following issues:

  • User response time

  • Time to back up the data

  • Time to restore operations after a failure

  • Amount of time to load data

Depending on your application, it is important to identify strategies to reach your performance goals. For example, a nightly full backup of the database may take too much time. A better strategy may be to do full backups on the weekend and incremental backups during the week.

Defining your performance goals will also allow you to invest your time in the most beneficial manner. Improving performance may entail:

  • Hardware changes

  • Infrastructure parameter changes (database, operating system, application server, etc.)

  • Applications changes

  • Process changes (backup methodology, when routine tasks are performed, etc.)

Tuning the Database

The goal of tuning the system global area (SGA) is to keep the correct amount of data in memory for optimal performance. Too small an SGA could mean excessive disk I/O activity to the database. Too large an SGA could lead to excessive paging and swapping of memory on your system. Since media data is larger than traditional relational data, it may be wise to consider increasing the physical memory so that your interMedia application can have a large SGA without increasing swapping and paging that can adversely affect the performance of your system. In general, increasing the default size of the SGA is necessary for interMedia applications to perform at their peak.

With Oracle 10g, distribution of SGA memory can be managed automatically. The commonly configured parameters include:

  • Shared pool (for PL/SQL and SQL execution)

  • Java pool (for Java execution state)

  • Large pool (for large allocations such as RMAN backup buffers)

  • Buffer cache

  • Streams pool

To turn on automatic shared memory management using Enterprise Manager, select the Management tab and then the Memory Parameters link. On the memory parameters page, there is a button to enable automatic shared memory management. Those who prefer to edit the initialization file can enable it by setting the SGA_TARGET initialization parameter to a nonzero value and setting the STATISTICS_LEVEL initialization parameter to TYPICAL or ALL. Note that if the initialization settings for the pool sizes are not zero, they are treated as minimum values. These can be set to the initial minimum size each pool needs to run your application.
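The equivalent parameter changes can be sketched from SQL*Plus as follows; the sizes shown are illustrative assumptions, not recommendations, and assume the instance uses a server parameter file:

```sql
-- Enable automatic shared memory management (sizes are illustrative)
ALTER SYSTEM SET statistics_level = TYPICAL SCOPE = BOTH;
ALTER SYSTEM SET sga_target = 800M SCOPE = BOTH;

-- With SGA_TARGET set, a nonzero pool size acts as a floor, not a
-- fixed size. interMedia uses Java heavily, so a minimum Java pool
-- can be worth setting.
ALTER SYSTEM SET java_pool_size = 64M SCOPE = BOTH;
```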

There are too many general performance issues to cover here, but a good start in tuning an Oracle database is to use its built-in automated tuning features. You can use the Automatic Workload Repository (AWR) to collect tuning information, and after a sample workload is performed or simulated, use the Automatic Database Diagnostic Monitor (ADDM) to analyze the information collected by the AWR for possible performance problems with the database.

One area of note in tuning an Oracle database for interMedia is the java_pool_size parameter. Since interMedia uses Java for much of its processing, this pool may need to be increased to process large media data.

To turn on AWR, log in to your database's Enterprise Manager as a SYSDBA user (such as SYS) and select the Administration tab. This will give you the screen shown in Figure 12.1 with the Automatic Workload Repository link.

Oracle Enterprise Manager

Figure 12.1. Oracle Enterprise Manager

On selection of the Automatic Workload Repository link, the screen shown in Figure 12.2 will be displayed.

Automatic Workload Repository Configuration

Figure 12.2. Automatic Workload Repository Configuration

This screen indicates that database performance data is being collected with Automatic Workload Repository. The main settable parameters are explained below:

  • Snapshot Retention. The length of time performance records are retained. Records older than this period are automatically removed.

  • Snapshot Interval. The interval at which snapshot records are captured.

  • Collection Level. This parameter indicates the level at which records are captured. This parameter can be changed from the initial parameters page, which you can navigate to by clicking on the collection-level setting. This setting can be set to BASIC, TYPICAL, or ALL.
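These settings can also be changed from PL/SQL with the DBMS_WORKLOAD_REPOSITORY package; the values below (30 days of retention, hourly snapshots, both in minutes) are only an example:

```sql
BEGIN
  DBMS_WORKLOAD_REPOSITORY.MODIFY_SNAPSHOT_SETTINGS(
    retention => 43200,   -- minutes: keep 30 days of snapshots
    interval  => 60);     -- minutes: take a snapshot every hour
END;
/
```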

Once Automatic Workload Repository is active, records are saved at every interval. You can also create a snapshot at any time you like, perhaps before and then after a sample workload, with the following PL/SQL:

BEGIN
  DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT();
END;
/

The retention period indicates how long to save the performance statistics. You will probably want the retention period to include your highest workload time. For example, if you have high system usage once per month at the end of the month, you will likely want to increase the retention period to one month.

Once you have statistics, you then can have them automatically analyzed by the Automatic Database Diagnostic Monitor (ADDM). This monitor will evaluate the statistics gathered by AWR and give specific plain text recommendations on improving performance.

The following is an example ADDM recommendation:

FINDING 3: 18% impact (396 seconds)
-----------------------------------
The SGA was inadequately sized, causing additional I/O or hard parses.

   RECOMMENDATION 1: DB Configuration, 7% benefit (157 seconds)
      ACTION: Increase the size of the SGA by setting the parameter
         "sga_target" to 345 M.

   ADDITIONAL INFORMATION:
      The value of parameter "sga_target" was "276 M" during the
analysis
      period.

   SYMPTOMS THAT LED TO THE FINDING:
      SYMPTOM: Wait class "User I/O" was consuming significant
database time.
               (21% impact [472 seconds])
      SYMPTOM: Hard parsing of SQL statements was consuming
significant
               database time. (17% impact [390 seconds])

Some ways to run ADDM reports using Enterprise Manager are:

  • On the database home page click on the ADDM Findings to see the current findings. To see the recommendations for the finding, click on the finding on the findings page.

  • On the database performance page, click on Run ADDM Now to create a new snapshot and run ADDM on the period from the previous snapshot to the new one.

  • Click on Advisor Central under Related Links. Click on ADDM and select Run ADDM to analyze past instance performance. Select the beginning and ending snapshot period. Click OK and on the next page click View Report.

You can also create ADDM reports from PL/SQL. Since AWR is on by default, it is simple to create a report this way if you are more comfortable doing so. To create a report, log in to SQL*Plus as a SYSDBA user and run the following script provided with Oracle:

SQL> @$ORACLE_HOME/rdbms/admin/addmrpt.sql

Current Instance
~~~~~~~~~~~~~~~~
   DB Id    DB Name      Inst Num Instance
----------- ------------ -------- ------------
  550525798 ORCL2               1 orcl2

Instances in this Workload Repository schema
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

   DB Id     Inst Num DB Name      Instance     Host
------------ -------- ------------ ------------ ------------
* 550525798         1 ORCL2        orcl2        dbhost

Using  550525798 for database Id
Using          1 for instance number

Specify the number of days of snapshots to choose from
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Entering the number of days (n) will result in the most recent
(n) days of snapshots being listed.  Pressing <return> without
specifying a number lists all completed snapshots.

Listing the last 3 days of Completed Snapshots
                                                         Snap
Instance     DB Name        Snap Id    Snap Started     Level
------------ ------------ --------- ------------------ -----
orcl2        ORCL2             3762 14 Mar 2006 00:00       1
                               3763 14 Mar 2006 01:00       1
                               3764 14 Mar 2006 02:00       1
                                :
                                :
                               3809 16 Mar 2006 08:00       1


Specify the Begin and End Snapshot Ids
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Enter value for begin_snap: 3762
Begin Snapshot Id specified: 3762

Enter value for end_snap: 3809
End   Snapshot Id specified: 3809

Specify the Report Name
~~~~~~~~~~~~~~~~~~~~~~~
The default report file name is addmrpt_1_3762_3809.txt.  To use this name,
press <return> to continue, otherwise enter an alternative.

Enter value for report_name:

Using the report name addmrpt_1_3762_3809.txt

Running the ADDM analysis on the specified pair of snapshots ...

Generating the ADDM report for this analysis ...

          DETAILED ADDM REPORT FOR TASK 'TASK_16104' WITH ID 16104
          --------------------------------------------------------
        Analysis Period: from 13-MAR-2006 11:41 to 15-MAR-2006 11:00
   Database ID/Instance: 550525798/1
Database/Instance Names: ORCL2/orcl2
              Host Name: dbhost
      Database Version: 10.2.0.1.0
        Snapshot Range: from 3762 to 3809
         Database Time: 2253 seconds
 Average Database Load: 0 active sessions

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

FINDING 1: 100% impact (2253 seconds)
-------------------------------------
Significant virtual memory paging was detected on the host operating system.

   RECOMMENDATION 1: Host Configuration, 100% benefit (2253 seconds)
      ACTION: Host operating system was experiencing significant paging but no
         particular root cause could be detected. Investigate processes that
         do not belong to this instance running on the host that are consuming
         significant amount of virtual memory. Also consider adding more
         physical memory to the host.

        :
        :
(More findings and recommendations)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

          ADDITIONAL INFORMATION
          ----------------------
Wait class "Application" was not consuming significant database time.
Wait class "Commit" was not consuming significant database time.
Wait class "Configuration" was not consuming significant database time.
Wait class "Network" was not consuming significant database time.
Session connect and disconnect calls were not consuming significant database
time.

The database's maintenance windows were active during 33% of the analysis
period.

The analysis of I/O performance is based on the default assumption that the
average read time for one database block is 8000 micro-seconds.
An explanation of the terminology used in this report is available when you
run the report with the 'ALL' level of detail.

End of Report
Report written to addmrpt_1_3762_3809.txt

Note that ADDM will also give recommendations on other aspects of your application and the system you are running on. For example, if there is excessive swapping or paging, it will recommend you try to find and eliminate programs that are not being used. Some example recommendations are:

FINDING 1: 100% impact (2253 seconds)
-------------------------------------
Significant virtual memory paging was detected on the host operating system.

   RECOMMENDATION 1: Host Configuration, 100% benefit (2253 seconds)
     ACTION: Host operating system was experiencing significant paging but no
        particular root cause could be detected. Investigate processes that
        do not belong to this instance running on the host that are consuming
        significant amount of virtual memory. Also consider adding more
        physical memory to the host.

Another recommendation may relate to the SQL code you are using (either directly or through a database connection). This recommendation may look like the following:

FINDING 7: 2% impact (157 seconds)
----------------------------------
Time spent on the CPU by the instance was responsible for a substantial part
of database time.

   RECOMMENDATION 1: SQL Tuning, 2% benefit (157 seconds)
      ACTION: Tune the PL/SQL block with SQL_ID "7vd7gcsk3wkzs". Refer to the
        "Tuning PL/SQL Applications" chapter of Oracle's "PL/SQL User's Guide
         and Reference"
         RELEVANT OBJECT: SQL statement with SQL_ID 7vd7gcsk3wkzs
         DECLARE
           thmb ORDImage;
           dicomMD XMLType;
           ctx raw(1) := null;
         BEGIN
           load_dicom_image(1, 'SIRONA2.dcm');
           SELECT thumb, dicomMeta INTO thmb, dicomMD FROM diagnosticImages
             WHERE id = 1;
           putMetaToFile(dicomMD, 'DICOMIMAGEDIR', 'dicomMeta.xml');
           thmb.export(ctx, 'file', 'DICOMIMAGEDIR', 'dicomThumb.jpg');
           for i in 2..200 loop
             load_dicom_image(i, 'SIRONA2.dcm');
             SELECT thumb, dicomMeta INTO thmb, dicomMD FROM diagnosticImages
               WHERE id = i;
             commit;
           end loop;
         END;

With Enterprise Manager there is a link you can click on to view the SQL source being called out, so you can see the PL/SQL code in question. Note that this may not indicate a problem, but it shows which PL/SQL code is being used the most, so you can concentrate your tuning effort there.

Creating Tables with Media Data

In many cases, if not most, the best performance for applications that store media data in the database (in BLOBs) is achieved when the media data is stored in its own tablespace. First, a tablespace must be created to store the media data. It is best to make this a large tablespace with large extent parameters so that the tablespace is extended rarely, if at all.

From a user connected as SYSDBA, the following example creates such a tablespace:

SQL> CREATE TABLESPACE RAWIMG DATAFILE 'images.tbs'
     SIZE 40000M AUTOEXTEND ON NEXT 5000M;

Now that a tablespace is created, a normal user can create a table that stores the raw image data into the tablespace.

SQL> CREATE TABLE images (imageID INTEGER ,image
ORDSYS.ORDImage)
     LOB (image.source.localData) STORE AS
            (
            TABLESPACE RAWIMG
            CHUNK 32K
            NOCACHE NOLOGGING
            );

In this example, we also specify NOCACHE and NOLOGGING.

The NOCACHE parameter indicates the image should not be cached in the SGA. This is a typical setting for media, unless it is expected that the image will be used frequently.

The NOLOGGING parameter specifies that raw image data should not be written to the REDO log. This increases the performance of inserts, and can increase the performance of bulk loading a table. Note that the lack of REDO logging makes it impossible for the administrator to fully restore the database from the REDO logs after a crash, so you may want to keep a copy of the bulk-loaded files or a copy of the tablespace in case of failure. You can turn REDO logging back on after the table is bulk loaded with the following SQL:

SQL> ALTER TABLE images MODIFY LOB (image.source.localData)
             ( NOCACHE LOGGING );

Another important parameter is CHUNK. This should be larger than the default. Because media data is larger than the data traditionally held in a database, it is more efficiently stored, written, and read when the chunk size is larger. Also note that Oracle maintains an index on each chunk of the LOB for random access to the LOB data; the larger the chunk size, the smaller the index overhead.

Other options to consider when creating a table are:

  • INITIAL and NEXT parameters. These storage parameters indicate the initial size of the BLOB and how large extents should be. These should be larger than the chunk size. These parameters should be used if you have a relatively uniform media size that you can guess beforehand.

  • PCTINCREASE parameter indicates the percent increase a BLOB should extend itself when the BLOB is being filled up piece by piece. This parameter should be set to zero. The default value is 50%, which will increase the BLOB size by 50% on each extension, which is usually too large an extension.

  • DISABLE STORAGE IN ROW parameter prevents the media from being stored inline within a row. This is almost always advisable, as large BLOB data stored in the row causes extra overhead in a full table scan. In any case, the largest segment that will be stored inline in a row is 4,000 bytes, and almost all media is larger than this. ENABLE STORAGE IN ROW is the default and should only be used if your media data is smaller than 4,000 bytes.

  • MAXEXTENTS sets the maximum number of extents for the BLOB. This should almost always be set to UNLIMITED to prevent fragmentation of the BLOB and to make sure the BLOB can extend itself to the maximum size necessary.
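Putting these options together, a table definition might look like the following sketch; the table name, tablespace, and sizes are hypothetical and should be adjusted to your expected media size:

```sql
-- Illustrative: LOB storage options combined in one definition
CREATE TABLE videos (videoID INTEGER, video ORDSYS.ORDVideo)
  LOB (video.source.localData) STORE AS
         (
         TABLESPACE RAWIMG
         CHUNK 32K
         DISABLE STORAGE IN ROW
         NOCACHE NOLOGGING
         STORAGE (INITIAL 10M NEXT 10M
                  PCTINCREASE 0 MAXEXTENTS UNLIMITED)
         );
```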

Distributing the I/O Load

Using a database to store media opens the door to database administration and load balancing techniques. Distributing I/O over a number of disks, so that a single disk or a small set of disks does not become a bottleneck, can involve various techniques:

  • Striping—distributing a database table I/O over a number of disks

  • Automatic storage management (ASM)—automatic management of database storage

Striping is a technique to distribute I/O over a number of disks. This is done by distributing the data among a number of physical disk drives. This can be done by hardware, the operating system, the database, or a combination of any of these. We will concentrate on database striping here.

For example, a database table can be designed to place new records in a table randomly among a set of tablespaces on multiple disks. This will increase the I/O throughput of the database application. The table is partitioned onto a number of tablespaces. In the following example some tablespaces are created as a SYSDBA user.

create tablespace mediats1 datafile 'C:\mediats1.tbs' size 10000m;
create tablespace mediats2 datafile 'D:\mediats2.tbs' size 10000m;
create tablespace regts1 datafile 'C:\regts1.tbs' size 10m;
create tablespace regts2 datafile 'D:\regts2.tbs' size 10m;
The mediats tablespaces are to be used to store large images, and the regts tablespaces to store the thumbnail images and the relational data. This can make for less fragmentation of the relational data. If we want, we can also configure the database to support larger block sizes, up to 32K, and apply these larger block sizes to a tablespace.
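As a sketch, applying a larger block size requires a matching buffer cache to be configured before such a tablespace can be created; the cache size and file name here are assumptions:

```sql
-- A 32K-block tablespace needs a 32K buffer cache configured first
ALTER SYSTEM SET db_32k_cache_size = 64M SCOPE = BOTH;

CREATE TABLESPACE mediats32k DATAFILE 'E:\mediats32k.tbs'
  SIZE 10000M BLOCKSIZE 32K;
```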

Now we can create a table that will be randomly distributed between the tablespaces:

CREATE TABLE photosPartitioned
                   (id          NUMBER PRIMARY KEY,
                    description VARCHAR2(40) NOT NULL,
                    location    VARCHAR2(40),
                    image       ORDSYS.ORDIMAGE,
                    thumb       ORDSYS.ORDIMAGE)
  LOB (image.source.localdata)
   STORE AS (disable storage in row nocache nologging chunk 32768)
  PARTITION BY HASH (id)
   (PARTITION p1 tablespace regts1
     LOB (image.source.localdata) store as (TABLESPACE mediats1),
    PARTITION p2 tablespace regts2
     LOB (image.source.localdata) store as (TABLESPACE mediats2)
   );

This table would now be split between disks C: and D:, and furthermore, the largest BLOBs would be in a separate tablespace from the relational data and thumbnail images.

Another option that is automatic, more flexible, and adapts by itself is automatic storage management (ASM). This will automate the distribution of I/O over a number of physical devices without intervention by a DBA. With ASM, the DBA creates a disk group from the operating system disks, and then database files are automatically created and managed by ASM.

ASM offers more than just striping information. It also provides an option to mirror data to help prevent data loss in the case of a disk failure.
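A minimal ASM sketch follows: a disk group is created once in the ASM instance, after which database files placed on it are striped (and here mirrored) automatically. The disk paths and disk group name are assumptions:

```sql
-- In the ASM instance (disk paths are illustrative)
CREATE DISKGROUP mediadg NORMAL REDUNDANCY
  DISK '/dev/rdsk/c1t1d0s4', '/dev/rdsk/c2t1d0s4';

-- In the database instance, a file on the disk group is named by
-- the disk group; ASM lays it out across the member disks itself
CREATE TABLESPACE mediats DATAFILE '+MEDIADG' SIZE 10000M;
```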

Load Performance

The loading of data into the database may be a major concern. Consider a bank that needs to store a huge number of check images a day and then make them available for users to view securely on the Web. In this instance, it may make sense to use transportable tablespaces. With transportable tablespaces, a database that is not online can bulk load check images into a tablespace or tablespace set. When the tablespace set is populated, it can simply be copied over to the online database and made part of the online check image table in a matter of seconds. Since each tablespace set is populated by time period, it is just as easy to remove it from the table when the period during which a check image can be viewed is exceeded. This is much quicker than deleting the items individually based on the time they were inserted into the database. The tablespace set can then be copied to an archive database if desired. An example of using transportable tablespaces is given in Chapter 6. This is an easy way to distribute the loading operation.

As pointed out before in this chapter, bulk loading performance can also be increased by turning off REDO logging when the data is being loaded. Remember that the downside of this is that there is no REDO log, so you may need another strategy, such as making a copy of the loaded tablespace or mirroring, to preserve your data in the case of computer storage failure.

Another way to increase bulk load performance from files is to partition a table among separate tablespaces on separate disks. The data fields can be on different disks as well. The bulk loading can then be done in parallel, with none of the physical disks contending with each other during the load operation for loading media (unless REDO logging is on, in which case there could be some contention for the REDO log).

For a parallel loading example, the following is done in a SYSDBA authorized account:

create or replace directory IMG_SRC1 as 'x:\images';
grant read on directory IMG_SRC1 to scott;
create or replace directory IMG_SRC2 as 'y:\images';
grant read on directory IMG_SRC2 to scott;

create tablespace mts1 datafile 'C:\mts1.tbs' size 100000m;
create tablespace mts2 datafile 'D:\mts2.tbs' size 100000m;

create tablespace ts1 datafile 'C:\ts1.tbs' size 100m;
create tablespace ts2 datafile 'D:\ts2.tbs' size 100m;

This creates two image source directories on two separate disk drives. It also creates two tablespaces on each of two separate disk drives: on each drive, one tablespace is for the relational data and one is for the media, both thumbnail and original image in this case.

The stage is now set to create the table in the SCOTT schema:

CREATE TABLE photosPart
                   (id          NUMBER PRIMARY KEY,
                    description VARCHAR2(40) NOT NULL,
                    location    VARCHAR2(40),
                    image        ORDSYS.ORDIMAGE,
                    thumb        ORDSYS.ORDIMAGE)
  LOB (image.source.localdata)
     STORE AS (disable storage in row nocache nologging chunk 32K)
  LOB (thumb.source.localdata)
     STORE AS (disable storage in row nocache nologging chunk 8K)
  PARTITION BY RANGE(id)
  (PARTITION p1 VALUES LESS THAN (1000000) TABLESPACE ts1
          LOB (image.source.localdata) store as (TABLESPACE mts1)
          LOB (thumb.source.localdata) store as (TABLESPACE mts1),
   PARTITION p2 VALUES LESS THAN (MAXVALUE) TABLESPACE ts2
          LOB (image.source.localdata) store as (TABLESPACE mts2)
          LOB (thumb.source.localdata) store as (TABLESPACE mts2)
  );

This table puts the BLOBs into the tablespaces created for the BLOBs, and the relational data into the tablespaces created for the relational data. Also, the table is partitioned by range, so that we can load the partitions on the two physical disks independently.

Now it is possible to load the table in parallel, without contention for one disk drive. Each of the following BEGIN ... END blocks would be performed in separate sessions executing at the same time. In a real application, the loading would load multiple images in each of the two sessions:

BEGIN
  load_part_image(1, 'Nice picture', 'Somewhere',
                     'IMG_SRC1', 'loon.jpg');
END;
/
BEGIN
  load_part_image(1000001, 'Nicer picture', 'Somewhere Else',
                     'IMG_SRC2', 'goldfinch.jpg');
END;
/

These same principles can be applied to other loading methodologies, such as using external tables. In this case, there would be multiple external tables whose data is on separate physical disks. A procedure for each external table would be used in separate sessions to load the data into the database.
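As a sketch of the external-table variant, each source disk gets its own external table over a list of file names; the file layout, the list file name, and the per-row loading procedure are all assumptions:

```sql
-- One external table per source disk; a second table, img_list2 over
-- IMG_SRC2, would be defined the same way
CREATE TABLE img_list1 (fname VARCHAR2(128))
  ORGANIZATION EXTERNAL (
    TYPE ORACLE_LOADER
    DEFAULT DIRECTORY IMG_SRC1
    ACCESS PARAMETERS (RECORDS DELIMITED BY NEWLINE
                       FIELDS (fname CHAR(128)))
    LOCATION ('images1.lst')
  );
```

Each parallel session would then iterate over its own external table, calling a loading procedure (hypothetical here) for each listed file, so that the two sessions never read from the same physical disk.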

Data stored in or retrieved from interMedia BLOBs is almost always read or written in its entirety at once (sometimes called check-in/check-out), so for the most part considerations about reading and writing pieces of a BLOB do not apply in interMedia applications. One noteworthy exception is when a streaming server streams an audio or video file from the database. However, since plug-ins for streaming servers are written for you, you may only need to consider this if you write your own streaming server plug-in.

Performance Tools

As mentioned before in this chapter, your best performance friend is the combination of Automatic Workload Repository (AWR) and Automatic Database Diagnostic Monitor (ADDM). Using these tools, especially in combination with Enterprise Manager, makes it easy to tune your database.

Other tools exist for optimizing PL/SQL code. If a particular piece of PL/SQL is identified by ADDM as consuming many resources, that code can be analyzed with the SQL Tuning Advisor.
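From PL/SQL, a tuning task for a statement flagged by ADDM can be sketched with the DBMS_SQLTUNE package; the SQL_ID below is the one from the earlier ADDM finding:

```sql
SET SERVEROUTPUT ON
DECLARE
  tname VARCHAR2(30);
BEGIN
  -- Create and run a tuning task for the flagged statement
  tname := DBMS_SQLTUNE.CREATE_TUNING_TASK(sql_id => '7vd7gcsk3wkzs');
  DBMS_SQLTUNE.EXECUTE_TUNING_TASK(tname);
  -- Print the advisor's report (a CLOB; very long reports may need
  -- to be spooled instead of printed with PUT_LINE)
  DBMS_OUTPUT.PUT_LINE(DBMS_SQLTUNE.REPORT_TUNING_TASK(tname));
END;
/
```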

Delivery on the Web

After any database issues have been resolved, it may then be time to look at the middle tier or application server. Again, your best friend here is Enterprise Manager.

Note that, as in many of the samples in this book, the most efficient delivery of data is data that is not delivered. Make use of the browser cache when possible. This is done by setting an Expires header, an ETag header, or the Cache-Control max-age directive in a media response, and by returning a NOT MODIFIED (304) HTTP response to an HTTP conditional request.
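If media is delivered through the PL/SQL gateway, the response headers can be set with the PL/SQL Web Toolkit. This is only a sketch: the one-day max-age is an assumption, and the procedure reads from the images table created earlier in this chapter:

```sql
CREATE OR REPLACE PROCEDURE deliver_image(p_id IN NUMBER) IS
  img BLOB;
BEGIN
  SELECT i.image.source.localData INTO img FROM images i
    WHERE i.imageID = p_id;
  owa_util.mime_header('image/jpeg', FALSE);  -- leave the header open
  htp.p('Cache-Control: max-age=86400');      -- let caches keep it a day
  owa_util.http_header_close;
  wpg_docload.download_file(img);             -- stream the BLOB
END;
/
```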

Another place to look for performance improvements on the Web is the use of Oracle Web Cache. The application can control how Oracle Web Cache, or any other cache along the HTTP path, handles media by setting the HTTP Cache-Control header field. The Cache-Control directives specify how, and whether, the media is cached. This offloads the handling of HTTP responses for media from the HTTP server that would otherwise deliver it.

Backup

Storing media in the database greatly simplifies making a consistent backup of a multimedia system. In earlier implementations, which used a combination of a database and media files, it was difficult to keep the metadata in the database consistent with the raw data in the multimedia files. In such a hybrid system there must be two backups, one for the database and one for the media files, and it could also be nearly impossible to do a consistent online backup while the system is running.

The database offers a myriad of strategies for backing up data and making this backup consistent and efficient for both full and incremental backups. Recovery Manager (RMAN) is used to make physical backups of a database. Logical backups of a database can be done with Oracle Data Pump. RMAN can be accessed through PL/SQL commands as well as through the Oracle Enterprise Manager GUI.

Some of the database facilities that can be used are:

  • Full and incremental backups.

  • Backup to disk or tape.

  • Applying an incremental backup to a full backup on disk to make a more up-to-date full backup.

  • Backup and recovery information can be stored on ASM-managed devices that can be mirrored for data loss protection and striped for better performance.

  • Block change tracking (BCT) to keep track of blocks that change so that incremental backups are made much faster.

  • The ability to partition data into transportable tablespaces based on time to allow for backing up tablespace set files directly.
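Several of these facilities can be combined in one strategy; as a sketch, block change tracking is enabled once from SQL (the tracking file path is an assumption):

```sql
-- Track changed blocks so incremental backups need not scan
-- the entire database
ALTER DATABASE ENABLE BLOCK CHANGE TRACKING
  USING FILE 'C:\oracle\bct.f';
```

From RMAN, BACKUP INCREMENTAL LEVEL 0 DATABASE would then take, say, a weekend full backup, and BACKUP INCREMENTAL LEVEL 1 DATABASE the nightly incrementals, matching the backup strategy suggested at the start of this chapter.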

Of course, multimedia information takes much more storage space than traditional data. As with any backup strategy, it should be tested and benchmarked to make sure the performance goals, in terms of how long it takes to back up the database and how long it takes to restore it, are met.

Summary

To get the best performance from your interMedia application, it is necessary to consider all aspects of the system. Loading data, backing up data, and application performance are all potential targets for tuning. It is important to tune the database with interMedia in mind, but the other layers, such as the application and mid-tier, should also be considered.

The database has automated techniques to help tune the database parameters to their optimal values. Tables can be designed to spread I/O over multiple physical disks so that a single physical device does not become a bottleneck.
