Clean Selected Data out of a Table

The above methods are workable if you want to delete all rows from a table. What about when you need to delete only some of the rows? Or you need to move some of the rows out of the table and archive them in another table. This is a common task in database management, and it is one that seems to be forgotten many times in the design of applications.

Programmers seem to spend a lot of time figuring out how to put data into tables, but they often do not spend quite as much time devising methods to remove obsolete data. In many ways, the database is a black hole to developers. They tend to think that they can continue to add data to it forever. But the DBA knows different. We have to clean these things up!

As a DBA, you should be a strong advocate for an entire life-cycle approach to designing your applications. Every time a developer approaches you with a request for a new table, you should inquire about how the data gets into the table and how it goes away. Get the developers to thinking about means of keeping the data current and getting rid of the trash. Data does not have to be kept online forever.

There are also significant technical problems with deleting specific rows from a table, especially a large, frequently used table. The SQL delete statement will place locks on the table being deleted from. The level of locking will be the lock level as established for the table unless an explicit table lock is placed on the table. If the locking granularity is either row or page, it is possible that a long delete will overflow your lock table. Such a delete is also considered a transaction and could possibly throw you into a long transaction situation with regard to your logfiles. The same problems apply to updates to tables and to saving rows to history tables. The history table problem can be broken down into two parts, inserting the row into a history table and deleting the row from the old table.

The major problem here is that the deletes, updates, or movements of data are done in one big transaction. Solving the problem requires that the tasks be broken into smaller manageable units.

The following 4GL program is called sample.4gl and it utilizes a concept I have called flutter locks. Flutter locks are locks that are placed and released on small units of data. As such, they are much less intrusive in day-to-day operations. When using programs similar to sample.4gl, you can have archiving and table cleanup programs that will peacefully coexist with an active user community.

Programs using flutter locks should pay careful attention to the concept of a transaction. A transaction is a group of statements that should be completed as a unit. If any of the parts are not completed, none of the statements is executed. For most deleting and archiving situations, these transaction units can be made at the row level. Either a row needs to be deleted or it doesn't. In an archive situation, the transaction should consist of two parts. First, copy the row to the new table. Second, delete the row from the old table. Most of the time you really don't need the very broad transaction units utilized by the DELETE or UPDATE SQL statements. These statements interpret the transaction as, "DELETE all of the rows covered by the WHERE clause or ROLLBACK all of the rows." If your application requirements really require such an all-or-nothing level of transactions, using a flutter lock program will not solve the problem. Such instances are rare, however.

In the case of an archive program in which you are moving rows based upon certain criteria, the world will not fall in if you miss a row or two. They'll get archived next time the program runs. One very important implication of using a flutter lock program for deleting or row archiving is that the jobs can be easily interrupted. Try interrupting a long SQL DELETE statement after it's worked for an hour, and you'll see about an hour of rollback. Even worse, imagine what happens if you've had a DELETE or UPDATE SQL statement going on for several hours and your system crashes. Your fast recovery on restart will be anything but fast. Your system will probably be down for hours recovering.

Using flutter locks allows you to perform this "transaction chopping" and to completely avoid most of these problems. I have chosen to demonstrate this in I-4GL rather than in ESQL/C. The same programming concepts are available in both tools.

The general concept of the program is to first open a cursor in which you select the rows on which you wish to operate. You may then wish to DELETE the data, archive the data to another table, or perform other SQL operations on the data. The key thing to note is that this outer cursor is opened using the parameter WITH HOLD. This causes the program to maintain its position within the list of rows that need to be acted upon.

For each row in the outer cursor, the program goes into the table, does its BEGIN WORK, locking the rows, does its DELETE or UPDATE statement, and does its COMMIT WORK. Then it goes out to the outer cursor again and picks its next row, etc. Without the hold cursor, each time it went out to the outer cursor, the SQL statement would have to be reevaluated and a new SELECT done.

When you get your rows from the cursor, you can do any amount of work you wish in the inner sections. An archive requires an INSERT into another table and a DELETE from the old table, but you can perform much more complicated tasks within the inner loop.

Just remember to think about keeping the actual work within the transaction to an absolute minimum. Also, it is necessary for you to closely monitor the success or failure of the jobs done within the transaction. If the status indicates that the work failed, you need to be sure to roll back the entire transaction explicitly.

A couple of things to note about the sample.4gl program. First, don't worry about trying to work through the outer_count and quitflag variables. They are in there so that you can group the transactions into bigger transactions, maybe committing after every 100 or 200 executions. There are cases in which the overhead cost of doing a BEGIN WORK/COMMIT WORK for every single row in the outer cursor would be prohibitive.

Rather than COMMITTING after every single, atomic transaction, this code allows you to group several hundred such atomic transactions together. Either they all succeed or they all fail. If you need the capability, it is available. I've found that committing after every atomic transaction does not create a problem with resources in most cases. Also, note that the 4GL program follows the PASCAL conventions of including comments within pairs of curly braces:

       { this is a comment }

joe 58> cat sample.4gl
DATABASE admin
GLOBALS
DEFINE s_order RECORD like order.*
DEFINE s_order_tran RECORD like order_tran.*
DEFINE counter INTEGER
DEFINE outer_count INTEGER
DEFINE   quitflag INTEGER
DEFINE    total INTEGER
END GLOBALS

MAIN
CALL startlog("d_order.errors")
ALTER TABLE order LOCK MODE (row)
IF status < 0 THEN
WHILE status < 0
SLEEP 2
ALTER TABLE order LOCK MODE (row)
END WHILE
END IF
{ comment:  all of this status checking is to be sure that the lock }
{           mode really gets changed. The row locking makes the program }
{           even less intrusive to users. }
{ NOTE:  this will cause short lockouts during operations. Make sure that }
{           this short lockout will not crash other users }
ALTER TABLE order_tran LOCK MODE (row)
IF status < 0 THEN
WHILE status < 0
SLEEP 2
ALTER TABLE order_tran LOCK MODE (row)
END WHILE
END IF
DECLARE drop_order CURSOR WITH HOLD FOR
select * from order where
due_date < current - 3 units month
AND
start_date < current - 3 units month
FOR UPDATE

{  Comment:  this is the outer cursor. The "with hold" is critical }
{            the "for update" is what allows you to do a delete inside }
{            the loop with a "current of cursor" clause }

DECLARE copy_order CURSOR WITH HOLD FOR
INSERT INTO order_hist VALUES (s_order.*)
{ doesn't always need to have "with hold" }
LET outer_count = 0
LET quitflag = 0
OPEN drop_order
OPEN copy_order
WHILE TRUE
LET counter = 0

WHILE counter < 1
{comment:  if you wanted to change your number of transactions per }
{          commit, change the 1 to however many rows you want to work }
{          between transactions. }

BEGIN WORK
{    this is the beginning of the transaction. Keep the transactions short }
FETCH drop_order into s_order.*
IF Status = NOTFOUND
THEN COMMIT WORK
LET quitflag = 1
EXIT WHILE
END IF
{ always check to see that the statement completes properly }
PUT copy_order
{ first, insert into the history table }

IF status < 0 THEN
ROLLBACK WORK
CALL errorlog("ERROR........SKIPPING")
EXIT WHILE
END IF
INSERT INTO order_tran_hist SELECT *
FROM order_tran
WHERE order_number = s_order.order_number
AND   order_type   = s_order.order_type
{ do a little more work with another table while we're in the loop }
IF status < 0 THEN
ROLLBACK WORK
CALL errorlog("ERROR........SKIPPING")
EXIT WHILE
END IF
DELETE from order_tran
WHERE order_number = s_order.order_number
AND   order_type   = s_order.order_type

{delete from the other table}
DELETE from order
WHERE current of drop_order
{delete the current row in the outer cursor table }
LET counter = counter + 1
COMMIT WORK
LET total = outer_count + counter
DISPLAY total

{ provide a heartbeat to the standard out to let us know the program' }
{ still alive and where it's working. You can direct the standard out }
{ of the compiled program by invoking the program as : }
{      sample.4ge >! sample_log & }

END WHILE
IF quitflag = 1
THEN EXIT WHILE
END IF
LET outer_count=outer_count + 1
END WHILE

ALTER TABLE order LOCK MODE (page)
IF status < 0 THEN
WHILE status < 0
SLEEP 2
ALTER TABLE order LOCK MODE (page)
END WHILE
END IF
{ put the tables back in their old lock mode }
ALTER TABLE order_tran LOCK MODE (page)
IF status < 0 THEN
WHILE status < 0
SLEEP 2
ALTER TABLE order_tran LOCK MODE (page)
END WHILE
END IF
END MAIN

The sample.4gl program is compiled within the INFORMIX-4GL program or directly with a command-line interface. It can be run from a command-line interface, or it can be put into your cron file to run automatically.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.16.67.54