Appendix B Use Cases

Throughout this book, many of the new ASE 15 features, as well as some late 12.5.x changes, are covered from a technical standpoint. To supplement some of the material offered in the book with practical applications, this appendix demonstrates the use of new ASE 15 features in real business scenarios.

This appendix presents three business cases, employing the following ASE 15 new features:

Case 1:

• Semantic partitions

• Computed columns

Case 2:

• XML

• Computed columns

• Functional indexes

Case 3:

• Scrollable cursors

Business Case 1

For business reasons, Company X must maintain a large amount of historical data within ASE’s databases, available to the user base. As the volume of data expands over time into hundreds of gigabytes, the maintenance windows continue to grow and performance degradation becomes apparent. The business has evaluated several possible scenarios for managing this data explosion while balancing data availability, maintenance windows, and complexity of implementation.

For Company X, four basic options are proposed, each with its own inherent advantages and some limitations. The options presented assume the company cannot simply delete old data or take historical data offline. The options also assume all data will be retained in one ASE server in order to avoid additional infrastructure and to control license costs.

Option 1: One VLDB with large tables self contained

Option 2: Primary database and an archive database for the largest tables

Option 3: Several small databases, each holding a portion (none overlapping) of the data from the very large tables

Option 4: One VLDB with a large table, partitioned semantically

** Option 4 is possible due to changes to ASE in version 15.

How can ASE 15 help this organization? With semantic partitions, partition-level maintenance (partition reorgs, statistics, dbccs), local indexes, and partition-aware optimization.

Option 1: One VLDB with large tables self contained

Architecture Overview

With this solution, one table — potentially very large — resides within one database on one ASE server. Users access this one table directly to satisfy queries. Database administrators perform maintenance on the large database and large tables as a whole. For very large APL locked tables, reorganization must be performed by dropping and recreating the clustered index. To accomplish this on the large table, upward of 200% of the table’s size must be available as free space within the database to recreate the clustered index and thus eliminate any fragmentation. For DOL locked tables, up to 200% of the large table’s size must likewise be maintained to complete reorg rebuild operations. dbccs must be performed on the single database or table as a whole, and update statistics must be performed table wide, targeting the specific indexes or columns. In many instances, the update statistics operations and the reorganization of fragmented data are not possible with large amounts of data due to the maintenance window requirements of these operations.
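For reference, the table-wide maintenance described above amounts to commands along the lines of the following sketch (big_table is a hypothetical name; reorg rebuild applies to DOL locked tables):


reorg rebuild big_table        -- rebuild of the entire table
update statistics big_table    -- statistics across the entire table
dbcc checktable(big_table)     -- consistency check of the entire table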

Advantages

No special programming logic to access the one large table

No need to create a view to span multiple databases or tables to simplify application development

Initially the simplest solution to implement

Limitations

Maintenance performed on the large tables and databases as a whole

dbccs, reorgs, index maintenance, and update statistics operations must address full tables and/or databases

Optimizer must consider table as a whole when useful index is not present

Must carry overhead of extra free space within the database to perform reorg/index rebuilds for largest table

Must physically delete old data from the production database

Cannot make older data read-only

Recovery operations can be more time consuming than with the other proposed solutions, as Backup Server must load the database as a whole, including allocated but unused database pages, when performing a full restore

Option 2: Primary database and an archive database for the largest tables

Architecture Overview

With this solution, two databases are created. One database contains a table that holds the data used to provide results for a period of time that is considered “current.” In this use case the contents of that table provide results for 85% of the application queries. The second database contains a large table — potentially very large — containing data that is maintained for historical purposes, such as contract requirements, government requirements, etc. Database administrators perform maintenance on the two databases and the two tables as a whole. For the smaller database/table, the response time of the application is of major concern, as is any impact from database maintenance activities.

For a very large APL locked table in the larger database, reorganization of the table must be performed with a drop and recreate of the clustered index. In order to accomplish a drop and recreate of the clustered index on the large table, upward of 200% of the table’s size must be available as free space within the database to recreate the clustered index and thus eliminate any fragmentation. Even though the smaller database/table requires the same amount of extra space for recreating the clustered index, the fact that the entire table is locked during the index rebuild will affect the uptime of the application.

For DOL locked tables, up to 200% of the large table’s size must also be maintained to complete reorg rebuild operations. dbccs must be performed on both databases and tables as a whole. Update statistics must be performed table wide, targeting the specific indexes or columns. In many instances, the update statistics operations and the reorganization of fragmented data are not possible with large amounts of data due to the maintenance window requirements of these operations.

Performance on the current database will be affected while the data is copied to the historical database and then deleted from the current database/table.

For queries that need to access data from both the current and the historical tables, the data can be unioned. For those queries where the data is from either one or the other database, a mechanism has to be developed and maintained that tracks where the date ranges occur. This mechanism has to be accessed to determine the database to use to resolve the query each time data is requested. In some rare cases, the request may have to be broken down into two queries to be resolved, particularly if the data is being archived at the time of the request.
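Where such cross-database access is acceptable, the union can be hidden behind a view. A minimal sketch, assuming hypothetical database names current_db and history_db, each holding an orders table:


-- one logical orders table spanning both databases
-- (hypothetical database names)
create view all_orders as
select * from current_db..orders
union all
select * from history_db..orders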

Advantages

Maintenance can be performed on current data, as current data is contained in only the primary database.

Application development could be simplified with the creation of a view of the “same” table across the current and history databases.

The archive database can be placed into a read-only mode (see the sketch after this list).

Maintenance of the archive database will decline dramatically when in read-only mode, as changes are not permitted except when data is archived on a periodic basis from the primary database.
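Placing the archive database into read-only mode is done with sp_dboption; a minimal sketch with a hypothetical database name (the option is reversed with false before each periodic archival load, then set again afterward):


use master
go
exec sp_dboption archive_db, "read only", true
go
use archive_db
go
checkpoint     -- required for the option change to take effect
go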

Limitations

Must carry out joins across two databases if a query needs simultaneous access to current and history data

Cannot update history data with database in read-only mode

Special programming logic is required to access the multiple tables across two databases due to the physical fragmentation this option introduces

Overall, more difficult for DBA to manage

Often, downtime is scheduled to perform archive operations from the primary to the secondary database. If downtime is not available with this option, additional resource utilization can take place during the archive of data from the primary to archive database.

Option 3: Several small databases, each holding a portion (none overlapping) of the data from the very large tables

Architecture Overview

In this option, smaller tables are created in several smaller databases to handle the data from an otherwise very large table. As the data is loaded into the appropriate table in a monthly database, the other tables are not affected. Only queries accessing the data in that month are blocked from accessing the new data during the load process. As in the previous option, a mechanism has to be developed and maintained in order for the application to determine where to find the requested data. As data ages and no additional data is added to the older tables, limited maintenance is necessary against the older static tables, and it may be possible to set the older databases to read-only. Clustered indexes only need to be rebuilt against the current data. This reduces the maintenance windows since only one table or database requires maintenance to be performed against it.

One last issue is that the next database/table has to be created by the DBA before it is needed. If the DBA fails to create the database or table, the data load process will fail.

Advantages

Ability to back up/restore at a more granular level

Ability to make databases with older data read-only

May be feasible to create a view to join together the same tables that are fragmented across multiple databases

Limitations

Introduces fragmentation at the table level into the system design

Need for special logic to access data in horizontally partitioned tables that physically span multiple tables in multiple databases

Free-space overhead must be carried in each individual database to accomplish reorgs and index maintenance

More overhead for database administrators to maintain additional databases

Option 4: One VLDB with a large table, partitioned semantically

Architecture Overview

With ASE 15, this option is now possible. Although the data will be stored in one large table, partitioning allows for maintenance to be performed on only the necessary partitions. Also, the application does not need to be aware of where the data is located — on which database and in which table. The optimizer can eliminate unnecessary access to partitions where the data is not needed to satisfy the request. Since the partition key needs only the month in which a row belongs, a materialized computed column can be created that uses only the month part of the date.
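A minimal sketch of such a table, using hypothetical names; it assumes, per the discussion above, that the materialized month column serves as the range partition key:


-- hypothetical table: start_month is computed from start_date,
-- stored (materialized), and used as the range partition key
create table orders_hist
(order_id    int      not null,
 start_date  datetime not null,
 start_month compute datepart(mm, start_date) materialized)
partition by range (start_month)
(p01 values <= (1),  p02 values <= (2),  p03 values <= (3),
 p04 values <= (4),  p05 values <= (5),  p06 values <= (6),
 p07 values <= (7),  p08 values <= (8),  p09 values <= (9),
 p10 values <= (10), p11 values <= (11), p12 values <= (12))

-- index the computed column for use by the query optimizer
create index orders_month_idx on orders_hist(start_month)

-- maintenance can now target a single partition, for example:
update statistics orders_hist partition p01
truncate table orders_hist partition p01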

Advantages

No special programming logic to access segregated data

Only one table/database to access

Maintenance operations, such as reorgs, dbccs, index maintenance operations, and update statistics, can be targeted to very specific subsets of data

Partition-aware optimization. Optimizer will eliminate from consideration the partitions not needed to satisfy a query

Start_date can be an indexed computed column for use by the Query Optimizer

Limitations

Backups/restores of database can be time consuming

Cannot drop old partitions (old data), only truncate

Cannot make portions of the data read-only

Database administrator should monitor for partition balance with some partition types

Business Case 2

Company Y’s system must now consider the integration of XML document storage within its relational database. The company needs the ability to quickly scan for keywords within the XML documents in order to determine which documents to retrieve.

How can ASE 15 help this organization? With consideration to XML storage methods (internal vs. external) to the database, computed (materialized) columns, and functional indexes.

Option 1

If database storage isn’t an issue and only a few elements within the XML documents are needed to retrieve the documents, then the easiest way to accomplish the above goal is to create a table with a text datatype to store the XML documents and computed columns based on elements within the XML documents. For example:


create table xml
(xml text,
 ID compute xmlextract('//ns0:geographyID/text()', xml returns int)
    materialized)

By making those columns materialized, they can have functional indexes added on them. This speeds up the ability to find the correct XML documents without having the overhead of indexed “parsed” XML documents.
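Indexing the materialized column is then an ordinary create index statement; a sketch using the names above:


-- functional index via the materialized computed column
create index xml_ID_idx on xml(ID)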

Advantages

Fast access to data within the XML documents where a functional index is employed

Storage requires little more than the raw size of the XML document itself

Limitations

Slow access to data within large XML documents when the data is not contained within a materialized column

Extended time to load the XML data into ASE for large documents when a number of columns are materialized

Option 2

If the number of elements needed to find the XML documents becomes great, the number of materialized columns can be limited to one or two. Then the xmlextract function can be used within the where clause to find the XML documents. For example:


where ID = 100 and
      xmlextract('//ns0:geographyID/text()', xml) = 200

This might create a performance issue if the XML documents are too large. To increase performance, the XML documents would need to be indexed or “parsed” using the xmlparse function.
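A sketch of one way to keep the parsed form alongside the raw document, assuming a hypothetical image column named xml_parsed:


-- add a column for the parsed form and populate it with xmlparse
alter table xml add xml_parsed image null
update xml set xml_parsed = xmlparse(xml)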

Advantages

Fast access to data within the XML documents

More flexibility in changing the columns being used in the where clause without causing performance issues

Limitations

The storage requirements are more than double those of Option 1.

The memory requirement and the time needed to load the indexed XML documents increases exponentially with the size of the documents.

Option 3

If database storage becomes an issue, an option would be to store the XML documents in a compressed form in an image datatype. This can only be done if the needed elements can be extracted from the XML documents outside of the database and then inserted into separate columns within the table. This gives the same ability to retrieve the XML documents as Option 1, but places the work of extracting the elements within the XML documents on the application outside of the database. It also forces the application to compress and uncompress the XML documents.
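A sketch of such a table, with hypothetical names; the application extracts geographyID and compresses the document before each insert:


-- compressed XML stored as image; searchable elements extracted
-- into their own columns by the application
create table xml_zip
(doc_id      int not null,
 geographyID int not null,
 xml_zip     image null)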

Advantages

Considerably lower storage requirements within ASE

XML in the database license not required

Limitations

Cannot use the xmlextract function within the where clause for data within the XML documents

Additional logic needed in the application to store and retrieve XML documents

Option 4

Another way to solve a database sizing issue is to store the XML documents into a file system directory and map the XML documents to a logical proxy table. A view can then be created to eliminate the unnecessary columns from the proxy table and to add additional columns based on the XML document file name and/or data within the XML document.

With this option, users cannot simultaneously insert additional columns for data not contained within the XML document. All the other options would allow descriptive data to be added within the same row as the XML document. This data could then be used to query for the XML document. Within this option, there is no physical table within the database containing the XML document; therefore, descriptive data would have to be contained within its own table and related to the XML document via some ID common to both. The appearance of the XML document through the proxy table and the inserting of this descriptive data cannot be wrapped within a database transaction.

The following is an example of creating a proxy table and view:


create proxy_table _xmlDirectory
external directory at "/remotebu/disk_backups/xmlDirectory"
go

create view xmlDirectory (ID, code, xml)
as
select convert(integer, substring(filename, 6, 3)),
       convert(varchar(10),
               xmlextract('//ns0:geographyID/text()', content)),
       content
from _xmlDirectory
go
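The view can then be queried like an ordinary table; for example:


-- retrieve one document by the ID derived from its file name
select xml
from xmlDirectory
where ID = 100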
Advantages

No need to store XML into ASE

No memory requirements for storing XML into ASE

Limitations

Users cannot simultaneously insert additional columns for data not contained within the XML document

Slow access to data within the XML documents for large documents

The "enable file access" and "enable cis" options have to be set for this option

Business Case 3

Background on Current Architecture

Company Z runs an order processing system. Due to limitations of their physical production facilities, management has requested that 50 customer orders be indefinitely postponed, beginning with the 50th order beyond the current order. Management believes this arbitrary delay of 50 orders will allow the facility to catch up with current orders. To identify and prevent the production of the delayed orders, the order_date of each delayed order will be future dated to “12/31/3000.” Because the orders table acts as a queue for the processing of orders, once the 50 future-dated orders are reached, production at the facility will begin to slow, as those orders cannot begin to process due to the future-dated start date associated with each one.

Proposed Solutions

The IT developer assigned to this project at Company Z considers three alternatives to accommodate this request. The first is the use of #temp tables to isolate the 50 rows that need their order dates updated. With this method, rows are extracted from one or more user tables and from one or more databases into a temporary table in the tempdb database. Within the tempdb database, a user process continues by isolating the 51st through 100th rows, with “set” logic and “select into” along with the creation of additional #temporary tables. Once the 51st through the 100th rows are isolated, updates are applied to the base table(s).

In the second example, the IT developer considers the use of standard pre-ASE 15 cursors. Using this methodology, a process reads 1,000 or more rows into the nonscrollable cursor. Since the developer is only interested in the 51st through the 100th rows of the cursor, additional processing is necessary. In order to target the 51st row, the cursor must fetch and discard the 1st through the 50th cursor rows before obtaining the needed 51st row. Once the 51st row of the cursor is obtained, processing proceeds by updating the base table where the current cursor row relates to the rows in the base table(s).

The third option is only possible with ASE 15.0 and greater. The third option employs a scrollable cursor to directly position the cursor pointer at the 51st row in the cursor. From the 51st row forward, the cursor can traverse until all necessary rows are accessed and the corresponding base table is updated.

Note: While the scenario presented in this business case is fictional and may not be realistic, the message the authors convey is the simplicity and elegance of the scrollable cursor in comparison to the complexity associated with the options where a more traditional cursor or temporary table is employed.

Option 1

To facilitate this scenario, a temporary table called #orders_temp is generated from a select into statement:


-- build the temp table with all orders that have not
-- started processing
select * into #orders_temp
from orders
where order_date > (select max(order_date)
                    from orders
                    where processed = "Y")
and processed = "N"

Next, the first 50 rows would need to be deleted from the temporary table:


set rowcount 50
delete #orders_temp
go

Next, move the 50 target rows to a new temporary table called #orders_temp2:


-- set rowcount remains at 50
select * into #orders_temp2
from #orders_temp
go

set rowcount 0 -- rowcount of 50 no longer needed
go

drop table #orders_temp -- table no longer needed; the 50 rows
go                      -- needed are now in #orders_temp2

Using the same interim set, update the corresponding rows (the 51st through 100th) in the orders table:


update Accounts..orders
set order_date = "12/31/3000"
from Accounts..orders c,
     #orders_temp2 t
where t.order_id = c.order_id

Processing is complete; remove the #temporary table with drop table:


drop table #orders_temp2
go

Advantages

Can be performed on any supported version of ASE

The temporary table could be made to persist beyond the scope of the user session if necessary.

Can take multiple passes at each row in the #temporary table if needed

Limitations

Moderate complexity

Must complete multiple steps to isolate the target rows

Locks briefly held on tempdb’s system tables

Option 2

For Options 2 and 3, the following diagram represents the rows needed by the cursor.

[Diagram: cursor CSR1, in which rows 51 through 100 are the target rows]

Declare and open the cursor to process the rows:


declare CSR1 cursor for
select order_id, order_date
from orders
where order_date > (select max(order_date)
from orders
where processed = "Y")
and processed = "N"

open CSR1

Set the cursor rows to 50 in order to fetch the first 50 rows, and discard them:


set cursor rows 50 for CSR1

fetch CSR1

Reset cursor rows to 1 since the rows to process are next:


set cursor rows 1 for CSR1

Fetch the cursor rows into local variables, then use the variables to update the base table within a while loop to perform the update 50 times:


declare @order_id int,
@order_date datetime,
@counter int

select @counter = 50

while (@counter > 0)
begin

fetch CSR1
into @order_id, @order_date

update Accounts..orders
set order_date = "12/31/3000"
where order_id = @order_id

select @counter = @counter - 1
end

Verify the correct rows were updated:


select * from orders where order_date = "12/31/3000"

Close and deallocate the cursor:


close CSR1
deallocate cursor CSR1

Advantages

Can be performed on any supported version of ASE

Limitations

Additional steps to fetch and discard the unneeded rows

Can take only one pass at each row, even if multiple passes per row are necessary

Moderate to low complexity

Must complete multiple steps to isolate the target rows

Option 3

Use a scrollable cursor to set the cursor position directly at the 51st row in the result set.

Declare and open the cursor to process the rows:


declare CSR1 scroll cursor for
select order_id, order_date
from orders
where order_date > (select max(order_date)
from orders
where processed = "Y")
and processed = "N"

open CSR1

Fetch the cursor rows into local variables, then use the variables to update the base table within a while loop to perform the update 50 times. Use the fetch orientation of absolute to skip directly to the 51st cursor row and to control the cursor position.


declare @order_id int,
@order_date datetime,
@counter int
select @counter = 51

while (@counter <= 100)
begin

fetch absolute @counter CSR1
into @order_id, @order_date

update orders
set order_date = "12/31/3000"
where order_id = @order_id

select @counter = @counter + 1
end

Verify the correct rows were updated:


select * from orders where order_date = "12/31/3000"

Close and deallocate the cursor:


close CSR1
deallocate cursor CSR1

Advantages

Low complexity

Most elegant solution; can directly access the rows needed in the result set without special processing

Can take multiple passes at each row if necessary

Limitations

Can only create and manipulate scrollable cursors on ASE 15 and greater

Will use tempdb to hold the scrollable cursor
