22.2. General Discussion of Facet Restrictions

This section details the general design issues and possible implementations of facet restrictions that vary little with respect to the XML Schema datatype. Each of the constraining facets can be implemented as either table check constraints or triggers (detailed in Section 22.4 later in this chapter).

22.2.1. pattern Constraining Facet

pattern constraints, in general, cannot be represented in SQL or PL/SQL. SQL has a built-in operator for single-character and multiple-character wildcards. In addition, SQL has functions that can be used to determine whether some patterns have been violated. You can add more general mechanisms by using Java stored procedures.

SQL has the LIKE operator that can be used for wildcard matching. Single characters can be matched with a ‘?’ character and multiple characters can be represented with a ‘%’ operator. For example, the assemblyPartNumberType in catalog.xsd has a pattern constraint of ‘ASMd{1,8}’. This can be partially fulfilled with the following SQL fragment:

CatalogItem.assemblyPartNumber LIKE 'ASM%' 

and more completely fulfilled by checking for minimum and maximum length with the following SQL fragment:

LENGTH(CatalogItem.assemblyPartNumber) BETWEEN 4 AND 12 

To represent the digit portion of the pattern constraint, the following SQL fragment can be used:

CAST(SUBSTR((CatalogItem.assemblyPartNumber, 4, 8) AS NUMBER) IS NOT NULL 

This use of SQL and PL/SQL functions works for simple patterns, but more complex patterns become more difficult to represent. In the case of a partNumberType in catalog.xsd, the restriction pattern is ‘[A-Z]{1,3}d{1,8}’. In this case, TRANSLATE could be used to determine that the part number had one to three characters and one to eight digits, but not that the alphabetic characters precede the numeric characters.

Listing 22.1 contains a SQL statement that checks for a simple pattern using the LENGTH, SUBSTR, TRANSLATE, and UPPER SQL and PL/SQL functions.

Listing 22.1. A SQL Statement That Checks for a Simple pattern
SELECT 1 
FROM DUAL 
WHERE LENGTH(?value) >= 2 AND 
   -- There are between 1 and 3 alphabetic characters 
   -- in the first 3 characters. 
   LENGTH(TRANSLATE(SUBSTR(UPPER(?value), 1, 3), 
           'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789', 
           'ABCDEFGHIJKLMNOPQRSTUVWXYZ')) 
     BETWEEN 1 AND 3 AND 
   -- There are no more alphabetic characters after 
   -- the first 3. 
   LENGTH(TRANSLATE(SUBSTR(UPPER(?value), 1, 3), 
           'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789', 
           'ABCDEFGHIJKLMNOPQRSTUVWXYZ')) = 
   LENGTH(TRANSLATE(UPPER(?value), 
           'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789', 
           'ABCDEFGHIJKLMNOPQRSTUVWXYZ')) AND 
   -- There are between 1 and 8 numeric characters after 
   -- the first character. 
   LENGTH(TRANSLATE(SUBSTR(UPPER(?value), 2, 
               LENGTH(UPPER(?value)) - 1), 
           '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ', 
           '0123456789')) 
     BETWEEN 1 AND 8 AND 
   -- The first character is not numeric 
   SUBSTR(?value, 1, 1) NOT IN ('0', '1', '2', '3', '4', 
                 '5', '6', '7', '8', '9'), 

This code example returns the value ‘1’ when the ?value bind variable is replaced with a string that matches the pattern ‘[A-Z]{1,3}d{1,8}’ and no rows when it does not. This code makes the assumption that only alphabetic and numeric characters are present in the bind variable.

As this example demonstrates, this more complicated pattern can be translated by writing more SQL. The representation in the XML schema document and the PL/SQL become further and further apart as the amount of PL/SQL grows because the XML schema document representation is declarative and the PL/SQL is procedural.

One possible solution to this and the patterns that cannot be handled is to use the pattern-matching functionality built into the JDK1.4 through the classes in the java.util.regex package. Although the Oracle JVM’s version is 1.3, specific 1.4 classes can be loaded into the Oracle JVM and then wrapped in a PL/SQL function. This approach involves three major issues:

  • The XML Schema pattern language and the Java pattern language are not identical but are both based on Perl regular expressions. This means that XML Schema regular expressions must be translated into Java regular expressions.

  • There is a large, potentially significant one-time loading cost. Oracle will load a JVM into the database session for this trigger assuming that the application does not use Java stored procedures elsewhere.

  • Although Oracle has made large strides in reducing the performance gap between PL/SQL procedures and Java procedures, the gap still exists. Instantiating Java classes and performing pattern matching is likely to be an expensive operation compared to using the built-in PL/SQL functions.

  • Installing the java.util.regex classes involves significant effort because they rely on other Java V1.4 classes. Some amount of refactoring of these classes is probably necessary.

Java pattern matching is probably best avoided by using check constraints or triggers in production databases unless it is critical that the database verify that the pattern constraint is not violated. This need is ameliorated when the software systems(s) communicating with the database can be assumed to have already performed this function or when the pattern enforcement is not critical to the business systems.

22.2.2. enumeration Constraining Facet

The enumeration constraining facet restricts a valid XML document to a fixed set of choices. Enumerations can be represented in the database in the following ways:

  • Hard-coded values— appropriate if the number of values is small, fixed, unlikely to change, and there is no requirement for an application to retrieve these values.

  • A separate table with the domain values— appropriate if the number of values is likely to change, there are few enumerations, and referential integrity is required.

  • A “picklist” table with the domain values for all enumerations— appropriate if the number of values is likely to change, there are numerous enumerations, and referential integrity is not required. The word “picklist” is used because in many applications the picklist values form the basis for combo boxes, drop-down menus, and so on.

22.2.2.1. Hard-coded Values

The most straightforward representation of hard-coded values is with a database constraint to block insert or update from succeeding.

For example, catalog.xsd includes a simple type named colorOptionType that has four options: CYAN, MAGENTA, YELLOW, and BLACK. These four choices can be represented as either strings or numbers. Listing 22.2 is a table-creation script with a column constraint.

Listing 22.2. Create the Hard-coded Picklist Values Table
CREATE TABLE HardcodedPicklistExample ( 
value VARCHAR2(30) CHECK (value IN ('cyan', 'magenta', 
                  'yellow', 'black')) 
); 

22.2.2.2. Separate Table

A separate table is created to hold the values of the domain and other tables that need this domain. Simply create a foreign key relationship to guarantee referential integrity. Listing 22.3 contains a SQL statement that creates the ColorOptionType table and other SQL statements that insert the initial values.

Listing 22.3. Create a Separate ColorOptionType Table and Insert Initial Values
CREATE TABLE ColorOptionType ( 
colorOptionTypeID NUMBER(4) NOT NULL, 
name     VARCHAR2(30) NOT NULL, 
description   VARCHAR2(255) NULL, 
PRIMARY KEY(colorOptionTypeID)); 

INSERT INTO ColorOptionType 
(colorOptionTypeID, name, description) 
VALUES 
(1, 'cyan', 'test'), 

INSERT INTO ColorOptionType 
(colorOptionTypeID, name, description) 
VALUES   
(2, 'magenta', 'test'), 

INSERT INTO ColorOptionType 
(colorOptionTypeID, name, description) 
VALUES 
(3, 'yellow', 'test'), 

INSERT INTO ColorOptionType 
(colorOptionTypeID, name, description) 
VALUES 
(4, 'black', 'test'), 
COMMIT; 

The choice of a NUMBER(4) for the primary key allows this table to hold 9,999 colors (assuming that the key values are positive and one-based). The name and description have 30 and 255 characters, respectively, allowing for long names and descriptions.

Here is an example table creation script to use the ColorOptionType table:

CREATE TABLE SeparateTableExample ( 
value NUMBER(4) NOT NULL, 
FOREIGN KEY (value) 
REFERENCES ColorOptionType); 

This table now has a referential integrity constraint to the ColorOptionType that restricts the appropriate values.

22.2.2.3. Picklist Table

The picklist table contains all the values for each enumeration in the schema. Each enumeration type is represented as a separate picklist domain associated with a picklist value. enumeration types are also represented as tables to which the picklist table refers. Listing 22.4 shows how to create a picklist domain and picklist table and create an alternate key on the picklist table to prevent duplicate values. Listing 22.5 shows the SQL statements that insert the initial values into the PicklistDomain and Picklist tables.

Listing 22.4. Create the PicklistDomain and Picklist Tables
CREATE TABLE PicklistDomain ( 
picklistDomainID NUMBER(3) NOT NULL, 
name    VARCHAR2(30) NOT NULL, 
description  VARCHAR2(255) NULL, 
PRIMARY KEY(picklistDomainID)); 

CREATE TABLE Picklist ( 
picklistID   NUMBER(4) NOT NULL,  
picklistDomainID NUMBER(3) NOT NULL, 
name    VARCHAR2(30) NOT NULL, 
description  VARCHAR2(255) NULL, 
PRIMARY KEY(picklistID), 
FOREIGN KEY (picklistDomainID) REFERENCES PicklistDomain); 

CREATE UNIQUE INDEX XAK1Picklist ON Picklist 
( 
picklistDomainID, name 
); 

Listing 22.5. Insert the Initial Values into the PicklistDomain and Picklist Tables
INSERT INTO PicklistDomain 
(picklistDomainID, name, description) 
VALUES 
(1, 'ColorOptionType', 
'There is a limited selection of color choices. As defined, [ccc] 
*any* part could have any one of these colors.'), 

INSERT INTO Picklist 
(picklistID, picklistDomainID, name) 
VALUES 
(1, 1, 'cyan'), 

INSERT INTO Picklist 
(picklistID, picklistDomainID, name) 
VALUES 
(2, 1, ' magenta'), 

INSERT INTO Picklist 
(picklistID, picklistDomainID, name) 
VALUES 
(3, 1, ' yellow'), 

INSERT INTO Picklist 
(picklistID, picklistDomainID, name) 
VALUES 
(4, 1, 'black'), 
COMMIT; 

In Listing 22.6, an example table is created to use the Picklist table, and two triggers are created to enforce referential integrity programmatically. Before-insert and before-update triggers are used to prevent the data from being inserted or updated when an error occurs.

Listing 22.6. Example Table using Picklists and Triggers
CREATE TABLE PicklistTableExample ( 
value NUMBER(4) NOT NULL 
); 

CREATE OR REPLACE TRIGGER InsertPicklistTableExample 
BEFORE INSERT 
ON PicklistTableExample 
FOR EACH ROW 
DECLARE 
 I PLS_INTEGER; 
BEGIN 
 I := 0; 
 SELECT COUNT(1) 
 INTO I 
 FROM Picklist 
 WHERE picklistDomainID = 1 AND -- 1 = colorOptionType 
     picklistID = :new.value; 

 IF I = 0 THEN 
   RAISE_APPLICATION_ERROR(-20000, 
               'Not good colorOptionType'), 
 END IF; 
END; 

CREATE OR REPLACE TRIGGER UpdatePicklistTableExample 
BEFORE UPDATE 
ON PicklistTableExample 
FOR EACH ROW 
DECLARE 
 I PLS_INTEGER; 
BEGIN 
 I := 0; 
 SELECT COUNT(1) 
 INTO I 
 FROM Picklist 
 WHERE picklistDomainID = 1 AND -- 1 = colorOptionType 
     picklistID = :new.value; 
 IF I = 0 THEN 
   RAISE_APPLICATION_ERROR(-20000, 
               'Not good colorOptionType'), 
 END IF; 
END; 

Note

Triggers for picklists are frequently not created. Because a picklist is most useful when there are numerous enumerations, this would cause a proliferation of triggers. If they are created, they are sometimes used in development and dropped or disabled in the test and production environments. Essentially, the triggers mimic referential integrity but at a much greater run-time cost.


22.2.3. whiteSpace Constraining Facet

The whiteSpace constraining facet is implicitly specified for most datatypes and cannot be changed by the XML schema designer. Whitespace is collapsed for all non-string datatypes as well as list datatypes (that is, IDREFS, ENTITIES, and NMTOKENS). It is always preserved for the string datatype and can be preserved, replaced, or collapsed for all datatypes derived by restriction from the string datatype. Whitespace is always REPLACE for the normalized string datatype, and can be REPLACE or COLLAPSE for all datatypes derived by restriction from the normalizedString datatype.

Database designers need to understand whether an XML instance will contain extra space that needs to be collapsed automatically. For most datatypes, this is inconsequential because mapping a decimal datatype to a database NUMBER column datatype automatically removes extraneous whitespace. However, for XML Schema datatypes that are mapped to a database character column datatype, this becomes important.

When whitespace is to be replaced, the tab, line feed, and carriage return characters need to be replaced by a space. The following SQL fragment can accomplish this:

TRANSLATE(?replaceWhitespaceValue, 
    UNISTR('009') || UNISTR('00A') || UNISTR('00D'), 
    ' ') 

where the ?replaceWhitespaceValue bind variable is replaced by the appropriate string.

When whitespace is to be collapsed, the tab, line feed, and carriage return characters need to be replaced by a space. The following SQL fragment can accomplish this:

REPLACE(TRIM(TRANSLATE(?collapseWhitespaceValue, 
           UNISTR('009') || UNISTR('00A') || 
           UNISTR('00D'), ' ')), 
   ' ', ' ') 

where the ?collapseWhitespaceValue bind variable is replaced by the appropriate string.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.144.161.49