Chapter 3
IN THIS CHAPTER
Comparing languages
Seeing how hard it is (sometimes) to get languages to work together
Embedding SQL in a procedural program
Using SQL modules
SQL was conceived and implemented with one objective in mind: to create and maintain a structure for data to be stored in a relational database. It was never intended to be a complete language that you could use to create application programs. Application programming was — and is — the domain of procedural languages such as C, C++, C#, Java, Python, and Visual Basic.
Clearly, a need exists for application programs that deal with databases. Such programs require a combination of the features of a procedural language such as C and a data sublanguage such as SQL. Fundamental differences between the architectures and philosophies of procedural languages and of SQL make combining them a challenge.
In this chapter, I look at the characteristics of — and differences between — those two very different worlds.
First, I look at SQL, which is strong in data retrieval. If important information is buried somewhere in a single-table or multitable database, SQL gives you the tools you need to retrieve it. You don’t need to know the order of the table’s rows or columns, because SQL doesn’t deal with rows or columns individually. The SQL transaction-processing facilities also ensure that your database operations are unaffected by any other users who may be simultaneously accessing the same tables that you are — another plus.
A major weakness of SQL is its rudimentary user interface. It has no provision for formatting screens or reports. It accepts command lines from the keyboard and sends retrieved values to the terminal one row at a time.
Sometimes, a strength in one context is a weakness in another. One strength of SQL is that it can operate on an entire table at the same time. Whether the table has 1 row, 100 rows, or 100,000 rows, a single
statement can extract the data you want. SQL can’t easily operate on one row of a multi-row table at a time, however, and sometimes, you do want to deal with each row individually. In such cases, you can use SQL’s cursor facility (described in Book 3, Chapter 5), or you can use a procedural host language.SELECT
Speaking of procedural host languages, what are their strengths and weaknesses? In contrast to SQL, procedural languages are designed for one-row-at-a-time operation, which allows the application developer precise control of the way a table is processed. This detailed control is a great strength of procedural languages. A corresponding weakness, however, is the fact that the application developer must have detailed knowledge of the way data is stored in the database tables. The order of the database’s columns and rows is significant and must be taken into account.
Because of the step-by-step nature of procedural languages, they have the flexibility to produce user-friendly screens for data entry and viewing. You can also produce sophisticated printed reports with any desired layout.
Classic procedural languages are the first languages used to program computers and their descendants. The very first languages were machine languages, in which both instructions and data were represented as ones and zeros. Digital computers are binary machines, and ones and zeros are the only things they understand. Unfortunately, long sequences of ones and zeros aren’t particularly easy for humans to understand, so it wasn’t long before machine language was superseded — first by assembly language and then by compiled high-level languages such as Fortran and COBOL. C and Basic are examples of more recent classic procedural languages.
Classic procedural languages such as C and Basic are complete programming languages. They can implement any procedure that can be represented in algorithmic form. They operate primarily by executing one command after another in sequence, although they also have flow of control structures that enable them to branch, either unconditionally or depending on the value of a condition. They also support loops, which enable a program to execute a section of code repeatedly. SQL, as defined by the SQL-92 international standard, didn’t have these capabilities. Additions to the standard that became part of SQL:1999 have added some of these capabilities, but not all implementations of SQL have been upgraded yet to support them.
Object-oriented programming, the first incarnation of which was Simula-67 in 1967, came into its own in the 1990s, when it became the predominant programming paradigm. Large, complex software projects that would have been very difficult to build with one of the classical procedural languages were easier to accomplish with one or another of the object-oriented languages, such as C++, Java, C#, Python, or Visual Basic.NET.
The fundamental unit of a program written in an object-oriented language is the object, whereas the instruction is the fundamental unit of a classic procedural language program. Each object in an object-oriented program can receive messages sent by other objects, process data, and send messages to other objects.
Object-oriented code is intrinsically modular, which makes object-oriented programs easier to develop, understand, and maintain than programs generated according to the earlier classic paradigm. Objects are instances of classes. A class has associated attributes and methods. Attributes are characteristics of a class, and methods are actions that members of the class can perform.
SQL is an example of a nonprocedural language. Rather than deal with the data in a table one row at a time, it deals with data a set at a time, which means that a query may return a result set containing multiple rows. By contrast, procedural languages (both classic and object-oriented) process tables one row at a time and return data to the application the same way: one row at a time.
In the early days of relational databases, other nonprocedural languages competed with SQL. Among these competitors were QUEL and RDML. QUEL was the data sublanguage of the Ingres database management system (DBMS) that was developed at the University of California at Berkeley and later commercialized. Now it’s sold as an open source product by Ingres Corp. Due to the overwhelming acceptance of SQL in the marketplace, SQL syntax has been added to QUEL. RDML is the data sublanguage for Digital Equipment Corp.’s Rdb relational database products. Alas, both Rdb and Digital Equipment itself have passed into history.
Although SQL was developed by IBM, it was adopted at a very early stage by the company that was to become Oracle Corp. Other DBMS vendors followed suit, and SQL became a de facto standard that was codified into a recognized official standard in 1986. SQL didn’t beat out QUEL and RDML because of its technical superiority, which was debatable. It won because of IBM’s market clout and because Oracle’s early adoption of it started a domino effect of DBMS vendors joining the club and supporting SQL.
Any time the database and the programming language addressing it use different data models, problems are going to arise. Beyond data models, differences in data types add to the problems. SQL’s data types don’t match the data types of any of the languages that try to communicate with SQL databases. Despite these challenges, SQL and procedural languages must be made to work together, because neither by itself can do the complete job.
It makes sense to try to combine SQL and procedural languages in such a way that you can benefit from their combined strengths and not be penalized by their combined weaknesses. As valuable as such a combination may be, some challenges must be overcome before it can be achieved practically.
A big problem in combining SQL with a procedural language is that SQL operates on tables a set at a time, whereas procedural languages work on them a row at a time. Sometimes, this difference isn’t a big deal. You can separate set operations from row operations, doing each with the appropriate tool. If, however, you want to search a table for records that meet certain conditions and perform different operations on the records depending on whether they meet the conditions, you may have a problem. Such a process requires both the retrieval power of SQL and the branching capability of a procedural language. Embedded SQL gives you this combination of capabilities by enabling you to embed SQL statements at strategic locations within a program that you’ve written in a conventional procedural language. Other solutions to this problem include proprietary application programming interfaces (APIs) and the use of module language. Module language, described later in this chapter in the section titled “Using SQL Modules with an Application,” puts SQL statements in a separate module instead of embedding them in the procedural language code.
Another hurdle to the smooth integration of SQL with any procedural language is that SQL’s data types are different from those of all major procedural languages. This circumstance shouldn’t be surprising, because the data types defined for any procedural language are different from the types for the other procedural languages. No standardization of data types exists across languages. In releases of SQL before SQL-92, data type incompatibility was a major concern. In SQL-92 (and also in subsequent releases of the SQL standard), the CAST
statement addresses the problem. Book 3, Chapter 1 explains how you can use CAST
to convert a data item from the procedural language’s data type to one that SQL recognizes, as long as the data item itself is compatible with the new data type.
The challenges mentioned in the preceding section with regard to using SQL with classic procedural languages apply equally to using SQL with object-oriented procedural languages. Added to those challenges are incompatibilities often called the impedance mismatch between SQL and object-oriented languages.
The original context of the term impedance mismatch comes from electrical engineering. Different parts of an electrical circuit may have different impedance values, and connecting two such circuit elements can cause problems. As a simple example, suppose that an audio speaker with an intrinsic impedance of 8 ohms is connected to a line with a 50-ohm impedance value. The result will sound attenuated, distorted, and noisy. In the context of SQL and object-oriented procedural languages, similar problems occur. The row and column organization of relational tables doesn’t mesh well with the hierarchical class/object paradigm of object-oriented programming.
Database vendors have addressed the impedance-mismatch problem by adding object-oriented features to their relational database products, turning their hybrid products into object-relational DBMSs. The object-oriented features added to such products as Db2, Oracle, and SQL Server were codified in the SQL:1999 international standard.
SQL:1999 notwithstanding, the marriage of SQL with object-oriented languages such as C++, C#, and Visual Basic.NET isn’t a perfect one. Difficulties remain, but these difficulties are manageable.
In the past, the most common method of mixing SQL with procedural languages was embedded SQL. The name is descriptive: SQL statements are dropped into the middle of a procedural program, wherever they’re needed. Support for embedded SQL, however, has recently been deprecated or dropped completely by one database vendor after another.
Microsoft deprecated embedded SQL for C in SQL Server 2008, although COBOL-IT currently distributes an embedded SQL precompiler for its open source COBOL implementation, designed to work with SQL Server. SAP has discontinued embedded SQL support for its SQL Anywhere product. MySQL has never supported embedded SQL, but COBOL-IT is distributing an embedded SQL precompiler designed to work with MySQL.
PostgreSQL does support embedded SQL for C and C++, and COBOL-IT distributes a COBOL precompiler for PostgreSQL. IBM Db2 still supports embedded SQL for C, C++, and COBOL. Oracle is probably the biggest remaining supporter of embedded SQL.
It may seem like everybody is in the process of either deprecating or outright abandoning embedded SQL. It’s not quite that bad. There is still quite a lot of support for embedded SQL out there, as I describe in the following sections. For those implementations that don’t offer embedded SQL, there are alternatives that accomplish the same thing. Some pass the SQL statements to the database as strings and others encapsulate the SQL statements in modules. Both of these methods are described later in this chapter.
As an example of embedded SQL, look at a program written in Oracle’s Pro*C version of the C language. The program, which accesses a company’s employee table, prompts the user for an employee name and then displays that employee’s salary and commission. Then it prompts the user for new salary and commission data, and updates the employee table with it, as follows:
EXEC SQL BEGIN DECLARE SECTION;
VARCHAR uid[20];
VARCHAR pwd[20];
VARCHAR ename[10];
FLOAT salary, comm;
SHORT salary_ind, comm_ind;
EXEC SQL END DECLARE SECTION;
main()
{
int sret; /* scanf return code */
/* Log in */
strcpy(uid.arr,"FRED"); /* copy the user name */
uid.len=strlen(uid.arr);
strcpy(pwd.arr,"TOWER"); /* copy the password */
pwd.len=strlen(pwd.arr);
EXEC SQL WHENEVER SQLERROR STOP;
EXEC SQL WHENEVER NOT FOUND STOP;
EXEC SQL CONNECT :uid;
printf("Connected to user: percents
",uid.arr);
printf("Enter employee name to update: ");
scanf("percents",ename.arr);
ename.len=strlen(ename.arr);
EXEC SQL SELECT SALARY,COMM INTO :salary,:comm
FROM EMPLOY
WHERE ENAME=:ename;
printf("Employee: percents salary: percent6.2f comm: percent6.2f
",
ename.arr, salary, comm);
printf("Enter new salary: ");
sret=scanf("percentf",&salary);
salary_ind = 0;
if (sret == EOF !! sret == 0) /* set indicator */
salary_ind =-1; /* Set indicator for NULL */
printf("Enter new commission: ");
sret=scanf("percentf",&comm);
comm_ind = 0; /* set indicator */
if (sret == EOF !! sret == 0)
comm_ind=-1; /* Set indicator for NULL */
EXEC SQL UPDATE EMPLOY
SET SALARY=:salary:salary_ind
SET COMM=:comm:comm_ind
WHERE ENAME=:ename;
printf("Employee percents updated.
",ename.arr);
EXEC SQL COMMIT WORK;
exit(0);
}
You don’t have to be an expert in C to understand the essence of what this program is doing and how the program does it. Here’s a rundown of the order in which the statements execute:
SELECT
statement retrieves the named employee’s salary and commission data, and stores it in the host variables :salary
and :comm
.EXEC SQL BEGIN DECLARE SECTION ;
The end of the declaration segment is signaled by the following:
EXEC SQL END DECLARE SECTION ;
When you use embedded SQL with Pro*C, every SQL statement must be preceded by an EXEC SQL
directive. The end of an SQL segment may or may not be signaled by a terminator directive. In COBOL, the terminator directive is END-EXEC
; in FORTRAN, it’s the end of a line; and in Ada, C, Pascal, and PL/I, it’s a semicolon (;
).
Depending on the compatibility of the data types supported by the host language and those supported by SQL, you may have to use CAST
to convert certain types. You can use host variables that have been declared in the DECLARE SECTION
. Remember to prefix host variable names with a colon (:
) when you use them in SQL statements, as in the following example:
EXEC SQL INSERT INTO ENGINES
(EngineName, Displacement, Horsepower, Cylinders, Valves)
VALUES
(:engname, :cid, :hp, :cyl, :valves) ;
SQLJ is the tool to use to embed SQL in a Java program. The process is similar to the way that SQL statements are embedded in an Oracle Pro*C application but with a slight syntactical difference. Here’s an example:
#sql (INSERT INTO ENGINES
(EngineName, Displacement, Horsepower, Cylinders, Valves)
VALUES
(:engname, :cid, :hp, :cyl, :valves)) ;
#sql
, rather than EXEC SQL
, is the signal to the preprocessor that what follows is an SQL statement.
In a Perl application, the SQL statement is passed to the DBMS as a string rather than as an embedded executable statement in the Perl code, as follows:
my $sql = "INSERT INTO ENGINES
(EngineName,Displacement,Horsepower,Cylinders, Valves) " . "values('$engname','$cid','$hp','$cyl','valves')";
print "SQL => $sql
" if $DEBUG;
my $sth = $dbh->prepare($sql);
$sth->execute();
This code uses the Perl database interface (DBI), which I mention in Book 5, Chapter 2. If an error is encountered, the offending SQL statement is printed out. If no error is detected, the SQL statement is prepared and then the last line actually executes it.
Once again, with PHP and a MySQL database, the operation is basically the same as with Perl, except that the syntax has been changed to protect the innocent, as in this example:
$query = "INSERT INTO ENGINES
(EngineName, Displacement, Horsepower, Cylinders, Valves)
VALUES
('engname', 'cid', 'hp', 'cyl', 'valves')" ;
mysql_query($query) or die('Error, insert query failed');
The last line checks whether the insert was performed successfully. If not, an error message is displayed.
Unlike Oracle’s Pro*C, and Java, but like Perl and PHP, Visual Basic .NET doesn’t support embedded SQL. Instead, it passes a string containing the SQL statement to the ADO.NET data provider to accomplish the same effect as embedded SQL.
Here’s an example of an SQL operation as Oracle’s Pro*C would do it with embedded SQL, followed by the Visual Basic .NET equivalent using ADO.NET. First, here’s the Pro*C:
EXEC SQL UPDATE VENDOR
SET VendorName = :vendorname
WHERE VendorID = 'PENGUIN';
Here’s the ADO.NET equivalent:
Dim strSQL As String
strSQL = "UPDATE VENDOR SET VendorName = @vendorname "& _
"WHERE VendorID = 'PENGUIN'"
Dim cmd As New SqlCommand(strSQL, cn)
Dim par As SqlParameter
Par = cmd.Parameters.Add("@vendorname",SqlDbType.VarChar, 10)
Par.Value = "VendorName"
Dim InsertRecordsAffected As Integer = cmd.ExecuteNonQuery()
ADO.NET is a library of data access procedures in the .NET Framework.
All .NET languages other than Visual Basic .NET — C#, C++.NET, COBOL.NET, Perl.NET, and so on — use ADO.NET in the same way that Visual Basic .NET does to provide data access to relational databases. ADO.NET eliminates the need to embed SQL code within a procedural application program.
Module language provides another method of using SQL with a procedural programming language. With module language, you explicitly put all the SQL statements in a separate SQL module.
Each SQL procedure contains only one SQL statement. In the host program, you explicitly call an SQL procedure at whatever point in the host program you want to execute the SQL statement in that procedure. You call the SQL procedure as though it were a host language subprogram.
Thus, an SQL module and the associated host program are essentially a way of explicitly doing what the SQL preprocessor for embedded syntax does.
Module language does have several advantages:
Once again, what can be looked at as an advantage from one perspective may be a disadvantage from another. Because the SQL modules are separated from the procedural code, following the flow of the logic isn’t as easy as it is in embedded SQL when you’re trying to understand how the program works.
The syntax for the declarations in a module is as follows:
MODULE [module-name]
[NAMES ARE character-set-name]
LANGUAGE {ADA|C|COBOL|FORTRAN|MUMPS|PASCAL|PLI|SQL|JAVA}
[SCHEMA schema-name]
[AUTHORIZATION authorization-id]
[temporary-table-declarations…]
[cursor-declarations…]
[dynamic-cursor-declarations…]
procedures…
As indicated by the square brackets, the module name is optional. Naming it anyway is a good idea, however, to help keep things from getting too confusing.
The optional NAMES ARE
clause specifies a character set. If you don’t include a NAMES ARE
clause, the default set of SQL characters for your implementation is used.
The LANGUAGE
clause tells the module which language it will be called from. The compiler must know what the calling language is, because it will make the SQL statements appear to the calling program to be subprograms in that program’s language.
Although both the SCHEMA
clause and the AUTHORIZATION
clause are optional, you must specify at least one of them, or you can specify both. The SCHEMA
clause specifies the default schema, and the AUTHORIZATION
clause specifies the authorization identifier. The authorization identifier establishes the privileges you have. If you don’t specify an authorization ID, the DBMS uses the authorization ID associated with your session to determine the privileges your module is allowed. If you don’t have the privilege to perform the operation your procedure calls for, your procedure isn’t executed.
Finally, after all these declarations, the functional parts of the module are the procedures. An SQL module language procedure has a name, parameter declarations, and executable SQL statements. The procedural language program calls the procedure by its name and passes values to it through the declared parameters. Procedure syntax is as follows:
PROCEDURE procedure-name
(parameter-declaration [, parameter-declaration ]… )
SQL statement ;
[SQL statements] ;
The parameter declaration should take the following form:
parameter-name data-type
or
SQLSTATE
The parameters you declare may be input parameters, output parameters, or both. SQLSTATE
is a status parameter through which errors are reported. SQLSTATE
is covered extensively in Book 4, Chapter 4.
Oracle’s implementation of module language, named SQL*Module, is specifically designed to overcome the impedance mismatch (refer to “Challenges of using SQL with an object-oriented procedural language,” earlier in this chapter) between SQL and application programs written in the Ada programming language. SQL*Module compiles SQL module language files. A module language file contains parameterized procedures that encapsulate SQL statements. The SQL*Module compiler translates these procedures into calls to the SQL runtime library on the Oracle server. All the SQL code resides in a separate module. SQL*Module defines the interface between the SQL module and the host program written in Ada.
A module is composed of three parts:
The SQL code that you can put in a module is somewhat restricted. Statements that aren’t supported by SQL*Module include the following:
SELECT
, UPDATE
, DELETE
, and INSERT
COMMIT
, ROLLBACK
, CONNECT
, and DISCONNECT
The fact that SQL*Module doesn’t support DDL statements means that you can’t create database objects such as tables with SQL*Module. You can’t even alter the structure of an existing table. Thus, the database must exist and be in its final form before you try to operate on it with an application program that uses SQL*Module.
3.135.217.228