Chapter 3

Getting to Know SQL

IN THIS CHAPTER

Bullet Seeing where SQL came from

Bullet Seeing what SQL does

Bullet Looking at the ISO/IECSQL standard

Bullet Seeing what SQL doesn’t do

Bullet Examining your SQL implementation options

In the early days of relational database management systems (RDBMS), there was no standard language for performing relational operations on data. (If you aren’t sure what an RDBMS is, please take a look at the first chapter in this book.) A number of companies came out with relational database management system products, and each had its own associated language. There were some general similarities among the languages because they all performed essentially the same operations on the same kinds of data, structured in the same way. However, differences in syntax and functionality made it impossible for a person using the language of one RDBMS to operate on data that had been stored by another relational database management system. (That’s RDBMS, if you missed it the first time.) All the RDBMS vendors tried to gain dominant market share so that their particular proprietary language would prevail. The logic was that once developers learned a language, they would want to stick with it on subsequent projects. This steaming cauldron of ideas set the stage for the emergence of SQL. There was one company (IBM) that had more market power than all the others combined, and it had the additional advantage of being the employer of the inventor of the relational database model.

Where SQL Came From

It is interesting to note that even though Dr. Codd was an IBM employee when he developed the relational database model, IBM’s initial support of that model was lukewarm at best. One reason might have been the fact that IBM already had a leading position in the database market with its IMS (Information Management System) hierarchical DBMS. (For the whole hierarchical versus relational divide, check out Book 1, Chapter 1.) In 1978, IBM released System/38, a minicomputer that came with an RDBMS that was not promoted heavily. As a result, in 1979, the world was introduced to a fully realized RDBMS by a small startup company named Relational Software, Inc. headed by Larry Ellison. Relational’s product, called Oracle, is still the leading relational database management system on the market today.

Although Oracle had the initial impact on the market, other companies, including IBM, quickly followed suit. In the process of developing its SQL/DS relational database management system product, IBM created a language, code-named SEQUEL, which was an acronym for Structured English Query Language. This moniker was appropriate because SEQUEL statements looked like English-language sentences, but were more structured than most casual speech.

When it came time for IBM to actually release its RDBMS product, along with its associated language, IBM’s legal department flagged a possible copyright issue with the name SEQUEL. In response, management elected to drop the vowels and call the language SQL (pronounced ess cue el). The reference to structured English was lost in the process. As a result, many people thought that SQL was an acronym for Structured Query Language. This is not the case. In computer programming, a structured language has some very well-defined characteristics. SQL does not share those characteristics and is thus not a structured language, query or otherwise.

Knowing What SQL Does

SQL is a software tool designed to deal with relational database data. It does far more than just execute queries. Yes, of course you can use it to retrieve the data you want from a database, using a query. However, you can also use SQL to create and destroy databases, as well as modify their structure. In addition, you can add, modify, and delete data with SQL. Even with all that capability, SQL is still considered only a data sublanguage, which means that it does not have all the features of general-purpose programming languages such as C, C++, C#, or Java.

SQL is specifically designed for dealing with relational databases, and thus does not include a number of features needed for creating useful application programs. As a result, to create a complete application — one that handles queries as well as provides access to a database — you must write the code in one of the general-purpose languages and embed SQL statements within the program whenever it communicates with the database.

The ISO/IEC SQL Standard

In the early 1980s, IBM started using SQL in its first relational database product, which was incorporated into the System/38 minicomputer. Smaller companies in the DBMS industry, in an effort to be compatible with IBM’s offering, modeled their languages after SQL. In this way, SQL became a de facto standard. In 1986, the de facto standard became a standard de jure when the American National Standards Institute (ANSI) issued the SQL-86 standard. The SQL standard has been continually updated since then, with subsequent revisions named SQL-89, SQL-92, SQL:1999, SQL:2003, SQL:2008, SQL:2011, and SQL:2016. Along the way, the standard became accepted internationally and became an ISO/IEC standard, where ISO is the International Organization for Standardization, and IEC is the International Electrotechnical Commission. The internationalization of the SQL standard means that database developers all over the world talk to their databases in the same way.

Knowing What SQL Does Not Do

Before I can tell you what SQL doesn’t do, I need to give you some background information. In the 1930s, computer scientist and mathematician Alan Turing defined a very simple machine that could perform any computation that could be performed by any computer imaginable, regardless of how big and complex. This simple machine has come to be known as a universal Turing machine. Any computer that can be shown to be equivalent to a universal Turing machine is said to be Turing-complete. All modern computers are Turing-complete. Similarly, a computer language capable of expressing any possible computation is said to be Turing-complete. Practically all popular languages, including C, C#, C++, BASIC, FORTRAN, COBOL, Pascal, Java, and many others, are Turing-complete. SQL, however, is not.

Note: Whereas ISO/IEC standard SQL is not Turing-complete, DBMS vendors have added extensions to their versions which are Turing complete. Thus the version of SQL that you are working with may or may not be Turing-complete. If it is, you can write a whole program with it, without embedding your SQL code in a program written in another language.

Because standard SQL is not Turing-complete, you cannot write an SQL program to perform a complex series of steps, as you can with a language such as C or Java. On the other hand, languages such as C and Java do not have the data-manipulation facilities that SQL has, so you cannot write a program with them that will efficiently operate on database data. There are several ways to solve this dilemma:

  • Combine the two types of language by embedding SQL statements within a program written in a host language such as C. (I discuss this in Book 5, Chapter 3.)
  • Have the C program make calls to SQL modules to perform data-manipulation functions. (I talk about this in Book 5, Chapter 3 as well.)
  • Create a new language that includes SQL, but also incorporates those structures that would make the language Turing-complete. (This is essentially what Microsoft and Oracle have done with their versions of SQL.)

All three of these solutions are offered by one or another of the DBMS vendors.

Choosing and Using an Available DBMS Implementation

A number of relational database management systems are currently available, and they all include a version of SQL that adheres, more or less, closely to the ISO/IEC international standard for the SQL language. No SQL version available today is completely compliant with the standard, and probably none ever will be. The standard is updated every few years, adding new capability, putting the vendors in the position of always playing catch-up.

In addition, in most cases, the vendors do not want to be 100 percent compliant with the standard. They like to include useful features that are not in the standard in order to make their product more attractive to developers. If a developer uses a vendor’s nonstandard feature, this has the effect of locking in the developer to that vendor. It makes it difficult for the developer to switch to a different DBMS.

Microsoft Access

Microsoft Access is an entry-level DBMS with which developers can build relatively small and simple databases and database applications. It is designed for use by people with little or no training in database theory. You can build databases and database applications using Access, without ever seeing SQL.

Access does include an implementation of SQL, and you can use it to query your databases — but it is a limited subset of the language, and Microsoft does not encourage its use. Instead, they prefer that you use the graphical database creation and manipulation tools and use the query-by-example (QBE) interface to ask questions of your database. Under the hood and beyond user control, the table-creation tasks that the user specifies using the graphical tools are translated to SQL before being sent to the database engine, which is the part of the DBMS that actually operates on the database.

Microsoft Access runs under any of the Microsoft Windows operating systems, as well as Apple’s OS X, but not under Linux or any other non-Microsoft operating system.

To reach the SQL editor in Access, do the following:

  1. Open a database that already has tables and at least one query defined.

    You see a database window that looks something like Figure 3-1, with the default Home tab visible. The icon at the left end of the ribbon, sporting the pencil, ruler, and draftsman’s triangle, is the icon for Design View, one of several available views. In this example, the pane on the left side of the window sports a Queries heading and several queries are listed below it.

  2. (Optional) If Queries are not listed in the pane on the left, click on the downward-pointing arrow in the pane’s heading and select Queries from the drop-down menu that appears.
  3. Select one of the displayed Queries.

    I have selected, for example, Team Membership of Paper Authors.

  4. Right-click the selected query.

    Doing so displays the menu shown in Figure 3-2. This menu lists all the things you can do with the query you have chosen.

  5. Choose Open from the displayed menu.

    This executes the query and displays the result in the right-hand pane, as shown in Figure 3-3. The result is in Datasheet View, which looks very much like a spreadsheet.

  6. Pull down the Views menu by clicking on the word View (right there below the pencil, ruler, and triangle icon).

    Figure 3-4 shows the result.

  7. Choose SQL View from the View drop-down menu.

    Doing so shows the view displayed in Figure 3-5. It is the SQL code generated in order to display the result of the Team Membership of Paper Authors query.

    As you can see, it took a pretty complicated SQL statement to perform that Team Membership query.

    This early in the book, and I know many of you do not know any SQL yet. However, suppose you did. (Not an unfounded supposition, by the way, because you certainly will know a lot about SQL by the time you’ve finished reading this book.) On that future day, when you are a true SQL master, you may want to enter a query directly using SQL, instead of going through the extra stage of using Access’ Query by Example facility. Once you get to the SQL Editor, which is where we are right now, you can do just that. Step 8 shows you how.

  8. Delete the SQL code currently in the SQL Editor pane and replace it with the query you want to execute.

    For example, suppose you wanted to display all the rows and columns of the PAPERS table. The following SQL statement will do the trick:

    SELECT * FROM PAPERS ;

    Figure 3-6 shows the work surface at this point.

  9. Execute the SQL statement that you just entered, by clicking on the big red exclamation point in the ribbon that says Run.

    Doing so produces the result shown in Figure 3-7, back in Datasheet View.

Screenshot of A Microsoft Access 2016 database window.

FIGURE 3-1: A Microsoft Access 2016 database window.

Screenshot of menu of possible actions for the query selected.

FIGURE 3-2: Menu of possible actions for the query selected.

Screenshot of result of team membership of paper authors query.

FIGURE 3-3: Result of Team Membership of Paper Authors query.

Screenshot of the pulling down of the Views menu.

FIGURE 3-4: The Views menu has been pulled down.

Screenshot of the SQL Editor window for the Team Membership of Paper Authors query.

FIGURE 3-5: The SQL Editor window, showing SQL for the Team Membership of Paper Authors query.

Screenshot of the queries in the PAPERS table.

FIGURE 3-6: The query to select everything in the PAPERS table.

Screenshot of the result of the query to select everything in the PAPERS table.

FIGURE 3-7: The result of the query to select everything in the PAPERS table.

Microsoft SQL Server

Microsoft SQL Server is Microsoft’s entry into the enterprise database market. It runs only under one of the various Microsoft Windows operating systems. The latest version is SQL Server 2017. Unlike Microsoft Access, SQL Server requires a high level of expertise in order to use it at all. Users interact with SQL Server using Transact-SQL, also known as T-SQL. It adheres quite closely to the syntax of ISO/IEC standard SQL and provides much of the functionality described in the standard. Additional functionality, not specified in the ISO/IEC standard, provides the developer with usability and performance advantages that Microsoft hopes will make SQL Server more attractive than its competitors. There is a free version of SQL Server 2017, called SQL Server 2017 Express Edition, that you might think of as SQL Server on training wheels. It is fully functional, but the size of database it can operate on is limited.

IBM DB2

DB2 is a flexible product that runs on Windows and Linux PCs, on the low end all the way up to IBM’s largest mainframes. As you would expect for a DBMS that runs on big iron, it is a full-featured product. It incorporates key features specified by the SQL standard, as well as numerous nonstandard additions. As with Microsoft’s SQL Server, to use DB2 effectively, a developer must have received extensive training and considerable hands-on experience.

Oracle Database

Oracle Database is another DBMS that runs on PCs running the Windows, Linux, or Mac OS X operating system, and also on very large, powerful computers. Oracle SQL is highly compliant with SQL:2016.

SQL Developer is a free graphical tool that developers can use to enter and debug Oracle SQL code.

A free version of Oracle, called Oracle Database 18c Express Edition, is available for download from the Oracle website (www.oracle.com). It provides a convenient environment for learning Oracle. Migration to the full Oracle Database 11g product is smooth and easy when you are ready to move into production mode. The enterprise-class edition of Oracle hosts some of the largest databases in use today. (The same can be said for DB2 and SQL Server.)

Sybase SQL Anywhere

Sybase’s SQL Anywhere is a high-capacity, high-performance DBMS compatible with databases originally built with Microsoft SQL Server, IBM DB2, Oracle, and MySQL, as well as a wide variety of popular application-development languages. It features a self-tuning query optimizer and dynamic cache sizing.

Remember Tuning queries can make a big difference in their execution time. Tuning a query means making adjustments to it to make it run faster. Dynamic cache sizing means changing the size of the cache memory available to a query, based on the resources that the query needs to run as fast as possible. I talk about query tuning in Chapter 2 of Book 3.

MySQL

MySQL is the most widely used open source DBMS. The defining feature of open source software is that it is freely available to anyone. After downloading it you can modify it to meet your needs, and even redistribute it, as long as you give attribution to its source.

There are four different versions of MySQL, each with a different database engine and different capabilities. The most feature-rich of these is MySQL InnoDB. People often use one or another of the MySQL versions as the back ends for a large number of data-driven websites. The level of compliance with the ISO/IEC SQL standard differs between versions, but the compliance of MySQL InnoDB is comparable to that of the proprietary DBMS products mentioned here.

MySQL is particularly noted for its speed. It runs under Windows and Linux, but not under IBM’s proprietary mainframe operating systems. MySQL is supported by a large and dedicated user community, which you can learn about at www.mysql.com. MySQL was originally developed by a small team of programmers in Finland, and was expanded and enhanced by volunteer programmers from around the world. Today, however, it is owned by Oracle Corporation.

PostgreSQL

PostgreSQL (pronounced POST gress CUE el) is another open source DBMS, and it is generally considered to be more robust than MySQL, and more capable of supporting large enterprise-wide applications. It is also supported by an active user community. PostgreSQL runs under Linux, Unix, Windows, and IBM’s z/OS mainframe operating system.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.225.255.134