Chapter 7. Web Databases

Behind all major Web programming efforts there is usually a database of some kind in which to store persistent information about the site and its content. This might extend to username and password combinations as well as email addresses and such.

More often than not, however, it is simply used in content-management systems, to store the text and formatting requirements for the content. This usually includes categories and tags to describe each article (or blog entry), as well as the hierarchy of pages and links between them. Using a database makes it much easier to effect changes, as you’ll see.

Games, Web shops, blogs, and everything else that needs to store persistent data will likely do so using a back-end database. Prices for shops, comments for blogs, and the game state and data that describes the current game environment will also be contained in databases. The reason for this is simple: it makes it very easy to manipulate the data, both from the point of view of the setup and maintenance, and also for changing the data within Web programs (such as those programmed in PHP).

You’re not expected to be a programmer or systems engineer; the people behind database systems make it very easy to get started. You already have something of a grounding in database access, having looked at how this is achieved from PHP in the last chapter.

Additionally, the JavaScript chapter discussed how data that is to be sent can be validated before being put in a database, which is part of the mechanism. Then, again in the PHP chapter, you saw how that data can then be sent to the Web server to generate a response.

Behind all the scripting is the database where all the data is stored, which is a piece of software (in the same way that the Web server is a piece of software), that gives a usable front-end to the collection of data that your Web application needs.

In this chapter, you’ll learn how to:

  • Design simple databases

  • Set up a MySQL database

  • Query the database to retrieve information

  • Insert information

  • Create tables using PHP

It is also useful to know how the database is put together and how to access it without needing scripting via a special interface. In other words, you will learn how to access the database programmatically and non-programmatically.

The reason for knowing the two access methods is that most CMSs (content-management systems) that you’ll look at in the next chapter are based on a database. Sometimes, it’s useful to be able to alter those databases without using actual PHP code, to reduce the programming overhead.

Points to note—the database has its own language to communicate with it, so that’s the fourth language reference that the reader will have at their fingertips. This language is called SQL (Structured Query Language), and can be used for everything from managing the structure of the database to retrieving data from it and inserting or removing data.

The non-programmer will find it easy to follow, while more experienced programmers might find some of it a bit simplistic, but it’s worth starting from simple foundations. You’ll learn more complex techniques, of course, as you get to know the language more deeply. What you will learn here are solid foundations to build that knowledge on—and just enough of Web databases to be able to do something really useful.

Databases for Non-Programmers

A database is a place to store information in a way that can be easily retrieved. Modern systems tend to use something called relational database systems. The key to relational databases is that they are easy to extend, and, as long as you follow some simple guidelines, that extension will not mean changing your code or the underlying database structure.

The term relation comes from the fact that all the data that is stored in the database is structured around some well-defined categories. These are set by the user in advance, so clearly there is a small design component to setting up the database system.

To do that, you need to understand how the database is organized. A database contains a set of tables. Each named table contains a structure that is divided into rows and columns—each column is a category (relation) that defines a piece of data.

Each row in the table contains a unique combination of fields (column data) that describe that row. Data can be inserted, removed, retrieved, and modified based on this relationship among the row, column, and field.

Behind the scenes, the relation dataset system itself offers a lot of management capabilities that maintain the relations between all the pieces of data. You won’t use much of the relational database management directly; instead, you’ll treat the database that is behind the SQL front-end as being structured in the way shown in Figure 7.1.

Database schematic

Figure 7.1. Database schematic

From the schematic in Figure 7.1, you should note the following:

  • A database has multiple tables

  • A table has multiple rows

  • Each row is divided into columns

The database, table, and the columns have names, which is the easiest way to reference them. Rows are typically numbered, but there is never any great need to reference a row directly by number, just that it is the nth row in the current series being viewed, which is usually an interesting subset of all the rows that the programmer wants to look at.

Each of the named columns also has a set of attributes that defines the kind of data that you want to store in there, as well as some system attributes that tell the database what internal settings the columns should have. Without this information, it is not possible for the database system to maintain the relationships correctly between all the pieces of data.

These attributes are often called the domain of possible values—in other words, you might define a column as containing only numerical data, allowing the system to use it for calculations, or you might decide that it should be string data, allowing for full text search.

To help the database system further, the database design usually gives a number of constraints that limit the data value within a specific domain. This allows the programmer to indicate, ahead of time, whether a field is to be unique in the table—the system would reject any attempt to create two rows having the same value for a specific column. This is useful when customer IDs, for example, are used to identify uniquely customers in the system.

It is important to remember that database installations on Web hosts usually limit the number of actual databases that you are allowed to have. This is vital because each kind of add-on (such as a forum, CMS, blog, shopping system, and so on) that the Website includes must have a database all to itself (usually).

Each database, therefore, should refer to a single data domain—one for each kind of interface between the user and the system. This keeps the information conceptually separated, making maintenance easier to manage for the overall system.

Before any data can be processed, however, the databases need to be designed in a way that is logical, easy to access, and efficient, and that maximizes the potential for speedy data extraction when needed. There is nothing worse than a badly designed database for sapping system resources and slowing down the whole site.

Database Design

Before you begin this section in earnest, it’s important to note that there are whole college courses dedicated to the subject of database design and management. Clearly within the confines of the book, and the “Just Enough” philosophy, this section does not cover everything that they do.

The good news is that you don’t need to know everything about database design—Web databases need to be small, efficient, and manageable, so unless you’re setting up a system that is the size of Amazon or Google, the database design, although important, is not the biggest single programming issue.

What is important is that the database be:

  • Easy to reference (for the programmer)

  • Compact and efficient

  • Complete but extendable

The first of these is obvious—the database needs to have logical names for the tables and columns, the data that is stored within each table needs to be grouped logically, and building a piece of code to retrieve that data should not be cumbersome. The more difficult the database is to interface with, the harder it will be to remember how to interface with it.

So, the data structures have to be correct. This will also help each table be compact and efficient. Big tables take longer to access, and the more columns a table has, the more this will become apparent. Modern systems can optimize this to a certain extent, by using indexes and other methods to improve access time, but there is still something to be said for keeping a table compact.

Last but not least, the information has to be complete now, but structured in a way that makes it logically extendable in the future. It is highly unlikely that the requirements of the database will be static through its lifespan, so it is better to design the table with extendibility in mind, while trying to keep the data description as complete as possible.

Essentially, at some point, a line will have to be drawn, and the actual implementation begun; otherwise, an inventive mind can go on refining the data scheme almost indefinitely. This process is known as database design. To borrow from Wikipedia:

“Database design is the process of producing a detailed data model of a database. This logical data model contains all the needed logical and physical design choices and physical storage parameters. . . . A fully attributed data model contains detailed attributes for each entity.”[http://en.wikipedia.org/wiki/Database_design]

So, it’s time to take out a pen and paper (or a text document) and create the database design. Again, there are whole languages and diagramming conventions given over to this topic alone, and this section skims the subject matter lightly. For a start, I gloss over the physical characteristics of the database, because, hopefully, these will not matter and will usually be beyond your control in a Web environment.

However, it is important to understand the attributes of, and relationships between, pieces of data that are stored in the database before creating a schematic of the database design itself.

Understanding Attributes and Relationships

In database design, you hear a lot about attributes (the type of data) and relationships (the position of that data within the structure). The following is a stripped–down conceptual explanation of some terms that will help you design a database.

The exact notation that you choose is, for the purposes of this exercise, entirely a question of convenience. It is better to feel comfortable with the notation than have it forced upon you. However, feel free to use a layout using boxes and lines that helps you to understand what the structure should be.

The simple table design in Figure 7.2 shows a starting point.

Simple table design

Figure 7.2. Simple table design

The Customer table contains a column for Name and Customer ID. The attributes are in brackets and show the kind of data you expect to store in them. The

Order table has a similar layout. For each item there is an attribute in brackets, and each element is contained within its own box.

This simple layout allows you to see quickly what data fields are part of what tables, and this relationship is a has-a relationship. An entry in the Customer table has-a Name and ID, the Order table has-a row containing order details, and so on.

The relationship between the data elements and their attributes is an example of an is-a relationship. A Name is-a string, a Customer ID is-a number. In Figure 7.2, you can also see that there is a relationship between two items, where the tables have them in common—the Customer ID is an index that links the Customer table and the Order table.

This link hints at a relationship between the two tables. In fact, it is an example of a one-to-many relationship, because one row in the Customer table can reference many rows in the Order table, with each one sharing the same Customer ID.

Thus, the relationship between the Order table and the Customer table is the inverse—a many-to-one relationship. Many rows in the Order table can refer to a single row in the Customer table. If a customer has only placed one order, that is an example of the last relationship you’ll look at—the one-to-one relationship.

Understanding these key concepts will be very useful when designing the structure for the database. After all, although the structure can be extended after the database has been built, the tables and the concepts cannot easily be changed. So, it is important to get them right from the outset.

Database Design Guidelines

There are five key problem areas that need to be addressed in order to avoid issues with the database once it moves from design to implementation (actually making it) and interacting with it. This last point is very important because creating the database is only half the story—you need to be able to talk to it as well, and this means remembering its structure and conventions.

The key potential problem areas are as follows:

  • Poor planning

  • Illogical names

  • No documentation

  • Incorrect domain splits

  • Insufficient testing

The way to deal with poor planning is to create diagrams of the data structure and check that it looks logical and that all the data areas are covered. This sounds obvious, but sometimes a database design will change radically from the creators’ original vision to the final diagram. To help build the diagram, it is important to stick with a convention.

Part of this convention is to avoid illogical names. A table or column name has to reflect the data stored inside it, possibly the type of data, as well as its purpose, and be easy to remember. A good naming convention will be intuitive and prevent the necessity to list the tables and their properties every time you need to refer to one of the columns.

However, to help with this, it is a good idea to solve problem number three—no documentation. Like all programming tasks, databases should be well documented so that any programmer can refer to the descriptions rather than having to look up the actual structure. Of course, for this to work, the documentation also has to be kept up to date.

Another advantage of documenting the design properly (and the initial diagrams count as part of that documentation) is that any incorrect domain splits will be found. This happens when information that should be in one domain finds its way into another. One example of this might be that the customers order IDs are listed in the Customer table rather than the Order table.

At first glance, this might seem logical; after all, for each customer, you can build a list of orders. However, it make more sense to keep the data in the order domain (in this case, an order number) with the orders and link it to the customers via the Customer ID. This is the data that both tables share.

Finally, databases, like everything else, need to be tested. The design can be tested by writing out use cases, which detail what data is to be stored as the result of a Web operation and where it will end up. The advantage of testing in this way is that it helps test the concept and feeds back into the design, so that before any concrete work is done, the database is correctly formed.

Some of these simulations will highlight the need for data normalization, as a way to improve the structure of the data stored in the database.

Normalization

Many of the key problem areas can be addressed by paying attention to something known as normalization. This can be defined as follows:

“Normalization defines a set of methods to break down tables to their constituent parts until each table represents one and only one ‘thing,’ and its columns serve to fully describe only the one ‘thing’ that the table represents.”

[http://www.simple-talk.com/sql/database-administration/ten-common-database-design-mistakes/]

An example of data that needs to be normalized is the use case that leads to customer and payment details being listed in the same table. It might make conceptual sense to have the address, phone number, and other details from the customer record next to the payment details, but this information is from two different domains.

One domain is the collection of data that relates to all the customers, and the other domain is the collection of data that relates to the payments. If you think about it, these are two different collections of data that should be stored separately. An address book and an order book, for example, in the real world, are kept apart.

The reason for this is that there will be many addresses and many payments, and although each customer might have only one set of address information, the customer might make many payments. If the data domains were mixed, it would become difficult to express these relationships, short of listing all the payments in a single table, along with the address information.

What then happens when you want to discard the payment data, or reorder the addresses, or even worse, collect all the payment data together, without the addresses as part of an accounting package? For these reasons, data domains should be kept apart as part of the process of normalization.

Normalizing the data, by finding the common element that links them (the Customer ID, for example), will lead to a better design for the data structure. Finding this relation and splitting the data up into separate tables is all part of the normalization process.

Clearly there is more to normalization than this initial overview, but the basic concept should be clear:

  • Identify data in the wrong domain

  • Find the item that links the domains

  • Split the data into two or more related tables

This last point is important, because it may well be that the only way to normalize the data correctly is to create more than two tables and cross reference between them. For most simple databases, this will not be necessary, but it is useful to remember that it is not something that is prohibited—a table can be split into multiple domains.

Creating the Design

Once the rough diagrams have been sketched out and the first normalization is complete, it is time to create the actual design that will form the basis for the ongoing documentation. The design should include, as a minimum:

  • Entity relationship diagrams

  • Data dictionary

  • Data relationship table

  • Conventions

The entity relationship diagrams need to show the high-level relationship between the tables, leaving out the individual data elements but showing the links between tables that identify relationships between data elements. The second set of diagrams should show the structure of each individual table, and the data elements and their attributes, in a similar form to the simple table design shown in Figure 7.2.

Tip

If the relationships cannot be shown between the tables due to size limitations, this does not matter as long as the overview diagram shows where those links might be, perhaps annotated with the name of the data element (field) and attribute that links them.

The data dictionary lists the individual data elements, by name, in alphabetical order. It should list, for each data element, the following:

  • The name—Following the conventions

  • Table—Where the element occurs

  • The type—Kind of data: text, number, and so on

  • The size (length) of data—Size of number, text length, and so on

  • Value constraints—Unique for the table, key value, and so on

The data relationship table lists, by entity, which tables it links, where appropriate. This is just a simple table that defines the relationships and is a text representation of the entity relationship diagrams (overview combined with individual).

Finally, any conventions used in naming and data domain normalization should be listed. This is a helpful reminder for the future when changes need to be made, and also in case the ongoing development needs to be handed on to another person or organization.

This might seem like a lot of work at first, but creating adequate documentation will help when it comes to developing the PHP scripts that will access the data. Moreover, the documentation will be invaluable should the design need to be extended.

Once the design is done, it is time to put the database together, using facilities that are available at most good Web hosts. Chapter 10 lists the items to look for when choosing a host and describes how to get the initial environment up and running. For now, assume that MySQL is installed (by the Web host) and that phpMyAdmin will be used to manage it. Again, this is a standard setup offered by most Web hosts.

Using MySQL

Generally speaking, there are three ways to access a MySQL database system:

  • Locally

  • Using the MySQL command-line software over the Internet

  • Through a Web interface

The first way (locally) applies only to systems that are running locally to the programmer or where an application such as Telnet is used to log in remotely to the server. This is generally available only to advanced users and those using specific packages geared towards providing a complete system that is remotely accessed.

The second interface uses the software provided by MySQL to access the database and is a command-line interface application. It can be used to run queries against the database (covered later) and interact with the system in a text-only interface. It is not as convenient for beginners as the last option.

The most common option for Web programming is through a Web interface, which uses an interface like phpMyAdmin, available on most Web hosts that use cPanel (an online Website management application that you’ll look at in detail in Chapter 10). This is the preferred option, and the one discussed in this chapter. The advantage is that it is used through a standard browser and provides an intuitive interface for manipulating the database.

The one you use will depend on the installation that has been done on your behalf—given the choice, I would strongly recommend the last option as being the easiest, even for advanced users. Its only downside is that it’s subject to the connection speed between the client and the server as well as the performance of that server.

Because of this, it is worth knowing about the MySQL client solution, in case the server on the other side cannot keep up with the phpMyAdmin load.

Local Login/MySQL Client

The application is found in the /usr/local/mysql/bin directory, on Linux/UNIX-based systems. Under Windows, the program is called mysql.exe and is located by default in the C:mysqlin directory.

It is a command-line package and can connect locally or over the Internet. It provides a simple interface to the database system. All the queries that you see in this chapter can be run through this text interface.

The program is invoked with the following command:

mysql -h <hostname> -u <username> -p

Valid values for <hostname> include localhost or the IP address of the machine hosting the database server. The–u option must be followed by a valid username, and the–p asks the database server to prompt for the password.

(The point is true for all databases that require a username and password to log in to MySQL. Any Web database provided by a third-party host will usually be set up this way; however, if you are running your own MySQL and Web server, you might allow local access without a username and password. Best practices, however, would seem to indicate that a username and password should always be used to secure the database.)

If everything goes according to plan, the users are then presented with a prompt where they may enter text, and press Return to send the command to the server. When they have finished, they should type quit to close the connection.

A similar procedure is used for remote access over the Internet using Telnet. This requires the use of a Telnet client to Telnet into the host and then run the mysql command in order to access the database locally. This is not typically offered by most Web host providers, but if you’re using this book in a commercial environment, it might be available.

phpMyAdmin

phpMyAdmin is a set of PHP scripts that provide access to a MySQL database installation through a series of Web pages. There are facilities to log in to, maintain, query, and test the database installation.

The main phpMyAdmin window is shown in Figure 7.3. It displays the standard options. Different installations may differ in layout slightly, but they all follow the same basic theme.

The phpMyAdmin main window

Figure 7.3. The phpMyAdmin main window

The left panel of the screen shows the objects in the system that the user can control. At present, it is empty, apart from a system table, which should not be interfered with.

The main part of the screen shows the current options that can be performed within the context of the items selected on the left side. This screen changes depending on which database or table is selected in the left pane. When more than one set of options is available, a menu is displayed across the top of the main pane.

The phpMyAdmin tool is accessed from the cPanel databases section, shown in Figure 7.4.

cPanel databases section

Figure 7.4. cPanel databases section

You’ll look at the full flow—from creating databases, adding tables, and more—later, but first take a look at the alternatives that are available apart from MySQL.

Alternative Database Packages

Considering the pace of change on the Internet, you might very well find that your Web host supports one of the alternatives to MySQL. There are quite a few on the market, of which the big four seem to be:

  • PostgreSQL—Also Open Source

  • SQL Server from Microsoft

  • Oracle—Perhaps the biggest hitter in the commercial database world

  • IBM DB2—Another commercial offering

Oracle and DB2 are for very large high-end database installations, generally speaking, and have a price tag to match. However, they have both been released in free versions that are limited in the support they give and therefore less attractive to Web hosting providers.

SQL Server from Microsoft is also a good, solid database system for medium to large Web databases and has the advantage of tight integration with the .NET Web environment. If you are using a Microsoft-based Web host, you will likely be using SQL Server.

They all support SQL and can all be connected to via PHP, but they have slightly different ways to manage them. Not all of the systems can be managed over the Web, for example. This is not the place to debate the pros and cons, but is the place to point out that there are others so you’re aware of them.

PostgreSQL

In some senses, PostgreSQL is considered to be the big brother of MySQL. It is faster in some areas, slower in others. However, it is more difficult to set up than MySQL, requiring compilation from source in some cases. The difficulty level remains high, whether the platform is Windows or Linux/UNIX-based.

The new version of MySQL (5.0 at time of writing) fixes all the previous differences between PostgreSQL and MySQL that made PostgreSQL more attractive. This includes support for views, stored procedures, and cursors, which are all intermediate to advanced features.

The bottom line is that MySQL has a bigger user base and used to be less powerful. However, it is catching up, but at the same time, any issues that PostgreSQL did have are being solved, so they effectively meet in the middle.

For Web databases, MySQL is probably better and has certainly been more widely implemented.

Oracle

Oracle is for high transaction volumes with big databases and big machines, with the emphasis firmly on carrying out the transactions safely. It is a good solution for large organizations with complex data needs.

Oracle is both commercial and very expensive. Although it comes with many utilities to help put it all together, it still a pretty time consuming and painful process. The final result also requires tuning to get the best out of it, but once tuned, it is blazingly fast for high transaction volumes.

It is not used on many Web hosts because of the price, and it is very memory intensive. This means that if your system is running many other processes at the same time, they will take a bit performance hit because of Oracle’s requirements. Either that, or the Oracle system itself will take a big performance hit due to scarcity of resources.

There is a free version for those who feel like experimenting, but the free version has limits on the maximum number of users and so will only be appropriate for single user installations.

IBM DB2

Finally, DB2 is again a commercial offering. It is a bit big and clunky with a lot of legacy features that make it cumbersome. However, at the top end of data storage solutions, it is still considered to be very attractive for large-scale organizations.

Like Oracle, it is blazingly fast. To get the most out of it, you must run it on a dedicated IBM server, although it does run happily, if at a lower performance, on Linux.

Like Oracle, there is a free version. The free version has no limits on users, but instead has limits on the CPU and memory that it is willing to use. This means that for all but the smallest installations, it will probably not be quite enough.

Hence, Web hosts will not tend to choose it, but for a single user local installation, it might be an attractive proposition.

Given the propensity of Web hosts (both free and commercial) to choose MySQL, this is the database platform covered in this chapter.

SQL Server

There really isn’t much to add about Microsoft’s SQL Server. It is commercial and was released in a kind of mini-version with several useful features missing, such as full text search, as a Desktop Edition.

Because the majority of cheap and free Web hosts tend to avoid Microsoft operating systems, chiefly because of the high requirements that the operating system places on the server and the fact that licensing is relatively expensive (especially compared with freely available OSs like Linux), MS SQL rarely gets a look.

However, if you are running a Microsoft Web server and can afford it, MS SQL is a good product and integrates well with the Web environment. Given that MySQL is also available for Windows, it might be a better route for those just getting started, for cost reasons alone.

The SQL Language

In the same way that the language for writing Web pages is HTML and that the server can be made to produce that HTML using PHP, a database has its own communication language, called SQL (Structured Query Language).

SQL is used for all manipulation of databases, from constructing tables to putting data into them and taking data out. There are also some user management functions, but the “Just Enough” Web programmer will not need these. In case you do, they are available through the much prettier phpMyAdmin interface.

The same is true for the table creation and manipulation functions, which you’ll look at through the phpMyAdmin interface, because it is so much more intuitive. Before learning how to manipulate the data using SQL, though, you need to look at the theory of operation.

Theory of Operation

The reason I cover this subject first, and don’t discuss creating databases straight away, is that it is useful to know how you’re going to manipulate the data before you set about making the tables. Knowing how to put data in, take it out, and otherwise operate on it gives you a good idea about how to create the tables that will store it.

SQL transactions (a request and result pair) are written using queries. A query is just a well formed instruction that tells the database what it should do, to what table, and with what data, as well as the kind of result it should return.

To put data in the database, you use the INSERT query. This needs to specify the table, columns, and data that will be placed into the database. To see what data is in the database, you use the SELECT query, which needs to provide the database system with some parameters such as the table name and what to look for (as well as which columns to look at).

If you need to change data, you use an UPDATE query, which operates on one or more columns in a table, based on some input criteria. If you want to update only a single row, that query data has to identify that row uniquely. Otherwise, you risk running the same update on multiple rows, which is why it is useful to have a key value that’s unique across rows.

Finally, to remove data, you use the DELETE query, again with values that should identify the data that you want to remove. The entire table can be emptied with the DROP query, but this should be done via the phpMyAdmin interface.

Inside the query values, you use wildcards when you want to match multiple presentations of the same piece of information. For example, assuming you want to retrieve varying entries in the Name column:

"Ban%"

Matches Banks, Ban, Bananarama, and so on

"%son"

Matches Johnson, Anderson, Jameson, and so on

The underscore character (_) can be used in the same way to match single letters. These wildcards are used very frequently in queries because you’ll quite often want to return a variety of values so that the users can select one.

There is also a concept known as joining which allows the programmer to create a set of result rows that is a combination of two tables in a single query. The JOIN returns the columns that have been specified in the query, but only one row per result.

Finally, a very useful command is the SHOW DATABASES; instruction. It tells the system to describe all the databases that it has under its control, as well as their structure. These queries can be used within a PHP script, entered interactively via the MySQL interface, or, more frequently in this context, entered through phpMyAdmin.

A final note—a query is terminated by a semicolon, and any values are contained in double or single quotes. Groups of values (that is, column names) are contained within parentheses.

Attributes and Data Types

Before you look at how to extract data in detail, you need to understand the mechanism that MySQL provides to define the kind of data that can be stored, called data types. Different data types are supported by different platforms; what follows here is true for MySQL.

MySQL only has built-in data types and does not allow user-defined data types. Even so, there are many data types that most programmers will never use. Only the most common ones are explained here in detail.

There are a few points to remember. Once a column has a data type associated with it, it can’t be easily changed. The designer must choose wisely to avoid potentially having to do a lengthy export and import operation to rebuild the table with new column types.

Part of this is choosing the correct sizing of a data type, sometimes known as an attribute. MySQL offers one piece of very useful functionality in this respect—the ability to resize the data type. This is called attribute promotion.

The process is also well documented and implemented and is simply a back up and restore process, restoring to a column with a larger data type. The data types used in this book are the basic integer, floating point, text, and date types, but there are plenty more that are outside the scope of this book.

A summary of the types follows:

  • INTEGER—A normal sized integer can be unsigned to go up to 4 billion, or –2 to +2 billion otherwise.

  • BIGINT—Signed or unsigned, this can represent values that are far bigger than most users are ever likely to need.

  • FLOAT—Signed or unsigned, single precision, but still more than ample.

  • DOUBLE—Double precision for mathematical operations, this type is not usually needed.

  • DATE—The date, in a predefined format.

  • DATETIME—The date, and time, in a predefined format.

  • TIMESTAMP—This is an automatic stamp on a row that gives the last operation.

  • TEXT—Up to 65,000 characters, doesn’t need to be explicitly sized.

  • VARCHAR—Up to 65,000 characters, must be sized in column description.

  • BOOLEAN—True or false.

There are more, but they’re not really particularly useful. Wherever possible, one should try to use native data types, that is, DATE and not a date string, or the BOOLEAN data type and not an INTEGER with either “0” or “1” in it. This is because they will take less space in the database schema when using native types.

This might not matter for one row, but Web databases are often designed to have many rows, and it will increase your data storage requirements drastically if you do not try to use only the space you actually need.

Finally, when a column is created with a data type, another attribute can be set, called the key, which is a unique value that identifies that row. No two rows can have the same key in that field. The key is often used when joining two tables, rather like the Customer ID field in the example from Figure 7.2.

Extracting Data

The basic form for the SELECT query is as follows:

SELECT <columns> FROM <table names> WHERE <condition>
                 ORDER BY <column>;

This code simply states that you want the database system to select some rows, from a given table, where a certain condition is met, ordering by one of the columns. Everything except the columns and table name is optional. So, the following is perfectly acceptable:

SELECT * FROM Customer;

This query, using the design from Figure 7.2, would return all the rows from the Customer table, and all the columns therein. If you wanted only the Name column, you would use a query such as:

SELECT Name FROM Customer;

This query could then be enriched to select only those customers whose names end with o:

SELECT Name FROM Customer WHERE Name = "%o";

Furthermore, you might like to order these by the Name column:

SELECT Name FROM Customer WHERE Name = "%o" ORDER BY Name;

You can also join this result with a result from a similar table, using the primary key as the reference point, and using a NATURAL JOIN. If you assume that the Customer table has Customer_ID as its primary key, and that Order contains a Customer_ID column, you can issue the following SQL query:

SELECT * FROM Customer NATURAL JOIN Orders;

This query has the effect of showing only those customers who have an order (that is, the Customer_ID columns can be joined). The result displays only one Customer_ID column for each row.

The NATURAL JOIN assumes that there is a key available to join the two tables on—this key then gives the reference that allows the rows to be connected using a single reference point: the CUSTOMER_ID in this case.

There can be multiple joins performed, to access many tables, as long as they can all be held together by a series of key values. If you do not know how the data will look when it is retrieved, you can even specify the column to perform the join on:

SELECT * FROM Customer INNER JOIN Orders USING (Customer_ID);

All of these queries can be refined somewhat by the use of aliases and the ON keyword. For example, the previous query can be written using an INNER JOIN rather than a NATURAL JOIN to create the same effect.

So, the INNER JOIN lets you specify the actual column (which might not be the primary key) upon which to join the two tables. Re-writing the previous NATRUAL JOIN query to use INNER JOIN might result in the following:

SELECT Name, c.Customer_ID, o.Quantity FROM Customer c
    INNER JOIN Orders o ON o.Customer_ID = c.Customer_ID;

The ON clause is necessary because, unlike a NATURAL JOIN, the primary key is not necessarily taken into account. Without it, the query returns one row for each o.Quantity per customer. This is slightly illogical, but to get the relationship right, you either use an ON clause or specify the orders first.

Generally speaking, it is best to stick with the simple NATURAL JOIN and INNER JOIN. . .USING variations.

Finally, the GROUP BY clause can be useful in grouping data together. You’ll see more of this in a moment, but assuming you wanted a list of all the customers who had made an order, you would do the following:

SELECT Customer_ID FROM Orders GROUP BY Customer_ID;

Deriving Data

It is also helpful to use SQL to derive data and make calculations while carrying out queries. You can, for example, count the number of entities, sum fields, and do other calculations that come back in the result table.

So, if you wanted to count the number of customers who had made an order of more than 100 units, you would perform the following query:

SELECT COUNT(*) FROM Orders WHERE Quantity > 100;

As you would expect, the WHERE clause can take all manner of different comparison operations, including all the mathematical operators and the LIKE clause for strings. These are all well documented if you get stuck, but as long as you stick to the following conventions, you should be capable of performing the majority of the queries that you will ever come across:

<, >, =, != >= <= for numbers

%, _ for strings

Calculations can also be performed on the values of returned data. If you knew the price of a given item that had been ordered, for example, you could also work out the total value of the customers’ orders:

SELECT Customer_ID, Quantity * 2.50 FROM Orders;

This query assumes that all items have the same price, 2.50, which is rarely the case. So, it would be useful instead to be able to look up the price and use that number to multiply by the Quantity to get to the final value.

The whole query is as follows:

SELECT Customer_ID, SUM(Price * Quantity) FROM Orders
    INNER JOIN Products USING (Product_Ref) GROUP BY Customer_ID;

This is the most complex type of query that you are going to look at, so it is worth taking some time to see what it is doing. Clearly, the Customer_ID comes from the Order table, and the Price comes from the Products table.

The INNER JOIN has been specified using the Product_Ref as the pivot point for the join operation, so only the products for which there has been an order for each row of the Products table are returned. So, the first pass sees a table that contains, for each row, the Customer_ID and a Price * Quantity for each class of items that has been ordered.

However, you have also specified that you want to see the SUM of these, grouped by the Customer_ID. This causes each set of rows to be summed and grouped by the customer who ordered them. If you had chosen to group by the Product_Ref, you would have the total value ordered by product instead.

Subqueries

Sometimes it is useful to break queries down into subqueries. A subquery generally results in a single column of data being returned. For example, the following illustrates a subquery in action:

SELECT Product_Ref, COUNT(Product_Ref) FROM Order
  WHERE Cust_ID =
    (SELECT DISTINCT Cust_ID FROM Order ORDER BY Cust_ID)
             GROUP BY Product_Ref;

This query might not do anything terribly useful, but it uses a subquery to count the number of people with the same name. Of course, it can also be done with a simple WHERE clause.

It is important to note that MySQL does not allow you to mix tables and subqueries, so the following is not allowed and must be solved by the use of an INNER JOIN:

SELECT cName FROM Orders AS o
  WHERE cName =
    (SELECT Name FROM Customer WHERE Name = o.Name)
             GROUP BY cName;

This query will not work, because cName is not present in the Orders table, even though it has been defined in the WHERE subquery. The INNER JOIN looks like this:

SELECT Customer_ID, Name FROM Orders INNER JOIN Customer
   USING (Customer_ID) GROUP BY Customer_ID;

Logical WHERE Statements

It is also possible to combine conditions in the WHERE clause of the SELECT statement to filter data further. For example, a query to select only those customers who have ordered more than 100 units of a specific Product_Ref could look like:

SELECT Customer_ID FROM Orders WHERE Product_Ref = 456
                    AND Quantity > 100;

All of the now-familiar Boolean statements can be used for these kinds of queries, and parentheses can be used to separate subclauses, just like in other programming languages (JavaScript and PHP included). This allows you to build up very complex selection criteria and only return the data that you are interested in, which is very useful because it helps the processing of the data as well as reducing the memory overhead associated with the result set.

PHP and MySQL

The previous discussion represents a basic subset of the SQL SELECT statement, but, combined with programming, it is more than enough for basic tasks. As you become familiar with Web programming in general and SQL in particular, you can expand your use of these queries beyond this as projects require.

The easiest way to use SQL in Web programming is to combine it with PHP logic and keep the queries as simple as possible. When a small number of rows are returned, the efficiency gain is not much greater when performing advanced SQL queries and it is much easier to understand a comparison of two result sets in PHP than it is to create a complex inner join.

So, the advice here is to understand how to derive data from a table and how to extract data from it, but to leverage the power of PHP and MySQL together in creating Web applications rather than getting caught up trying to understand every intricacy of the SQL language.

With that in mind, the last piece of the puzzle is how to insert data into the database, and how to update it once it is there.

Inserting, Modifying, and Deleting Data

There are three basic commands for manipulating data in the tables, usually in the context of Web programming, and they will be issued from PHP in order to change the data following a user action of some kind.

This section discusses the INSERT, UPDATE, and DELETE FROM commands, which are used to put data into a table, change data that has been placed in a table, and delete data from it, respectively. To insert data, you use the INSERT command, as follows:

INSERT INTO <table> VALUES (<value>, ...);

The ellipses indicate that there are multiple value entries possible. In fact, there must be a single value, of the correct type (that is, in quotes for strings), for each column of data. Named columns can be used instead, but to do this you must set each column explicitly (unless the columns contain automatic data).

INSERT INTO <table> SET <column>=<value> or DEFAULT, ...;

The DEFAULT keyword simply assigns the default value that has been specified in the definition of the table. When using the VALUES clause rather than the SET clause, you can also use DEFAULT or simply a space to indicate that there is no value for this entry. However, that cannot be performed if column has been set to NOT NULL (meaning it must always have a value set) in the table-creation phase.

You’ll see examples of automatic values—DEFAULT and NULL/NOT NULL—in the section detailing the setup of tables using phpMyAdmin.

Once data has been inserted, it can be retrieved, updated, or deleted.

In order to change data with either the UPATE or DELETE FROM commands, it is usually a good idea to issue a SELECT command from PHP in order to determine that the query can be satisfied. The reason for doing this is twofold—to make sure that you are updating the correct data, and to make sure that the selection criteria are well formed.

This might seem unnecessary, and it is quite possible to omit this step entirely, as long as you are confident about the query that is going to be issued. The UPDATE command is in two parts: the items to be updated and the new values, and the WHERE clause that selects the rows that are going to be operated on.

Multiple rows can be updated at once, so some care must be taken to make sure that the result is what you think it is going to be. The general form for the UPDATE command is as follows:

UPDATE <table> SET <column> = <value> WHERE <query>;

So, if you wanted to increase all the prices in the Products table by 10% where they are less than 0.9, you would issue the command:

UPDATE Products SET Price = Price * 1.1 WHERE Price < 0.9;

(This example assumes, of course, that none of the prices are set to 0.)

The usual forms of WHERE clauses can all be used, and multiple columns can be set using a comma-separated list of column = value pairs.

You can delete data from the table using the DELETE FROM command. Similar to the UPDATE command, it requires a WHERE condition in order to select the rows to be deleted.

DELETE FROM <table > WHERE <condition>;

So, to delete all the products from a table where the price is less than 1.00, you could issue the following query:

DELETE FROM Products WHERE Price < 1.00;

These are the simplest forms for the three modification commands and are more than enough to get you started. As you examine and encounter more complex situations, you’ll naturally expand your knowledge through experimentation and frequent trips to the user guide. For those needing only simple table management, this discussion will suffice.

ROLLBACK and COMMIT

Databases are transaction-based. This means that until the changes are committed to the database, they can be undone. The ROLLBACK command will undo the changes in some circumstances, but the documentation should be checked to see exactly what those are. Generally, it is accepted that changes are committed upon logout from the database system.

Until that point, they can be undone by using the ROLLBACK command. If you are sure of the changes and want to commit them immediately, you should issue the COMMIT command to be sure that the changes take hold immediately. This can be quite important for high traffic sites, but check your documentation before you rely on the COMMIT and ROLLBACK mechanisms.

Creating and Altering Tables

For those not using phpMyAdmin, or for those who are curious about the SQL that it generates in order to manage tables, here is a brief précis of the table-management system. It centers around two commands—CREATE TABLE and ALTER TABLE—with which it is possible to manage the columns in individual tables.

The first of these commands, CREATE TABLE, takes the following generic form:

CREATE TABLE <name> (<column> <data type>, . . .);

The ellipsis indicates that there may be many column, data type pairs, all contained in parentheses. So, to create the Customer table from the previous examples, you might execute a query such as this:

CREATE TABLE Customer (Name VARCHAR(255), Customer_ID INTEGER);

Any of the valid data types can be used, but you should avoid using reserved words as table or column names, because this will confuse the SQL parser. For example, naming a table “Order” rather than “Orders” is not a good idea, because the parser might confuse the table name with the ORDER BY keyword.

Apart from this restriction, there are very few restrictions on names of entities beyond the usual constraints that they should not contain spaces or start with a number. Respect these rules and you will have no problem managing the tables.

You read about keys in the section that detailed how to use the JOIN function. To create a key on a table to use as a pivot for a JOIN,youusethe ALTER TABLE command:

ALTER TABLE <name> ADD CONSTRAINT <name>
                   PRIMARY KEY (<column>);

Other common uses for ALTER TABLE are to add and remove columns. So, if you wanted to add a new column to the Customer table, you would use:

ALTER TABLE Customer ADD COLUMN Address VARCHAR(255);

You could also add more than one column by using a set of column, data type pairs in parentheses. Similarly, you can delete the newly added column by using the DROP COLUMN clause with the ALTER TABLE command, as follows:

ALTER TABLE Customer DROP COLUMN Address;

Finally, you can use DROP with two other database entities—tables and constraints. Again, using the Customer table as an example, you can remove the primary key (assuming that you called it primary_key):

ALTER TABLE Customer DROP CONSTRAINT primary_key;

Once you are satisfied that you no longer have a use for the Customer table, you can then delete it with the DROP TABLE command:

DROP TABLE <table name>;

As noted, it is both more common, and easier, to manage the tables using phpMyAdmin; however, it is worth knowing the general layout of these commands so that you can understand the SQL that phpMyAdmin generates. In this way, if you need to create temporary tables programmatically, the commands will be familiar.

User Management

Finally, a note about user management. It is possible to restrict certain commands to certain users. This is a good idea if, for example, there is a database administrator user and a Webuser. The Webuser performs only basic queries on the database and should probably not have enough permissions to create and drop tables, for example.

The GRANT command performs this assignment of permissions to a given user, and the generic format for it is as follows:

GRANT <commands> ON <table> TO <user>;

Using the example of a Webuser, you might decide to let such a user only select from a specific table and possibly perform updates with the following command:

GRANT select, update ON Customers TO Webuser;

Of course, this command assumes that the Webuser exists in the database system. Although this can be done from within the SQL language, you’re going to see how to set all this up using phpMyAdmin.

Databases for Websites

This section is all about setting up and using databases on Websites using MySQL and phpMyAdmin. It is not designed to be a general guide to databases but concentrates on getting up and running as quickly as possible with a Web host that offers the required tools.

The section covers setting up and administering a database, as well as connecting to it through PHP to run queries as a server side script. This last point is important, because it is highly likely that a Web database exists to provide an interface to the Web application.

For example, in a content-management system, all the content is stored as text in a database and extracted only when it must be rendered. This makes CMSs flexible and dynamic, because they can display any information from the database in a way that’s coherent with respect to what the end user has requested.

So, if the user wants to see all the blog posts with “cat” in the title, a query on the database will return all the relevant rows. These are returned to PHP as a two-dimensional array of rows and columns that can then be manipulated programmatically.

Data can be inserted and updated, too, by building query strings that contain variable values. In fact, any query that can be performed on a MySQL database can be done through PHP, and it is this feature that allows phpMyAdmin to perform all manner of database operations via a Web interface.

First, however, you need to set up the database with the cPanel MySQL Databases tool.

Typical Database Setup

The following steps should usually be performed with the MySQL-provided Databases tool, which installs under cPanel in the Databases section. Whether they are performed using a wizard style interface or manually, the steps are the same.

Note that there are some restrictions on the names that can be used for databases. These are essentially the same as for any other kind of named entity in programming—they cannot contain special characters (beyond the _ character) or start with numbers, for example. They also cannot contain space characters.

The first step is to create the database with MySQL Databases. This is shown in Figure 7.5.

MySQL database wizard step 1 (create your database)

Figure 7.5. MySQL database wizard step 1 (create your database)

As seen in Figure 7.5, the only option is for the user to enter the database name. The next step is to create the users in the system who are going to be attached to the various databases. This can be seen in Figure 7.6. As you will see, the examples here use the name jewpexamples, to which the system has added the prefix jewp_, this being the username to which the database is attached.

MySQL database wizard step 2 (create users)

Figure 7.6. MySQL database wizard step 2 (create users)

As you can see in Figure 7.6, each user is created by assigning the user a username and password. Once the users have been created, they can be added to the database that has just been created. The users are created in the system with privileges that are set from the start and applied to the database being built.

This example uses a webuser user and password, which is the generic, non-administrative user. For this reason, in the next step, you only select a basic collection of privileges.

The users can pick and choose the privileges that they should have based on the operations that each user is expected to carry out, and then add them with these privileges to the database. This is shown in Figure 7.7.

MySQL database wizard step 3 (add users)

Figure 7.7. MySQL database wizard step 3 (add users)

The privileges that are selected for the Webuser, for example, would be SELECT and INSERT and perhaps UPDATE and DELETE. Once the privileges have been selected, the user clicks on Add, and the user is added to the database domain. Users can be added to multiple databases through the phpMyAdmin interface, if that is required.

The last screen, which confirms the operations, is shown in Figure 7.8.

MySQL database wizard step 4 (finished)

Figure 7.8. MySQL database wizard step 4 (finished)

From here, the user can add another user to the database, a Webadmin user, for example. Such users should have all possible privileges, so long as they are never used in a PHP script. This is for security reasons—the master user should be accessible only from the cPanel interface, which is relatively secure.

If the cPanel implementation does not seem to have the MySQL database wizard, or if the process is too cumbersome, there is another possibility. The MySQL database form, shown in Figure 7.9, allows the administration to be completed in fewer steps. This interface will be located in the cPanel Databases section along with the other database administration packages provided by the Web host.

MySQL database form (adding a user to the database)

Figure 7.9. MySQL database form (adding a user to the database)

There are still two steps—create the user and add the user to the database—but the form in Figure 7.9 puts all the options in one place and can be easier for advanced users.

One point to be wary of is the names you chose for your users. Most hosts will add the domain or subdomain to the start of the chosen username, to keep it unique. This means that if you type webuser, the system will make that jewp_webuser, if the subdomain is jewp.domain.com. The same usually applies to database names.

These naming conventions are not visible when making queries from phpMyAdmin or PHP scripts, but are added automatically at the system level. They do mean, however, that in the administration panels, the users need to be aware that the names are not the same as the ones that they typed in originally.

Database Tools: Using phpMyAdmin

Having followed the previous steps, you should now have a Web database installed on your host. The next phase is to start administering it, adding tables and so forth, before it can actually be interacted with using custom PHP scripts and Web pages.

The tool used here to do this is phpMyAdmin. You can also write your own in PHP, but considering that the tool exists, it is better to make use of it. Because any query can be performed from PHP, some packages make use of this to do their own database administration.

When you first enter the phpMyAdmin system, the page is broken into two panes. In phpMyAdmin, there are several levels of administration, and the menus in the main frame are context-sensitive.

When phpMyAdmin first opens, it shows the first level of administration. Here, no databases are selected, so operations are performed on the whole cluster. This is shown in Figure 7.10.

phpMyAdmin main page

Figure 7.10. phpMyAdmin main page

The second level allows administration on the database level. Here, you can create tables in the database. On the left side of Figure 7.10, note that the database you created previously appears (jewp_examples) and it is currently empty—the number in brackets indicates zero tables.

You can select the database by clicking the name on the left side, which will then change the context of the administration system to the database level. From here, you can create a table.

Creating a Table

When an empty database first opens, the screen looks like one shown in Figure 7.11. Note that there are additional menu items across the top of the main window frame. These menu items indicate a number of operations that can be performed on the database and the tables contained within it.

Create table page

Figure 7.11. Create table page

One of the most common uses for a Web database is to store a Weblog or blog. So, this section shows how to design a very simple Weblog storage system that you can use to store your thoughts about the world. At this point, you need to design a table to contain the data that the blog is to contain.

Blogs usually have space for the date, a title, some kind of summary that can be submitted to RSS clients and search engines, the actual content of the blog, and a unique ID to identify the blog entry. Each of these will have different data types; notably, the blog content should be a TEXT object (with no size limit), and, for efficiency, you should limit the size of the title and summary.

This might lead to a set of fields, as follows:

date

Stores a DATE data type

title

VARCHAR; could be limited to 30 width

summary

VARCHAR; limited to 255 width

content

TEXT object

id

BIGINT; assuming that you will do a lot of writing!

The first step is to create the table name, blog_entry, and tell phpMyAdmin how many fields it should have, as shown in Figure 7.11.

You can fill in the table name and the number of fields and then click the Go button. This will create an empty table and open a page where you can set the field names and attributes. Figure 7.12 might look intimidating, but it is really quite straightforward.

Adding column attributes (1 of 2)

Figure 7.12. Adding column attributes (1 of 2)

Of the first set of attributes, seen in Figure 7.12, you are only going to concern yourself with the name, type, and length. From top to bottom, you can see some familiar data types and length information. In this case, all that the lengths do is limit the field title to 30 characters and the summary field to 255 characters.

In the second half of the screen, shown in Figure 7.13, you need only be concerned with the Null and Extra columns and the set of radio buttons on the right side.

Adding column attributes (2 of 2)

Figure 7.13. Adding column attributes (2 of 2)

Again, working from top to bottom in Figure 7.13, you can see the “not null” entry for all the fields. This means that when these fields are inserted into the database, they must have a value associated with them. This is true for all except the last field.

The last field has a auto_increment entry in the Extra field. This value means that, for this field, MySQL will automatically assign an incrementing number as data. Note here that the radio button indicating that the entry must be Unique is selected. This is normal and a side effect of using auto_increment.

The other key radio button selected is on the first field, which is the date and is identified as the primary key. So, if you were to perform a join on this table with another that contained news events by date, you would be able to produce a list of blog entries for those dated news events using the date field as a pivot.

After all the fields are filled out and the attributes set, you can click the Go button, and the SQL query will be built by phpMyAdmin and then sent to MySQL. The result of this is seen in Figure 7.14.

Creating a table: the SQL statement and result

Figure 7.14. Creating a table: the SQL statement and result

Any time that the Structure menu option is clicked, the fields will be listed as they are at the bottom of the screen in Figure 7.14. The top half shows the SQL statements that were used to create the table. The same layout is used for the results page of any administration tasks that are executed via SQL statement.

Once the table has been created, it appears underneath the database name in the left menu and can be selected just by clicking on it. All the tables that belong to a database are listed in this way.

Now that the table exists, you can populate it with some initial data (for test purposes) by using the Insert tool available from the menu strip along the top of the main window frame.

Inserting Data

Inserting data through the phpMyAdmin interface is not something that is terribly common. However, it can be useful to be able to insert data from time to time, and the interface provided is flexible and straightforward to use. The initial view is shown in Figure 7.15.

Insert row interface

Figure 7.15. Insert row interface

Note that there are two ways to enter data—via a function or by typing in the value. The date field, for example, has been populated using the NOW function, which will place the current date into the field.

The id field has been left blank, because you know from the table definition that it is an auto_increment field. This is where the Data Dictionary comes in useful, because there is no evidence from the interface that this field may be left blank even though the null column is empty, indicating that a value is required.

Clicking the Go button will submit the HTML form, and phpMyAdmin will construct and pass the query on to the MySQL database. The result is shown in Figure 7.16.

Inserting a row: the SQL statement and result

Figure 7.16. Inserting a row: the SQL statement and result

In Figure 7.16, note the information area at the top of the frame, which shows the status of the INSERT query. The query itself is expanded below that. The query should look familiar, except that the columns have all been individually named (which is a slightly different usage to the one previous). The VALUES list appears in parentheses after the INSERT INTO clause.

You can also use this method to construct sample queries that can be used as a base to create your own queries in PHP code. Having thus inserted some data, you can now use the Browse interface to look at the data.

Browsing Your Database

You access the Browse interface by clicking the Browse menu option. The initial view can be seen in Figure 7.17.

The phpMyAdmin Browse interface

Figure 7.17. The phpMyAdmin Browse interface

At this point, you only have one record in the database, and you know that it is there, so this is a fairly useless exercise. However, it illustrates one of the interfaces available in phpMyAdmin: the ability to quickly examine the contents of the database through a Web browser.

If the resulting information is insufficient (it is usually limited to 30 rows), keep in mind that it is possible to adjust the quantity and display options. The first two boxes in the input area (under the query expansion) allow the users to adjust which rows are shown.

If you wanted to view rows 31 to 60, you would enter 31 in the far-right box. This would then change the LIMIT clause in the query to LIMIT 31, 30 instead of 0, 30. You can also change the display mode and point at which the headers (column names) are repeated.

The query can be copied and pasted into the SQL box accessible through the SQL menu option and then run as a SQL query. This is an alternative way to look at data in the database, and also a good way to practice using SQL SELECT statements. You’re encouraged to play with the Browse and SQL tools until you’re familiar with them. When you’re prototyping your database, you’ll likely spend quite some time trying out queries and testing the database before you write the actual PHP code to access it.

Exporting/Importing Data

There are two useful tools that the phpMyAdmin collection offers for processing blocks of data. The first, Export, is useful when you want to extract the data into a variety of formats. This section doesn’t go into great detail, but it is useful to make a few comments about the interface.

You can access the Export tool by clicking the Export menu option; Figure 7.18 shows the initial form that’s displayed.

The phpMyAdmin export interface

Figure 7.18. The phpMyAdmin export interface

Figure 7.18 shows a list of export formats down the left side and a collection of options in the main frame. Most of these options relate to the default selection for the format of the export—SQL.

Choosing the SQL export format will create the data schema as a series of SQL statements that can be re-run on a different database platform to recreate the table. This is a useful form of quick backup. Other export formats include Excel, Word, and PDF.

Data can also be imported in a variety of formats, through the Import tool. This is accessed by clicking the Import menu option. The initial screen is shown in Figure 7.19.

The phpMyAdmin import interface

Figure 7.19. The phpMyAdmin import interface

The supported import formats are far poorer than the export formats, and are all text-based. The possibilities are as follows:

  • CSV. —As an export from a program such as Excel, for example

  • SQL. —As a series of SQL statements

The interface is self-explanatory, except for the middle box, which allows for interruptions. If you use this option, you must query the database to find out how much data was imported and then enter the number of rows to skip in the relevant box before the next import operation is attempted.

Operations

Finally, there are a number of special operations found under the Operations menu tab on the main frame. The initial selection of possibilities is shown in Figure 7.20. From here, you can copy, move, reorder, or alter the table in a variety of ways.

This screen is useful if you need to create a copy of the database for test or maintenance reasons. It is also handy if you need to rename the table for any reason. Beyond these specific issues, the Operations tool should be left alone.

phpMyAdmin operations main screen

Figure 7.20. phpMyAdmin operations main screen

Indexing, Backing Up, and Maintenance Issues

Before you look at connecting to the database, I need to mention a few points about maintaining databases. A database is a piece of living software in a living data space and will benefit from the occasional bit of maintenance.

Maintenance in this case just involves taking a backup, occasionally cleaning out the data storage area, and rebuilding the indexes. Again, phpMyAdmin makes this process incredibly straightforward.

Backing Up

The easiest way to do this is just to export the data from time to time, as you saw in the analysis of the Export tool. This is an essential part of the chores that you need to do as a Website owner—because if the data is ever lost, a large part of your business goes with it.

So, you should make a backup schedule and stick to it rigidly. You can refer back to the Export option in phpMyAdmin to remind yourself about the options that are available. If you are using cPanel to administer databases and the underlying Website, it’s likely that there is a backup option in the Databases section.

This is shown in Figure 7.21.

The cPanel backup main screen

Figure 7.21. The cPanel backup main screen

Now, the Web host should take some of the responsibility, and will hopefully have set up MySQL to prevent data loss. In addition, the Webhost should make file-level backups for you, but there is no sense, given the availability of an easy tool, in ignoring the possibility to make your own backups on a regular basis.

If you’re running your own database on your own server, there are quite a few additional things that you can do. For example, it is possible to have MySQL schedule its own backups. The set and forget feature that this offers makes it easy to make sure that the system is adequately protected.

Otherwise, a timely manual backup will just have to do.

Rebuilding Indexes

If you’re constantly removing data and adding new data, your tables will become un-optimized. In the worst case, queries will start to take a long time to execute. If this happens, the database can be optimized.

All that the optimization involves is periodic rebuilding of the index. Without going into too much detail, this will improve transaction speed and lookup efficiency as the SQL platform organizes its data in an optimized fashion.

Again, this is best done from the phpMyAdmin interface, the result of which is shown in Figure 7.22. If you look back at Figure 7.20, as part of the Operations menu option you will see a small section at the bottom called Table Maintenance.

The phpMyAdmin optimize screen showing operations

Figure 7.22. The phpMyAdmin optimize screen showing operations

Clicking the Optimize Table option causes the system to re-index the database and display the screen shown in Figure 7.22.

As you can see in Figure 7.22, this can also be done using a SQL query on the database. One way to do that is through the phpMyAdmin Query tool, for example. Be sure to select the table first; otherwise, the option does not appear in the menus because of the context-sensitivity of the menu system.

Connecting to Databases with PHP

Finally, with the database up and running, it is time to look at how you can interface with it from PHP. The following code represents the bare minimum that is needed while still being fairly robust. Luckily, PHP makes it very easy to integrate with MySQL, offering a whole collection of mysql_ functions that are installed as standard.

Once everything has been set up, it’s a good idea to test before doing any actual development. The code that is developed in the testing process, for making connections and tidying up, can be abstracted into an external PHP file that can be reused for the same host.

Before starting to create the code, you need to have your connection parameters at hand. This includes the username and password (remembering that the user-name will probably have the subdomain prefixed to it—that is, jewp_webuser)as well as the IP address of the server.

The first part of the conversation with MySQL is to create a connection over which to communicate.

Making the Connection

The PHP documentation calls the connection a link, and I’ve used this convention in the code examples. The best way to connect to a MySQL database is with the following stanza:

$link = mysql_connect('mysql_host', 'mysql_user', 'mysql_password')
    or die('Could not connect: ' . mysql_error());
echo 'Connected successfully';

There are a few things to note about this code. The first point to note is that this code introduced a new concept in PHP, the or keyword. All that this does is perform the equivalent of a test against the return value of the function, and is the equivalent of:

$link = mysql_connect('mysql_host', 'mysql_user', 'mysql_password'),

if ($link == null) {
   die ('Could not connect: ' . mysql_error());
}

The or keyword is just a little bit more convenient. It can only be used in this case because the die command, as you might remember from Chapter 6, stops the script immediately.

The next thing to note is that the example has hard-coded the connection parameters. If the file is to be included globally, this is not a problem, but it might be better to use global variables to store the connection parameters under certain circumstances.

The mysql_error function returns a text-printable description of the error from the database server. If the code is to be deployed in a public access environment, it might be a better approach to use the if construct so that contact details can be posted to allow the visitor to alert the development team that there is a problem.

If there are any issues relating to usernames or passwords, check with your Web host, which might not set up cPanel and MySQL according to the book. If all else fails, you can use your cPanel username and password to log in to the server, but be aware that this combination will probably have superuser status and be able to do any operation on the database.

One final point is that the mysql_host parameter can probably just be set to localhost for the vast majority of installations. If this fails, use the IP address of the server, which will usually work just as well.

Having created a link to the database host, you can now proceed to select a database, again remembering that it will probably have a name that has been made unique by adding the subdomain or domain to the start of it. The code to select a database is as follows:

mysql_select_db('my_database') or die('Could not select database'),

Again, note the or keyword, which tests the result of the mysql_select_db function that takes the name of the database that you want to connect to as its only parameter. This name will also probably be one that has been created by the system to be intentionally unique across multiple domains.

Once the link has been established and the database selected, the next step is to try querying for data and retrieving the results.

Querying for Data

This is a two-step process:

  1. Create the query string.

  2. Send the query string to the server and test the result.

There are a few things to note. First, the query string can be built using PHP variable substitution and string combination, so any user-entered data can be taken into account. Second, the success of the query does not have any connection to the data that it might return.

Having said that, here is the very simple code to perform the query operation:

// Create the query
$query = 'SELECT date, title, summary,
            content FROM blog_entry ORDER BY date';
// Send the query
$result = mysql_query($query, $link)
            or die('Query failed: ' . mysql_error());

I used a SELECT query here, but you could use any of the queries accepted by MySQL instead. The query is built in the $query variable and then sent to the database server using the mysql_query function. The result is tested and an appropriate error message printed out if it fails.

It really is that easy. Of course, in a live installation, you’ll want to do more elaborate testing and catching of errors, but this example will work and is more than enough to get started with.

Assuming that the query has been successful, you can access the result through the $result variable.

Moving Through Records

There are many ways to retrieve the result of a query. The easiest is just to use the built-in PHP array processing functionality to fetch the result from the database as an array. This can be achieved in a loop, as follows:

while ($row = mysql_fetch_array($result, MYSQL_ASSOC)) {
  // Do work here
}

The curious reader can go ahead and look up MYSQL_ASSOC in the PHP manual. The rest can just use it as is. The mysql_fetch_array function continues to return valid rows until there are none left. Each row that it returns is an array of values that can be accessed using the array operator, []:

$current_title = $row["title"];

This snippet assumes that you are operating on the blog_entry table created in the previous sections. If you have placed no limit on the SELECT statement, all the data will be returned, row by row, in the $row variable.

However, using the LIMIT keyword in the SELECT statement, you can page through the results, ensuring that the page is returned within an agreeable length of time. This will, of course, reduce the data processed by the while loop as well.

To implement this limit, you need to keep track of the current page and page length, and make sure that sure you pass the values through GET or POST requests. Assuming you are building a GET-based interface, this is really very easy.

First, the SQL statement becomes:

$query = 'SELECT date, title, summary, content FROM blog_entry
   ORDER BY date
   LIMIT ' . $page_length * $page_number . ',' . $page_length . ";

I have split the line across three lines for the sake of clarity. Assuming that the current page is 0 and that the page length is 10, this will create a query, thus:

SELECT date, title, summary, content
   FROM blog_entry ORDER BY date LIMIT 0, 10

The query is executed and the while loop processed as normal. It will print ten lines, if there are ten records (or more) in the database. If there are fewer than ten, then, naturally, it will print fewer than ten lines, but ten lines is the maximum.

Once it has finished, you need to allow for the paging to take place. The following code needs to be augmented with testing to check for the first and last page, but the general form for creating the URL to be sent to the server is as follows:

echo '<a href='http://mysite.com/record.php?p='
  . $page_number–1 . '>Prev</a>';
echo '||';
echo '<a href='http: //mysite.com/record.php?p='
   . $page_number + 1 . '>Next</a>';

Again, the longer lines have been split to improve readability. When the page is submitted, using one of the URLs printed by the code, the page number can be extracted from the GET variable:

$page_number = $_GET["p"];

This can then be used to populate the query. To prevent errors, it should be tested against negative numbers before the query is submitted. In addition, in order to establish the number of pages, a COUNT query can be executed against the database:

$query = 'SELECT COUNT(*) FROM blog_entry;'

When this is executed, a single row will be returned. In the array that can be accessed using the mysql_fetch_array function, the value of the count can be extracted with:

$total_records = $row[0]; // also could be $row['COUNT']

From this and the page size, the number of pages can be calculated and passed in the GET query from page to page. Thus, a fairly robust paging mechanism can be installed for the underlying data.

Finally, a Web application will also need to insert and update data in the database from time to time. Because this is no more difficult than selecting, you can probably guess how it is achieved.

Inserting and Updating Data

The PHP interface allows you to build an advanced query using values from forms and other HTML interfaces. If you assume that the PHP scripts are stored in a password-protected area of the site, you might be able to process a blog update form in the following manner:

$query = "INSERT INTO blog_entry VALUES
    (NOW(), '" . $_POST[blog_title] . "', '".
           $_POST[blog_summary] . "', '" .
           $_POST[blog_content] . "', NULL);

Once again, the line has been broken up to improve readability, but should ordinarily be a single line. Of course, you could also write these using separate statements in PHP as follows:

$query = "INSERT INTO blog_entry VALUES ";
$query .= "(NOW(), '";

$query .= $_POST[blog_title];
$query .= "', '";
$query .= $_POST[blog_summary] ;
$query .= "', '";
$query .= $_POST[blog_content] ;
$query .= "', NULL);

However, some people find this broken-up format more difficult to read and write. From this code, it should be self-explanatory what the query is going to do. However, remember two things—the result needs to be tested (to check that the query is valid) and it’s good practice to check the rows returned for additional status information.

By a similar token to the INSERT query, the UPDATE query is as follows:

$query = "UPDATE TABLE blog_entry SET content = ";
$query .= "' . $new_content . "' WHERE id = " . $blog_id;

Again, the query is executed and the result verified in exactly the same way as before. The same approach can be used for any database access that is required. As long as the query works in the SQL mode of phpMyAdmin, it will work from PHP.

Recap

Web databases are an integral part of knowing how to program for the Web because they manage most aspects of a modern Website. For this reason, you should work through the steps outlined in this chapter to create your own Web database, connect to it, and practice inserting and retrieving data.

In essence, this chapter pulls together all the parts of the Web programming paradigm—HTML, SQL, and PHP—and you should now be well prepared for the next chapter, where you look at how to create Web applications.

Most of the tools that you are going to look at in the next chapter use Web databases and will be installed with Web databases from the cPanel interface. They are also generally extendable with PHP and HTML (and styles in many cases). So knowing how these pieces interact is vital.

Using databases also makes your Website much easier to use, extend, and maintain, especially when it comes to adding content. Instead of having to build Web pages, you just put the new content in the database and it is automatically inserted where the PHP code extracts it—much like the blog_entry paging example.

There are several levels that you can insert the data to be displayed into the database—with a SQL query, through the insert row phpMyAdmin tool, or using a custom PHP front-end that is protected by username and password. The CMS that you’ll see in the next chapter uses a third option—testing the username and password against the database and maintaining a login session.

However, it is possible, as you’ve seen, to start building a simple CMS without needing to use a third-party Open Source solution such as the one presented in the next chapter. However, the reality is that it is often more efficient to take such a third-party application and customize it for your own use. This will require knowledge of the database that it is built upon.

This approach is also not always straightforward, but now that you know about databases and how they are manipulated, the next chapter will show you how they are deployed. Therefore, you’re now adequately prepared to make sense of an off-the-shelf CMS, with a little help from Chapter 8.

 

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.145.164.228