In this chapter, we introduce the MySQLdatabase management system (DBMS) and the SQL database query language for defining and manipulating databases. Using our case study, Hugh and Dave’s Online Wines, as a guide, we illustrate examples of how to use SQL. The techniques that we discuss are used to interact with a DBMS after a database has been designed. An introduction to relational modeling and design can be found in Appendix C, and a more comprehensive introduction to MySQL and SQL can be found in many of the resources that are listed in Appendix E.
In this chapter, we cover the following topics:
A short introduction to relational databases and relational modeling
A quick start guide to the winestore database and its full entity-relationship model
The MySQL command interpreter and the basic features of MySQL
Using SQL to create and modify databases, tables, and indexes
Using SQL to insert, delete, and update data
The SQL SELECT
statement for querying, with
examples of simple and advanced queries
Functions and operators in SQL and MySQL
Advanced features, including managing indexes and keys, tuning the MySQL DBMS, security, and the limitations of MySQL
We assume that you have already installed MySQL. If not, the guide in Appendix A will help you. Chapter 6 covers other selected advanced database topics that arise when writing to databases, such as supporting multiple users, transactions, and locking in MySQL. Complete examples of SQL queries and MySQL in use in a web database application can be found in Chapter 10 to Chapter 13.
The field of databases has its own terminology. Terms such as database, table, attribute, row, primary key, and relational model have specific meanings and are used throughout this chapter. In this section, we present an example of a simple database to introduce the basic components of relational databases, and we list and define selected terms used in the chapter. More detail can be found in Appendix C.
An example relational database is shown in Figure 3-1. This database stores data about wineries and the wine regions they are located in. A relational database manages data in tables, and there are two tables in this example: a winery table that manages wineries, and a region table that manages information about wine regions.
Databases are managed by a
relational database
management system (RDBMS). An RDBMS supports a database language to
create and delete databases and to manage and search data. The
database language used in almost all DBMSs is SQL, a set of
statements that define and manipulate data. After creating a
database, the most common SQL statements used are
INSERT
, UPDATE
,
DELETE
, and SELECT
, which add,
change, remove, and search data in a database, respectively.
A database table may have multiple columns, or
attributes, each of which has a name. For
example, the winery table in Figure 3-1 has four attributes, winery
ID
, winery
name
, address
, and
region
ID
. A table contains the
data as rows or records, and a row contains attribute values. The
winery table has five rows, one for each winery
managed by the database, and each row has a set of values. For
example, the first winery has a winery
ID
value of 1, the winery
name
value Moss Brothers, and an
address
of Smith Rd., and is situated in the
region
ID
numbered 3. Region 3
is a row in the region table and is Margaret
River in Western Australia.
The relationship between wineries and regions is maintained by
assigning a region
ID
to each
winery row. Managing relationships in this way is fundamental to
relational database technology, and different types of relationship
can be maintained. In this example, more than one winery can be
situated in a region—three wineries in the example are situated
in the Barossa Valley—but a winery can be situated in only one
region.
Attributes have data types. For example, in the
winery table, the winery
ID
is an integer, the winery
name
and address
are strings,
and the region
ID
is an
integer. Data types are assigned when a database is designed.
Tables usually have a primary key, which is one
or more values that uniquely identify each row in a table. The
primary key of the winery table is
winery
ID
, and the primary key
of the region table is region
ID
. Primary keys are usually indexed to provide
fast access to rows when they are searched by the primary key value.
For example, an index is used to find the details of the region row
that matches a given region
ID
in a winery table row.
Figure 3-2 shows the example database modeled using entity-relationship (ER) modeling. The winery and region tables or entities are shown as rectangles. Each entity has attributes, and the primary key is shown underlined. The relationship between the tables is shown as a diamond that connects the two tables, and in this example the relationship is annotated with an M at the winery-end of the relationship. The M indicates that there are potentially many winery rows associated with each region. Because the relationship isn’t annotated at the other end, this means that there is only one region associated with each winery. ER modeling is discussed in more detail in Appendix C.
The part of a database that stores the data. A table has columns or attributes, and the data stored in rows.
The columns in a table. All rows in table entities have the same
attributes. For example, a customer table might have the attributes
name
, address
, and
city
. Each attribute has a data type such as
string, integer, or date.
The data entries in a table. Rows contain values for each attribute. For example, a row in a customer table might contain the values “Matthew Richardson,” “Punt Road,” and “Richmond.” Rows are also known as records.
A model that uses tables to store data and manage the relationship between tables.
A software system that manages data in a database and is based on the relational model. DBMSs have several components described in detail in Chapter 1.
A query language that interacts with a DBMS. SQL is a set of statements to manage databases, tables, and data.
Restrictions or limitations on tables and attributes. For example, a wine can be produced only by one winery, an order for wine can’t exist if it isn’t associated with a customer, having a name attribute could be mandatory for a customer.
One or more attributes that contain values that uniquely identify
each row. For example, a customer table might have the primary key of
cust ID
. The cust ID
attribute
is then assigned a unique value for each customer. A
primary key is
a constraint of most tables.
A data structure used for fast access to rows in a table. An index is usually built for the primary key of each table and can then be used to quickly find a particular row. Indexes are also defined and built for other attributes when those attributes are frequently used in queries.
A technique used to describe the real-world data in terms of entities, attributes, and relationships. This is discussed in Appendix C.
A correctly designed database that is created from an ER model. There are different types or levels of normalization, and a third-normal form database is generally regarded as being an acceptably designed relational database. We discuss normalization in Appendix C.
3.15.14.98