CHAPTER 4

image

Working with the Relational Model and SQLite

You’ve seen how to create a table, add rows to it, insert data into it, and then retrieve the data. Granted, there’s not much data involved so far, but these are the basics of database management. Along the way you’ve seen that SQLite (like many other libraries and data managers) is typically embedded in a framework, language, class, or database management system (DBMS). As noted, today, most DBMSs are relational database management systems (RDBMSs). They are based on the relational model first proposed in 1970 by E. F. Codd, and they typically use SQL or a derivative (like SQLite!) to create, manage, and query the relational database.

You’ve seen how to use one of the many graphical editors for SQLite (you can find them on the Web searching for “SQLite editor” and perhaps including your operating system in the search). You’ve also seen how to use basic SQLite syntax directly to create tables and work with their data. This chapter adds another tool to the mix: sqlite3 which is a command-line tool distributed as part of SQLite.

Image Note  Most databases today use the relational model, so DBMS and RDBMS are more or less interchangeable terms. This chapter lets you see what is not interchangeable: it is only about relational databases (like SQLite). A relational database is a database in which two tables can be related to one another. It is the tables that are related, and the relationship is created based on data in each of the related tables. Thus, although in most of this book DBMS is used, in this chapter, because the topic is relational data, we use the term RDBMS. Relationships connect two tables, so in order to talk about relationships we need to start with two tables.

Relationships can become very complex (and very powerful) as they are built. For that reason, some people shy away from them, but there is no reason to be concerned. This chapter provides an overview of how relationships work, why you should use them, and how easy it is to create them. Relationships usually involve two or more tables, so this chapter begins by showing you how to create two tables that are logically related. Then you’ll see how to use SQLite to actually relate them. (A relationship can actually be created from a single table back to itself. That is called a self-join.)

Building the Users Table

Consider the case of keeping score in a multiuser game. You can build on the basics from Chapter 3 to create a table like SimpleTable. In this case, the table is for users. From a data management point of view, there’s little difference between a table with names and birthplaces and a table with names and e-mail addresses. (The main difference is that the second table will be more useful in tracking game scores, which is what this example is all about.)

There’s another difference that you’ll see when you look at the table as shown in Table 4-1 and compare it to SimpleTable (Table 3-1 in Chapter 3). The explicit primary key (column PK) is missing. This table uses the SQLite built-in mechanism for creating a primary key with a unique value for each row. Keep that thought in mind, because we’ll return to it shortly.

Image Note  As noted previously, capitalization doesn’t matter in SQLite except within parentheses. By convention, SQL keywords are capitalized. In this book, names of tables and columns are typically capitalized in the text; in the sample code they may or may not be to reinforce the fact that it doesn’t matter to SQLite. In practice, it’s a very good idea to settle on a convention to use. That’s because many languages today do recognize the difference between upper- and lowercase letters. Thus, when you switch between writing SQLite code where UseR and user and USER are the same to Swift, Objective-C, Python, C, and others where case matters, you may inadvertently introduce errors for yourself or others by not adhering to a standard. (In addition, standard rules for naming and capitalization make your code easier to read and maintain in the future.)

This is a good start, and you have already seen the code to create such a table in Chapter 3. With the revised column names and data, here’s how you can create and populate the table.

sqlite> create table users (
   ...> Name char (128) not null,
   ...> email char (128)
   ...> );

sqlite> insert into users (Name, email) VALUES ("Rex", "[email protected]");
sqlite> insert into users (Name, email) VALUES ("Anni", "[email protected]");
sqlite> insert into users (Name, email) VALUES ("Toby", "[email protected]");

sqlite> select * from users;
Rex|[email protected]
Anni|[email protected]
Toby|[email protected]

sqlite>

Now you need the related table which will keep track of scores.

Building the Scores Table

Building the Scores table is basically the same code. For the sake of simplicity, you can let each user start with a score of zero. With that assumption, here is the code.

sqlite> create table scores (
   ...> Name char(128),
   ...> score integer
   ...> );

sqlite> insert into scores (name, score) VALUES ("Rex", 0);
sqlite> insert into scores (name, score) VALUES ("Anni", 0);
sqlite> insert into scores (name, score) VALUES ("Toby", 0);

sqlite> select * from scores;
Rex|0
Anni|0
Toby|0

sqlite>

Table 4-2 shows the data in Scores at this point.

You can retrieve the data with syntax you’ve used previously. To get a name and e-mail address from Users, use the following code:

Table 4-2. Scores Table

Name

Score

Rex

0

Anni

0

Toby

0

sqlite> select name, email from users where name = "Anni";
Anni|[email protected]

sqlite>

You get the score from Scores in a similar way;

sqlite> select name, score from scores where name = "Anni";
Anni|0

sqlite>

Relating the Tables

You need to have a way to refer to two tables in a single SELECT statement (that’s where aliases come into play). You also need to actually implement the relationship, and that’s where primary keys come into play again. This section covers both topics.

Using Aliases to Identify Multiple Tables in a SELECT Statement

The challenge is to tables and aliases together. You need to select the data for Anni, but there’s a problem here: Both Scores and Users have columns called Name. If you use a condition such as where name = "Anni", SQLite won’t know which Name column you mean. You can distinguish between those two columns by identifying them in your SELECT statement. It doesn’t matter how you identify them, so you could write the following:

select a.name, email, score from users a, scores b
  where a.name = "Anni" and b.name = "Anni";

You qualify each table with an alias. In the FROM clause, the alias follows the table name, and in the SELECT and WHERE clauses, the alias precedes the table name and is separated from it by a period. Thus in the SELECT command and WHERE clause, you get

select a.name
where a.name = "Anni"

and in the FROM clause you get

FROM users a

Although you can use anything, you want as the alias; it can make your code easier to read if you use the actual name or an abbreviation for it as in the following:

select users.name, email, score from users, scores where users.name = "Anni" and scores.name = "Anni";

Names that are unambiguous (they only occur in one table) don’t need aliases.

This structure is very fragile as you will recognize if you’ve done much work with data and databases. Everything relies on the names being the same in both tables. If there’s a misspelling anywhere the data won’t match. And even without typos, what happens if a name changes (this happens for a variety of real-world reasons). You could wind up with a handful of Scores records that use the earlier name and another group of records that use the later name.

Before moving on to explore this issue, it’s worthwhile to take a look at what’s already been done by SQLite behind the scenes: it will help you solve the problems that are unfolding as names change.

Using the rowid Primary Key

In Chapter 3, primary keys that uniquely identify each row in a table were discussed. They’re very useful in retrieving data, particularly in cases where the data itself may be incorrect or may have changed from what you think it is, due to revisions or other circumstances. If you don’t specify a primary key, SQLite has gone ahead and created a unique key for each row in each table. It’s called rowid, and you can see it in any table for which you have not provided a primary key. With the Users table shown in this section, you can access the primary keys with the following code:

sqlite> select name, rowid from users;
Rex|1
Anni|2
Toby|3

sqlite>

This code for Scores shows its primary keys.

sqlite> select name, rowid from scores;
Rex|1
Anni|2
Toby|3

sqlite>

Changing a Name in One Table

In order to change data that’s been stored in a row of a table, you use the UPDATE command. Thus, if you want to change Toby to George in the database, you would use syntax like the following:

sqlite> UPDATE scores
   ...> SET name = "George"
   ...> WHERE name = ’Toby’;

You’re starting to see how SQL works. The basic clauses such as WHERE and FROM are used over and over again with the same syntax. The only thing that’s new in this syntax is the UPDATE command which takes a single table name. Everything else here you have already seen.

If you try to rerun the queries from the previous section, you’ll see the data from Users and Scores.

sqlite> select * from users;
Rex|[email protected]
Anni|[email protected]
Toby|[email protected]

sqlite> select * from scores;
Rex|0
Anni|0
George|0

sqlite>

If, in fact, George and Toby are two different people, it’s correct that things don’t work properly. But if it’s the case that Toby has decided to go by his middle name of George now, George and Toby are actually the same person. How do you move all of Toby’s data to George?

Retyping or using UPDATE is one way to handle this, but the better way is to use the built-in SQLite primary key as a foreign key.

Using a Foreign Key

If you have followed along, you’ll be able to see the rowid and name values in both tables using the following syntax. For Users, here is the syntax. Table 4-3 shows the data.

sqlite> select rowid, name from users;
1|Rex
2|Anni
3|Toby

Table 4-3. Users Table

rowid

Name

E-mail

1

Rex

[email protected]

2

Anni

[email protected]

3

Toby

[email protected]

For Scores, this is the syntax:

sqlite> select rowid, name from scores;
1|Rex
2|Anni
3|George

sqlite>

Table 4-4 shows the scores.

Table 4-4. Scores Table

rowid

Name

Score

1

Rex

0

2

Anni

0

3

George

0

The solution is to use a foreign key. A foreign key is a value in one table that identifies a row in another table. In this case, you want rowid 3 in Users (Toby) to be used to match rowid 3 in Scores (George). The names are different (George and Toby), but you want that relationship to work to reflect the fact that they are two different values for the same person.

Image Note  The fact that rowid 3 is the value you want to relate from one file to another is happenstance based on the sequence in which these files have been constructed. Later in this chapter, you’ll see how those numbers can have different values, but for now, you will probably see the same values shown here.

The solution is to create a new column—it’s usually called FK or Foreign Key in the database world. Sometimes database designers give it a name that identifies its source such as scorekey or scoreid and userkey or userid. Thus, if you’re looking at the Scores table, you could see the primary key or rowid alongside a separate userkey. If you take the values of rowid in Users to be the starting point (since that’s probably the first table to be created), you then want to have a matching value for userid in Scores.

To add a column to an existing table, you use the ALTER TABLE command. You specify the name of the table to alter, the name of the column to add, and the type of that column. Here’s the syntax to use to create a new column called userid.

ALTER TABLE Scores ADD COLUMN userid integer;

To see the result of this command, use the .schema command in sqlite3. (Because it’s a sqlite3 command, it starts with a period and doesn’t require a semicolon at the end. The command takes one parameter which is the name of the table you want to examine. Here’s the result if you run it now. You’ll see that userid has been added.

sqlite> .schema scores
CREATE TABLE scores (Name char(128),
   score integer
   , userid integer);

sqlite>

Now you need to enter data for the userid column that you just added to Scores. The userid values in Scores should be the value of the corresponding record in Users. At the moment they’re the same, but you’ll soon change them.

Here is the syntax to enter data for rowid in Scores. Basically, this is the same way in which you changed the name from Toby to George.

sqlite> update scores set userid = 1 where name = ’Rex’;
sqlite> update scores set userid = 2 where name = ’Anni’;
sqlite> update scores set userid = 3 where name = ’George’;

For now, just verify that the data in Scores is as shown here (both in SQLite and in Table 4-5).

sqlite> select rowid, userid, name, score from Scores;
1|1|Rex|0
2|2|Anni|0
3|3|George|0
sqlite>

Table 4-5. Scores Table with userid Added

Tab5

The names in this table are no longer relevant because the relationship between Scores and Users is based on rowid in Users matching userid in Scores. In SQLite, the ALTER TABLE command does not let you drop a column: The recommended workaround is to drop the table and create a new one (more appropriately in many cases, export the data, import it to a new table, and then drop the old table). So, for now you can just ignore the Name column in Scores. From your perspective, Scores can look like Table 4-6 (even though there’s a no-longer-needed column in the actual table).

Table 4-6. Scores Table with userid Added and Name Ignored

rowid

userid

Score

1

1

0

2

2

0

3

3

0

Joining the Tables

The simplest way to get the data you want is to use a WHERE clause to specify that you want data where users.rowid matches scores.userid. (Remember that SQLite automatically creates users.rowid when you don’t provide a primary key, but you must enter the appropriate values of users.rowid into the scores.userid column to make the relationship work.

If you have done that (and if you’ve followed the steps in this chapter you have done so), the following code will give you what you want:

sqlite> select users.name, scores.score from users, scores where users.rowid = scores.userid;
Rex|0
Anni|0
Toby|0

sqlite>

There are several important points to notice here:

  • The names come from Users; the Name field in Scores is no longer used. For that reason, even though you still may have George as a name in Scores, userid 3, which started out as Toby, is shown as George (assuming it’s the same person with just a name change that is correct).
  • Only the matching records are shown. Add some new records to Users or Scores: they won’t show up unless the rowid in Users matches the userid in Scores.
  • If a record in Scores has no userid, it won’t show up.

Summary

You can build on this general structure, just as is, by adding more data. Experiment with combinations of new records, and matching and nonmatching ids for users and scores. Add a Scores record for an existing user so that a person has two records but with the correct names from Names.

In Chapter 5 you’ll explore more ways to enhance SELECT statements.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.133.152.159