Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

Chapter 14. Accessing Databases

Just about every large enterprise system uses a database for storing data. For example, Amazon.com, the online retailer, needs a database to store information on each product for sale. For Python to prove capable of handling these types of enterprise applications, the language must be able to access databases.

Luckily, Python provides a database API (Application Programming Interface — how you program for the database), which is a generic API that enables you to access most databases, in spite of the databases' different native APIs. The database, or DB, API doesn't define all aspects of working with databases, so some minor differences exist. For the most part, though, you can access databases such as Oracle or MySQL from your Python scripts without worrying too much about the details of the specific databases.

Having a generic database API is very useful because you may need to switch databases or have your application work with multiple databases, and you won't want to recode major parts of your program to allow this. Normally, you can do all of this in Python without a lot of programming changes.

Even if you aren't writing the next amazon.com online site, databases provide a convenient means to persist data for longer than the program is running (so that you don't lose the data that a user has entered if you want to restart your program), query for items, and modify your data in a safe manner.

This chapter covers the two main database systems supported by Python: dbm persistent dictionaries and relational databases with the DB API. In addition, this chapter describes how to set up a database, in case you don't have one handy.

In this chapter you learn:

Using the dbm libraries to create persistent dictionaries
About relational databases
Setting up the Sqlite database
Setting up the MySQL database
Working with the Python DB API
Creating connections
Accessing data with cursors
Connecting to databases
Querying and modifying data
Working with transactions
Handling errors
Using other database tools

In many cases, you don't require a full-blown relational database. In such cases, creating a persistent dictionary using dbm files is enough.

Working with DBM Persistent Dictionaries

A persistent dictionary acts exactly like you'd expect. You can store name/value pairs in the dictionary, which are saved to a disk, and so their data will endure between various times that your program is run. So if you save data to a dictionary that's backed by a dbm, the next time you start your program you can read the value stored under a given key again, once you've loaded the dbm file. These dictionaries work like normal Python dictionaries, which are covered in Chapter 3. The main difference is that the data is written to and read from disk.

An additional difference is that the keys and the values must both be strings.

DBM, short for database manager, acts as a generic name for a number of C language libraries originally created on UNIX systems. These libraries sport names such as dbm, gdbm, ndbm, sdbm, and so on. These names correspond closely to the available modules in Python that provide the requisite functionality.

Choosing a DBM Module

Python supports a number of dbm modules. Each dbm module supports a similar interface and uses a particular C library to store the data to disk. The main difference lies in the underlying binary format of the data files on disk. Each dbm module, unfortunately, creates incompatible files. That is, if you create a dbm persistent dictionary with one dbm module, you must use the same module to read the data. None of the other modules will work with that data file.

The following table lists the dbm modules.

Module	Description
dbm	Chooses the best dbm module
dbm.dumb	Uses a simple, but portable, implementation of the dbm library
dbm.gnu	Uses the GNU dbm library

All of these libraries exist because of the history of the dbm library. Originally, this library was available only on commercial versions of UNIX. Free versions of UNIX, and later Linux, Windows, and so on, could not use the dbm library. This led to the creation of alternative libraries, such as the Berkeley UNIX library and the GNU gdbm library.

With all the incompatible file formats, this plethora of libraries can be a real pain. The dbm module, though, offers a handy alternative to choosing a specific dbm module. With the dbm module, you can let it choose for you. In general, the dbm module will choose the best implementation available on your system when creating a new persistent dictionary. When reading a file, the dbm module uses the function whichdb to make an informed guess as to which library created the data file.

Unless you need a specific advanced feature of one of the dbm libraries, use the dbm module.

Creating Persistent Dictionaries

All of the dbm modules support an open function to create a new dbm object. Once opened, you can store data in the dictionary, read data, close the dbm object (and the associated data file or files), remove items, and test for the existence of a key in the dictionary.

To open a dbm persistent dictionary, use the open function on the module you choose. For example, you can create a persistent dictionary with the dbm module.

Try It Out: Creating a Persistent Dictionary

Enter the following code and name your file dbmcreate.py:

import dbm

db = dbm.open('websites', 'c')

# Add an item.
db['www.python.org'] = 'Python home page'

print(db['www.python.org'])

# Close and save to disk.
db.close()

When you run this script, you'll see output like the following:

$ python dbmcreate.py
b 'Python home page'

How It Works

This example uses the recommended dbm module.

The open function requires the name of the dictionary to create. This name gets translated into the name of the data file or files that may already be on the disk. (The dbm module may — though not always — create more than one file, usually a file for the data and one for the index of the keys.) The name of the dictionary is treated as a base file name, including the path. Usually, the underlying dbm library will append a suffix such as .dat for data. You can find the file yourself by looking for the file named websites, most likely in your current working directory.

You should also pass the optional flag. The following table lists the available flags.

Flag	Usage
C	Opens the data file for reading and writing, creating the file if needed.
N	Opens the file for reading and writing, but always creates a new empty file. If one already exists, it will be overwritten and its contents lost.
W	Opens the file for reading and writing, but if the file doesn't exist it will not be created.

You can also pass another optional parameter, the mode. The mode holds a set of UNIX file permissions. See Chapter 8 for more on opening files.

Note

The open method of the dbm modules returns a new dbm object, which you can then use to store and retrieve data.

After you open a persistent dictionary, you can write values as you normally would with Python dictionaries, as shown in the following example:

db['www.python.org'] = 'Python home page'

Both the key and the value must be strings and can't be other objects, like numbers or python objects. Remember, however, that if you want to save an object, you can serialize it using the pickle module, as you saw in Chapter 8.

The close method closes the file or files and saves the data to disk.

Accessing Persistent Dictionaries

With the dbm modules, you can treat the object you get back from the open function as a dictionary object. Get and set values using code like the following:

db['key'] = 'value'
value = db['key']

Remember that the key and the value must both be text strings.

You can delete a value in the dictionary using del:

del db['key']

The keys method returns a list of all the keys, in the same way it would with a normal dictionary:

for key in db.keys():
    # do something...

Note

The keys method may take a long time to execute if there are a huge number of keys in the file. In addition, this method may require a lot of memory to store the potentially large list that it would create with a large file.

You can use the following script as a guide for how to program with dbm persistent dictionaries.

Try It Out: Accessing Persistent Dictionaries

Enter the following script and name the file dbmaccess.py:

import dbm

# Open existing file.
db = dbm.open('websites', 'w')

# Add another item.
db['www.wrox.com'] = 'Wrox home page'

# Verify the previous item remains.
if db['www.python.org'] != None:
    print('Found www.python.org')
else:
    print('Error: Missing item')

# Iterate over the keys. May be slow.
# May use a lot of memory.
for key in db.keys():
    print("Key =",key," value =",db[key])

del db['www.wrox.com']
print("After deleting www.wrox.com, we have:")

for key in db.keys():

print("Key =",key," value =",db[key])

# Close and save to disk.
db.close()

When you run this script, you'll see output similar to the following:

$ python dbmaccess.py
Type of dbmfile = dbhash
Found www.python.org
Key = www.wrox.com  value = Wrox home page
Key = www.python.org  value = Python home page
After deleting www.wrox.com, we have:
Key = www.python.org  value = Python home page

How It Works

This script works with a small database of website URLs and descriptions. You need to first run the dbmcreate.py example, shown previously. That example creates the dbm file and stores data in the file. The dbmaccess.py script then opens the preexisting dbm file.

The dbmaccess.py script opens the persistent dictionary websites in read/write mode. The call to the open function will generate an error if the necessary data file or files do not exist on disk in the current directory.

From the previous example, dbmcreate.py, there should be one value in the dictionary, under the key www.python.org. This example adds the Wrox website, www.wrox.com, as another key.

The script verifies that the www.python.org key exists in the dictionary, using the following code:

if db['www.python.org'] != None:
    print('Found www.python.org')
else:
    print('Error: Missing item')

Next, the script prints out all of the keys and values in the dictionary:

for key in db.keys():
    print("Key =",key," value =",db[key])

Note that there should be only these two entries.

After printing out all of the entries, the script removes one using del:

del db['www.wrox.com']

The script then prints all of the keys and values again, which should result in just one entry, as shown in the output.

Finally, the close method closes the dictionary, which involves saving all the changes to disk, so the next time the file is opened it will be in the state you left it.

As you can see, the API for working with persistent dictionaries is incredibly simple because it works like files and like dictionaries, which you're already familiar with.

Deciding When to Use DBM and When to Use a Relational Database

The dbm modules work when your data needs can be stored as key/value pairs. You can store more complicated data within key/value pairs with some imagination — for instance, by creating formatted strings that use a comma or some other character to delimit items in the strings, both on the key and the value part of the dictionary. This can be useful, but it can also be very difficult to maintain, and it can restrict you because your data is stored in an inflexible manner. Another way that you can be limited is technical: Note that some dbm libraries limit the amount of space you can use for the values (sometimes to a maximum of 1024 bytes, which is very, very little).

You can use the following guidelines to help determine which of these two types of data storage is appropriate for your needs:

If your data needs are simple, use a dbm persistent dictionary.
If you plan to store only a small amount of data, use a dbm persistent dictionary.
If you require support for transactions, use a relational database. (Transactions are when more than one thing happens at once — they let you keep your data from getting changed in one place but not in another; you get to define what happens concurrently with transactions.)
If you require complex data structures or multiple tables of linked data, use a relational database.
If you need to interface to an existing system, use that system, obviously. Chances are good this type of system will be a relational database.

Unlike the simple dbm modules, relational databases provide a far richer and more complex API.

On a side note, I should mention that there is a third type of database you can work with, though it is beyond the scope of this chapter. That third type is known as an ORM or object-relational database, and it allows for the conversion of data between type systems that are incompatible in relational databases.

Python has several options available if you wish to work with an ORM, such as SQL Object, SQLAlchemy, and even the Django ORM. For more information, visit the Python Wiki at http://wiki.python.org/moin/HigherLevelDatabaseProgramming.

Working with Relational Databases

Relational databases have been around for decades so they are a mature and well-known technology. People who work with relational databases know what they are supposed to do, and how they are supposed to work, so relational databases are the technology of choice for complex data storage.

In a relational database, data is stored in tables that can be viewed as two-dimensional data structures. The columns, or vertical part of the two-dimensional matrix, are all of the same type of data; like strings, numbers, dates, and so on. Each horizontal component of the table is made up of rows, also called records. Each row in turn is made up of columns. Typically, each record holds the information pertaining to one item, such as an audio CD, a person, a purchase order, an automobile, and so on.

For example, the following shows a simple employee table.

Empid	firstname	lastname	department	manager	phone
105	Peter	Tosh	2	45	555-5555
201	Bob	Marley	1	36	555-5551

This table holds six columns:

empid: Holds the employee ID number. Relational databases make extensive use of ID numbers where the database manages the assignment of unique numbers so that each row can be referenced with these numbers to make each row unique (even if they have identical data). We can then refer to each employee by the ID number. The ID alone provides enough information to look up the employee.
firstname: Holds the person's first name.
lastname: Holds the person's last name.
department: Holds the ID of the department in which the employee works. This would likely be a numeric ID of the department, where departments are defined in a separate table that has a unique ID for each department.
manager: Holds the employee ID of the manager of the given employee. This is sort of self-referential, because in this example, a manager is actually an employee.
phone: Holds the office phone number.

In real life, a company would likely store a lot more information about an employee, such as a taxation authority identification number (Social Security number in the U.S.), home address, and more, but not anything that's really different in principle to what you've already seen.

In this example, the column empid, the employee ID, would be used as the primary key. A primary key is a unique index for a table, where each element has to be unique because the database will use that element as the key to the given row and as the way to refer to the data in that row, in a manner similar to dictionary keys and values in Python. So, each employee needs to have a unique ID number, and once you have an ID number, you can look up any employee. So, the empid will act as the key into this table's contents.

The department column holds an ID of a department — that is, an ID of a row in another table. This ID could be considered a foreign key, because the ID acts as a key into another table. (In databases, a foreign key has a much more strict definition, so it's okay to think of it this way.)

For example, the following table shows a possible layout for the department table.

department id	name	manager
1	Development	47
2	QA	32

In these examples, the employee Peter Tosh works for department 2, the QA, or Quality Assurance, department in a dynamic world-class high-quality software development firm. Bob Marley works for department 1, the Development department.

In a large enterprise, there may be hundreds of tables in the database, with thousands or even millions of records in some tables.

Writing SQL Statements

The Structured Query Language, or SQL, defines a standard language for querying and modifying databases.

You can pronounce SQL as "sequel" or "s-q-l."

SQL supports the basic operations listed in the following table.

Operation	Usage
Select	Perform a query to search the database for specific data.
Update	Modify a row or rows, usually based on a certain condition.
Insert	Create new rows in the database.
Delete	Remove a row or rows from the database.

In general, these basic operations are called QUID, short for Query, Update, Insert, and Delete, or CRUD, short for Create, Read, Update, and Delete. SQL offers more than these basic operations, but for the most part, these are the majority of what you're going to use to write applications.

If you are not familiar with SQL, look at a SQL book or search on the Internet. You will find a huge amount of tutorial material. You can also look at the website for this book for more references to SQL resources.

SQL is important because when you access databases with the Python DB API, you must first create SQL statements and then execute these statements by having the database evaluate them. You then retrieve the results and use them. Thus, you will find yourself in the awkward position of using one language, Python, to create commands in another language, SQL.

The basic SQL syntax for the CRUD operations follows:

SELECT columns FROM tables WHERE condition ORDER BY columns ascending_or_
descending

UPDATE table SET new values WHERE condition

INSERT INTO table (columns) VALUES (values)

DELETE FROM table WHERE condition

In addition to this basic look at the available syntax, many more parameters and specifiers for each operation are optional. You can still use them with Python's DB API if you're familiar with SQL.

To insert a new row in the employee table, using the previous employee example, you can use a SQL query like the following (even though it's adding data and not getting data, the convention is that all SQL commands or statements can also be called queries):

insert into employee (empid, firstname, lastname, manager, dept, phone)
    values (3, 'Bunny', 'Wailer', 2, 2, '555-5553')

In this example, the first tuple (it's useful to think of these in Python terms, even though SQL will give these different names) holds the names of the columns in the order you are using for inserting your data. The second tuple, after the keyword values, holds the data items in the same order. Notice how SQL uses single quotes to delimit strings, and no quotes around numbers. (The phone number is different — it's actually a string because it has to be able to contain non-numbers, like dashes, periods, and plus signs, depending on how the data is entered.)

With queries, you can use shortcuts such as * to say that you want an operation to be performed using all of the columns in a table. For example, to query all of the rows in the department table, showing all of the columns for each row, you can use a query like the following:

select * from department

Note that SQL is not case-sensitive for its keywords, such as SELECT and FROM. But, some databases require table and column names to be all uppercase. It is common, therefore, to see people use SELECT and FROM and other operations in all capital letters to make them easily distinguished from other parts of the query.

This SQL statement omits the names of the columns to read and any conditions that would otherwise narrow down the data that would be returned. Thus the query will return all of the columns (from the *) and all of the rows (because there is no where clause).

You can perform a join with the select command, to query data from more than one table, but present it all in a single response. It's called a join because the data from both tables will be returned as though it was queried from a single table. For example, to extract the department name with each employee, you could perform a query like the following (all of which would need to be in one string to be a single query):

select employee.firstname, employee.lastname, department.name
from employee, department
where employee.dept =  department.departmentid
order by lastname desc

In this example, the select statement requests two columns from the employee table (the firstname and the lastname, but these are specified as coming from employee by the convention of specifying the table name and the column name in the table) and one from the department table (department.name). The order by section of the statement tells the database to order the results by the value in the lastname column, in descending order.

To simplify these queries, you can use aliases for the table names, which make them easier to type and to read (but don't change the logic or the syntax of your queries). For example, to use the alias e with the employee table, you can start a query as follows:

select e.firstname, e.lastname
from employee e
...

In this case, you must place the alias, e, after the table name in the from clause. You can also use the following format with the optional keyword as, which could be easier for you to read:

select e.firstname, e.lastname
from employee as e
...

To modify (or update) a row, use a SQL statement like the following:

update employee set manager=55 where empid=3

This example modifies the employee with an ID of 3 by setting that employee's manager to the employee with an ID of 55. As with other queries, numbers don't need to have quotes around them; however, strings would need to be quoted with single quotes.

To delete a row, use a SQL statement like the following:

delete employee where empid=42

This example deletes the employee with an ID of 42 but doesn't affect anything else in the database.

Defining Tables

When you first set up a database, you need to define the tables and the relations between them. To do this, you use the part of the SQL called the DDL, or Data Definition Language. (It defines the structure of your tables — get it?) DDL basics are pretty simple, where you use one operation to create tables, and another one to remove them:

CREATE TABLE tablename (column, type column type, ... )
DROP TABLE tablename

There is also an ALTER TABLE command to modify an existing table, but you won't need to do that for now. When you want to use this, a dedicated SQL book or web page will have more about this command.

Unfortunately, SQL is not an entirely standard language, and there are parts of it that each database handles differently. The DDL remains a part of SQL that has not been standardized. Thus, when defining tables you will find differences between the SQL dialects supported by the different databases, though the basics concepts are the same.

Setting Up a Database

In most cases when you're the programmer, you will already have a database that's up and running, perhaps even a database chosen by some other organization that you're going to have to use. For example, if you host your website with a website hosting company that provides bells and whistles, like a database, your hosting package may include access to the MySQL database. If you work for a large organization, your IT department may have already standardized on a particular database such as Oracle, DB/2, Sybase, or Informix. These latter packages are likely present in your workplace if you create enterprise applications with Python.

If you have no database at all, yet still want to work on the examples in this chapter, a good starting database is Sqlite. The main virtues of Sqlite are that it comes installed with Python, and it is simple and small, but functional. This makes it a great candidate for experimentation while you're learning, even if you have another database available to you. Just keep in mind that each database has its own quirks.

The examples in this chapter were written to work with Sqlite so that you can follow them without any external infrastructure being needed. You can easily modify these examples, though, to work with a different database. That's one of the great aspects of the Python DB API.

Using Sqlite is as simple as importing the module. The following example shows you all you need to create a database.

If you are working with another database, such as SQL Server, chances are good that a database has already been created. If not, follow the instructions from your database vendor. (A lot of the time, you can get help on tasks like this from your Database Administrator, or DBA, who would really rather have you working on a test database instead of on a production database.)

With Sqlite, creating a database is rather easy.

Try It Out: Creating an Sqlite3 Database

Enter the following script and name the file createdb.py:

import os
import sqlite3
conn=sqlite3.connect('sample_database')
cursor=conn.cursor()
# Create tables.
cursor.execute("""
create table employee

(empid integer,
firstname varchar,
lastname varchar,
dept integer,
manager integer,
phone varchar)
""")
cursor.execute("""
create table department
    (departmentid integer,
    name varchar,
    manager integer)
""")
cursor.execute("""
create table user
    (userid integer,
    username varchar,
    employeeid integer)
""")
# Create indices.
cursor.execute("""create index userid on user (userid)""")
cursor.execute("""create index empid on employee (empid)""")
cursor.execute("""create index deptid on department (departmentid)""")
cursor.execute("""create index deptfk on employee (dept)""")
cursor.execute("""create index mgr on employee (manager)""")
cursor.execute("""create index emplid on user (employeeid)""")
cursor.execute("""create index deptmgr on department (manager)""")
conn.commit()
cursor.close()
conn.close()

When you run this script, you should see no output unless the script raised an error.

How It Works

Sqlite has its own API along with the standard Python DB API. This script uses the Sqlite API, but you'll notice that this API is very similar to the DB API covered in the following section, "Using the Python Database APIs." This section briefly describes the Sqlite specific code in the creatdb.py script.

Following is the code used to create a Connection object using the Sqlite module:

conn=sqlite3.connect('sample_database')

From there, the script gets a Cursor object, covered in the section "Working with Cursors." The Cursor object is used to create three tables and define indexes on these tables.

The script calls the commit method on the connection to save all the changes to disk.

Sqlite stores all of its data in the file sample_database. After running the createdb.py script, you should see the file in your Python30 directory.

You are now ready to start working with the Python database APIs.

Using the Python Database APIs

First, some history about Python and relational databases. Python's support for relational databases started out with ad hoc solutions, with one solution written to interface with each particular database, such as Oracle. Each database module created its own API, which was highly specific to that database because each database vendor evolved its own API based on its own needs. This is hard to support, because coding for one database and trying to move it to the other gives a programmer severe heartburn, as everything needs to be completely rewritten and retested.

Over the years, though, Python has matured to support a common database, or DB, API, that's called the DB API. Specific modules enable your Python scripts to communicate with different databases, such as DB/2, PostgreSQL, and so on. All of these modules, however, support the common API, making your job a lot easier when you write scripts to access databases. This section covers this common DB API.

The DB API provides a minimal standard for working with databases, using Python structures and syntax wherever possible. This API includes the following:

Connections, which cover guidelines for how to connect to databases
Executing statements and stored procedures to query, update, insert, and delete data with cursors
Transactions, with support for committing or rolling back a transaction
Examining metadata on the database module as well as on database and table structure
Defining the types of errors

The following sections take you step by step through the Python database APIs.

Downloading Modules

You must download a separate DB API module for each database you need to access. For example, if you need to access an Oracle database as well as a MySQL database, you must download both the Oracle and the MySQL database modules.

See http://wiki.python.org/moin/DatabaseInterfaces for a listing of databases.

Modules exist for most major databases with the notable exception of Microsoft's SQL Server. You can access SQL Server using an ODBC module, though. In fact, the mxODBC module can communicate with most databases using ODBC on Windows or an ODBC bridge on UNIX (including Mac OS X) or Linux. If you need to do this, you can search for more information on these terms online to find out how other people are doing it.

Download the modules you need. Follow the instructions that come with the modules to install them.

You may need a C compiler and build environment to install some of the database modules. If you do, this will be described in the module's own documentation, which you'll need to read.

For some databases, such as Oracle, you can choose among a number of slightly different modules. You should choose the module that seems to best fit your needs or go to the website for this book and ask the authors for any recommendations if you're not sure.

Once you have verified that the necessary modules are installed, you can start working with Connections.

Creating Connections

A Connection object provides the means to communicate from your script to a database program. Note the major assumption here that the database is running in a separate process (or processes). The Python database modules connect to the database. They do not include the database application itself.

Each database module needs to provide a connect function that returns a connection object. The parameters that are passed to connect vary by the module and what is required to communicate with the database. The following table lists the most common parameters.

Parameter	Usage
Dsn	Data source name, from ODBC terminology. This usually includes the name of your database and the server where it's running.
Host	Host, or network system name, on which the database runs.
Database	Name of the database.
User	User name for connecting to the database.
Password	Password for the given user name.

For example, you can use the following code as a guide:

conn = dbmodule.connect(dsn='localhost:MYDB',user='tiger',password='scott')

Use your database module documentation to determine which parameters are needed.

With a Connection object, you can work with transactions, covered later in this chapter; close the connection to free system resources, especially on the database; and get a cursor.

Working with Cursors

A cursor is a Python object that enables you to work with the database. In database terms, the cursor is positioned at a particular location within a table or tables in the database, sort of like the cursor on your screen when you're editing a document, which is positioned at a pixel location.

To get a cursor, you need to call the cursor method on the connection object:

cursor = conn.cursor()

Once you have a cursor, you can perform operations on the database, such as inserting records.

Try It Out: Inserting Records

Enter the following script and name the file insertdata.py:

import os
import sqlite3

conn=sqlite3.connect('sample_database')
cursor = conn.cursor()

# Create employees.
cursor.execute("""
insert into employee (empid,firstname,lastname,manager,dept,phone)
values (1,'Eric','Foster-Johnson',1,1,'555-5555')""")

cursor.execute("""
insert into employee (empid,firstname,lastname,manager,dept,phone)
values (2,'Peter','Tosh',2,3,'555-5554')""")

cursor.execute("""
insert into employee (empid,firstname,lastname,manager,dept,phone)
values (3,'Bunny','Wailer',2,2,'555-5553')""")

# Create departments.
cursor.execute("""
insert into department (departmentid,name,manager)
values (1,'development',1)""")

cursor.execute("""
insert into department (departmentid,name,manager)
values (2,'qa',2)""")

cursor.execute("""
insert into department (departmentid,name,manager)
values (3,'operations',2)""")

# Create users.
cursor.execute("""
insert into user (userid,username,employeeid)
values (1,'ericfj',1)""")

cursor.execute("""
insert into user (userid,username,employeeid)
values (2,'tosh',2)""")

cursor.execute("""
insert into user (userid,username,employeeid)
values (3,'bunny',3)""")

conn.commit()

cursor.close()

conn.close()

When you run this script, you will see no output unless the script raises an error.

How It Works

The first few lines of this script set up the database connection and create a cursor object:

import os
import sqlite3
conn=sqlite3.connect('sample_database')
cursor = conn.cursor()

Note how we connect to an Sqlite database. To connect to a different database, replace this with your database-specific module, and modify the call to use the connect function from that database module, as needed.

The next several lines execute a number of SQL statements to insert rows into the three tables set up earlier: employee, department, and user. The execute method on the cursor object executes the SQL statement:

cursor.execute("""
insert into employee (empid,firstname,lastname,manager,dept,phone)
values (2,'Peter','Tosh',2,3,'555-5554')""")

This example uses a triple-quoted string to cross a number of lines as needed. You'll find that SQL commands, especially those embedded within Python scripts, are easier to understand if you can format the commands over a number of lines. This becomes more important with complex queries covered in examples later in this chapter.

To save your changes to the database, you must commit the transaction:

conn.commit()

Note that this method is called on the connection, not the cursor.

When you are done with the script, close the cursor and then the connection to free up resources. In short scripts like this, it may not seem important, but this helps the database program free its resources, as well as your Python script:

cursor.close()

conn.close()

You now have a very small amount of sample data to work with using other parts of the DB API, such as querying for data.

Try It Out: Writing a Simple Query

The following script implements a simple query that performs a join on the employee and department tables:

import os
import sqlite3
conn=sqlite3.connect('sample_database')
cursor = conn.cursor()
cursor.execute("""
select employee.firstname, employee.lastname, department.name
from employee, department
where employee.dept = department.departmentid
order by employee.lastname desc
""")
for row in cursor.fetchall():
    print(row)
cursor.close()
conn.close()

Save this script under the name simplequery.py. When you run this script, you will see output like the following:

('Bunny', 'Wailer', 'qa')
('Peter', 'Tosh', 'operations')
('Eric', 'Foster-Johnson', 'development')

How It Works

This script initializes the connection and cursor in the same manner as the previous script. This script, though, passes a simple join query to the cursor execute method. This query selects two columns from the employee table and one from the department table.

This is truly a simple query, but, even so, you'll want to format your queries so they are readable, similar to what is shown here.

When working with user interfaces, you will often need to expand IDs stored in the database to human-readable values. In this case, for example, the query expands the department ID, querying for the department name. You simply cannot expect people to remember the meaning of strange numeric IDs.

The query also orders the results by the employees' last names, in descending order. (This means that it starts at the beginning of the alphabet, which is what you'd normally expect. However, you can reverse this and have them sorted in ascending order.)

After calling the execute method, the data, if any was found, is stored in the cursor object. You can use the fetchall method to extract the data.

You can also use the fetchone method to fetch one row at a time from the results.

Note how the data appears as Python tuples:

('Bunny', 'Wailer', 'qa')
('Peter', 'Tosh', 'operations')
('Eric', 'Foster-Johnson', 'development')

You can use this example as a template to create other queries, such as the more complex join shown in the following Try It Out.

Try It Out: Writing a Complex Join

Enter this script and name the file finduser.py:

import sqlite3
conn=sqlite3.connect('sample_database')
cursor = conn.cursor()
username = 'bunny'
query = """
select u.username,e.firstname,e.lastname,m.firstname,m.lastname, d.name
from user u, employee e, employee m, department d where username=?
and u.employeeid = e.empid
and e.manager = m.empid
and e.dept = d.departmentid
"""
cursor.execute(query, (username,))
for row in cursor.fetchall():
    (username,firstname,lastname,mgr_firstname,mgr_lastname,dept) = row
    name=firstname + " " + lastname
    manager=mgr_firstname + " " + mgr_lastname
    print(username,":",name,"managed by",manager,"in",dept)
cursor.close()
conn.close()

When you run this script, you will see results like the following:

bunny : Bunny Wailer managed by Peter Tosh in qa

You need to pass the user name of a person to query from the database. This must be a valid user name of a person in the database. In this example, bunny is a user name previously inserted into the database.

How It Works

This script performs a join on all three example tables, using table-name aliases to create a shorter query. The purpose is to find a given user in the database by searching for that user name. This script also shows an example of expanding both the manager's ID to the manager's name and the department's ID to the department's name. All of this makes for more readable output.

This example also shows how you can extract data from each row into Python variables. For example:

(username,firstname,lastname,mgr_firstname,mgr_lastname,dept) = row

Note that this is really nothing new. See Chapter 3 for more on Python tuples, which is all row is.

An important new feature of this script, though, is the use of a question mark to enable you to build a query using dynamic data. When you call the execute method on the Cursor, you can pass a tuple of dynamic data, which the execute method will fill in for the question marks in the SQL statement. (This example uses a tuple of one element.) Each element in the tuple is used, in order, to replace the question marks. Thus, it is very important to have as many dynamic values as you do question marks in the SQL statement, as shown in the following example:

query = """
select u.username,e.firstname,e.lastname,m.firstname,m.lastname, d.name
from user u, employee e, employee m, department d where username=?
and u.employeeid = e.empid
and e.manager = m.empid
and e.dept = d.departmentid
"""
cursor.execute(query, (username,))

The query used in this example is very helpful when you want to start updating rows in the tables. That's because users will want to enter meaningful values. It is up to you, with your SQL statements, to translate the user input into the necessary IDs.

For example, the following script enables you to change the manager for an employee:

Personally, I'd like to make myself my own manager.

Try It Out: Updating an Employee's Manager

Enter the following script and name the file updatemgr.py:

import sqlite3
import sys
conn=sqlite3.connect('sample_database')
cursor = conn.cursor()
newmgr   = sys.argv[2]
employee = sys.argv[1]
# Query to find the employee ID.
query = """
select e.empid
from user u, employee e
where username=? and u.employeeid = e.empid
"""
cursor.execute(query,(newmgr,));
for row in cursor.fetchone():
    if (row != None):
        mgrid = row
# Note how we use the same query, but with a different name.
cursor.execute(query,(employee,));
for row in cursor.fetchone():
    if (row != None):

empid = row
# Now, modify the employee.
cursor.execute("update employee set manager=? where empid=?", (mgrid,empid))
conn.commit()
cursor.close()
conn.close()

When you run this script, you need to pass the name of the user to update, as well as the name of the manager. Both names are user names from the user table. For example:

$ python finduser.py bunny
bunny : Bunny Wailer managed by Peter Tosh in qa
$ python updatemgr.py bunny ericfj
$ python finduser.py bunny
bunny : Bunny Wailer managed by Eric Foster-Johnson in qa

How It Works

The example output shows the before and after picture of the employee row, verifying that the updatemgr.py script worked.

The updatemgr.py script expects two values from the user: the user name of the employee to update and the user name of the new manager. Both of these names must be user names stored in the database. Both names are converted into IDs using a simple query. This is not very efficient, because it involves two extra round-trips to the database. A more efficient means would be to perform an inner select statement on the update statement. For simplicity, though, the separate queries are far easier to understand.

This example also shows the use of the fetchone method on the Cursor. The final SQL statement then updates the employee row for the given user to have a new manager.

The next example uses a similar technique to terminate an employee. You can really have fun with this one (terminate your friends, your enemies, and so on).

Try It Out: Removing Employees

Enter the following script and name the file terminate.py:

import sqlite3
import sys
conn=sqlite3.connect('sample_database')
cursor = conn.cursor()
employee=sys.argv[1]
# Query to find the employee ID.
query = """
select e.empid
from user u, employee e
where username=? and u.employeeid = e.empid
"""

cursor.execute(query,(employee,));
for row in cursor.fetchone():
    if (row != None):
        empid = row
# Now, modify the employee.
cursor.execute("delete from employee where empid=?", (empid,))
conn.commit()
cursor.close()
conn.close()

When you run this script, you need to pass the user name of the person to terminate. You should see no output unless the script raises an error:

$ python finduser.py bunny
bunny : Bunny Wailer managed by Eric Foster-Johnson in qa
$ python terminate.py bunny
$ python finduser.py bunny

How It Works

This script uses the same techniques as the updatemgr.py script by performing an initial query to get the employee ID for the given user name and then using this ID in a later SQL statement. With the final SQL statement, the script deletes the employee from the employee table.

Note that this script leaves the record in the user table. Question 3 of the exercises at the end of this chapter addresses this.

Working with Transactions and Committing the Results

Each connection, while it is engaged in an action, manages a transaction. With SQL, data is not modified unless you commit a transaction. The database then guarantees that it will perform all of the modifications in the transaction or none. Thus, you will not leave your database in an uncertain and potentially erroneous state.

To commit a transaction, call the commit method of a connection:

conn.commit()

Note that the transaction methods are part of the connection class, not the cursor class.

If something goes wrong, like an exception is thrown that you can handle, you should call the rollback method to undo the effects of the incomplete transaction; this will restore the database to the state it was in before you started the transaction, guaranteed:

conn.rollback()

The capability to roll back a transaction is very important, because you can handle errors by ensuring that the database does not get changed. In addition, rollbacks are very useful for testing. You can insert, modify, and delete a number of rows as part of a unit test and then roll back the transaction to undo the effects of all the changes. This enables your unit tests to run without making any permanent changes to the database. It also enables your unit tests to be run repeatedly, because each run resets the data.

See Chapter 12 for more on testing.

Examining Module Capabilities and Metadata

The DB API defines several globals that need to be defined at the module level. You can use these globals to determine information about the database module and the features it supports. The following table lists these globals.

Global	Holds
Apilevel	Should hold '2.0' for the DB API 2.0, or '1.0' for the 1.0 API.
Paramstyle	Defines how you can indicate the placeholders for dynamic data in your SQL statements. The values include the following: 'qmark' — Use question marks, as shown in the examples in this chapter. 'numeric' — Use a positional number style, with ':1', ':2', and so on. 'named' — Use a colon and a name for each parameter, such as :name. 'format' — Use the ANSI C sprintf format codes, such as %s for a string and %d for an integer. 'pyformat' — Use the Python extended format codes, such as %(name)s.

Global

Holds

Apilevel

Should hold '2.0' for the DB API 2.0, or '1.0' for the 1.0 API.

Paramstyle

Defines how you can indicate the placeholders for dynamic data in your SQL statements. The values include the following:

'qmark' — Use question marks, as shown in the examples in this chapter.
'numeric' — Use a positional number style, with ':1', ':2', and so on.
'named' — Use a colon and a name for each parameter, such as :name.
'format' — Use the ANSI C sprintf format codes, such as %s for a string and %d for an integer.
'pyformat' — Use the Python extended format codes, such as %(name)s.

In addition, remember that pydoc is your friend. You can use pydoc to display information on modules, such as the database modules.

With a cursor object, you can check the definition attribute to see information about the data returned. This information should be a set of seven-element sequences, one for each column of result data. These sequences include the following items:

(name, type_code, display_size, internal_size, precision, scale, null_ok

None can be used for all but the first two items, as shown in this example:

(('FIRSTNAME', None, None, None, None, None, None),
('LASTNAME', None, None, None, None, None, None),
('NAME', None, None, None, None, None, None))

Handling Errors

Errors happen. With databases, errors happen a lot. The DB API defines a number of errors that must exist in each database module. The following table lists these exceptions.

Exception	Usage
Warning	Used for non-fatal issues. Must subclass StandardError.
Error	Base class for errors. Must subclass StandardError.
InterfaceError	Used for errors in the database module, not the database itself. Must subclass Error.
DatabaseError	Used for errors in the database. Must subclass Error.
DataError	Subclass of DatabaseError that refers to errors in the data.
OperationalError	Subclass of DatabaseError that refers to errors such as the loss of a connection to the database. These errors are generally outside of the control of the Python scripter.
IntegrityError	Subclass of DatabaseError for situations that would damage the relational integrity, such as uniqueness constraints or foreign keys.
InternalError	Subclass of DatabaseError that refers to errors internal to the database module, such as a cursor no longer being active.
ProgrammingError	Subclass of DatabaseError that refers to errors such as a bad table name and other things that can safely be blamed on you.
NotSupportedError	Subclass of DatabaseError that refers to trying to call unsupported functionality.

Your Python scripts should handle these errors. You can get more information about them by reading the DB API specification. See www.python.org/topics/database/ and www.python.org/peps/pep-0249.html for more information.

Summary

Databases provide a handy means for storing data. You can write Python scripts that can access all the popular databases using add-on modules. This chapter provided a whirlwind tour of SQL, the Structured Query Language, and covered Python's database APIs.

You also learned about the dbm modules that enable you to persist a dictionary using a variety of dbm libraries. These modules enable you to use dictionaries and transparently persist the data.

In addition, this chapter covered the Python database APIs, which define a standard set of methods and functions that you should expect from all database modules. This includes the following:

A connection object encapsulates a connection to the database. Use the connect function on the database module to get a new connection. The parameters you pass to the connect function may differ for each module.
A cursor provides the main object for interacting with a database. Use the connection object to get a cursor. The cursor enables you to execute SQL statements.
You can pass dynamic data as a tuple of values to the cursor execute method. These values are placed into your SQL statements, enabling you to create reusable SQL statements.
After performing a query operation, the cursor object holds the data. Use the fetchone or fetchall methods to extract the data.
After modifying the database, call commit on the connection to commit the transaction and save the changes. Use the rollback method to undo the changes.
Call close on each cursor when done. Call close on the connection when done.
The DB APIs include a defined set of exceptions. Your Python scripts should check for these exceptions to handle the variety of problems that may arise.

Chapter 15 covers XML, HTML, and XSL style sheets, technologies frequently used for web development.

Exercises

Suppose you need to write a Python script to store the pizza preferences for the workers in your department. You need to store each person's name along with that person's favorite pizza toppings. Which technologies are most appropriate to implement this script?
1. Set up a relational database such as MySQL or Sqlite.
2. Use a dbm module such as dbm.
3. Implement a web-service-backed rich web application to create a buzzword-compliant application.

Rewrite the following example query using table name aliases:

select employee.firstname, employee.lastname, department.name
from employee, department
where employee.dept =  department.departmentid
order by employee.lastname desc

The terminate.py script, shown previously, removes an employee row from the employee table; but this script is not complete. There remains a row in the user table for the same person. Modify the terminate.py script to delete both the employee and the user table rows for that user.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for 14. Accessing Databases

Create new playlist

Sign In

Sign Up

Chapter 14. Accessing Databases

Working with DBM Persistent Dictionaries

Choosing a DBM Module

Creating Persistent Dictionaries

Note

Accessing Persistent Dictionaries

Note

Deciding When to Use DBM and When to Use a Relational Database

Working with Relational Databases

Writing SQL Statements

Defining Tables

Setting Up a Database

Using the Python Database APIs

Downloading Modules

Creating Connections

Working with Cursors

Working with Transactions and Committing the Results

Examining Module Capabilities and Metadata

Handling Errors

Summary

Exercises

Table of Contents for
14. Accessing Databases