Hour 18. Storing Information in Databases


What You’ll Learn in This Hour:

Image Why you should use databases

Image How to use SQL to talk to databases

Image How to create a database

Image How to query a database

Image When to use databases in the real world


There’s another way to save information to your hard drive: databases. A database is an extremely powerful tool that makes it easy not only to store data, but to use that data in multiple ways. In this chapter, we’re going to take a slight detour and learn about databases. Databases aren’t included with Python. They’re a separate kind of application we’re going to have to install.

Why Use Databases?

Databases provide a special way to store data that involves saving granular pieces of information that can be recalled later. If a text file is like a piece of paper, a database is like a filing cabinet. Instead of reading through the whole file to find the piece of information you want, you can call up exactly what you need based on how the information was stored.

If we already know how to save data in text, spreadsheets, and JSON, why should we bother with databases? For one, if you’re working with a complex dataset, databases allow you to manipulate that data without loading the entire dataset into memory. Perhaps you have a database filled with information about all the people at a company. Do you really want to read a file containing all that information, including names, titles, locations, hours, emails, and vacation days if all you want is Linda Gordon’s birthday?

Also, a database makes it easier to reuse data. You may write a database with the intention of having personnel records on everyone at a company, but another person at the same company might reuse that database to create a birthday list. Another person might use the database to get the email addresses for people only at one location to tell them that there will be building maintenance over the weekend.

Talking to Databases with SQL

SQL (Structured Query Language) is the language used to talk to databases. Don’t panic about having to learn yet another programming language! We’ll be using a limited subset of what SQL has to offer, so this won’t be like learning a new language.

SQL is written in statements. These statements tell the database what you want to do and what data you want to work with. The database then takes the data, does what you want, and either saves your requested changes or gives you the data.

For example, let’s say we have a database of restaurants in a city, but we just want a list of the restaurants in the downtown area. We can use a select statement:

SELECT name FROM restaurants WHERE neighborhood='downtown';

This would give us a list of restaurant names in our database, where we’ve set the neighborhood to 'downtown'.

We can also change existing entries, add entries, and delete entries. We’ll be covering each of these uses in turn in the rest of this hour.

Each database has its own version of SQL that contains subtle differences. Some have extra functionality. Some have more tools to deal with dates or time. A few have a slightly different format. Once you learn one, though, it’s fairly easy to read another database’s SQL statements.

There are many kinds of databases. Some are free, and some are so expensive that they’re rarely seen outside of large companies. Some require a complex setup, whereas others are extremely easy to install. For our purposes, we’ll be using a lightweight and free database called SQLite (pronounced S-Q-L lite).

SQLite on a Mac

If you’re using a Mac, you should already have SQLite on your system! To double-check, open up a terminal and type sqlite3. You should see something close to the following:

SQLite version 3.7.7 2011-06-25 16:35:41
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite>

To get out of this dialog, type '.exit' (note the period) and press Return.

Installing SQLite on Windows

SQLite does not come installed on Windows, so you’ll have to install it yourself. Installing SQLite is the same as installing other programs on your computer. Most users are used to install wizards that unpack all the needed files, put them where they need to go, and then edit some system settings so that everything works.

SQLite doesn’t come with an installer like this. You’re going to have to move some things around yourself as well as edit some system settings. Don’t worry, though! If you’ve gotten this far, you’re more than capable of doing this.

First, go to the SQLite website download page (http://www.sqlite.org/download.html). There, you’ll find many links to files for a variety of operating systems. You’re going to need the first one under Precompiled Binaries for Windows. This is the one for a command-line shell, not the one that contains DLLs or is an analyzer.

Open the folder where your file downloaded. Right now, all the files you need are stored in a Zip file. You’ll need to unpack it. Double-click the file. This should create a new folder, inside of which is a file called sqlite3.

You won’t want to run SQLite from your Downloads folder (or wherever it was saved), so you’ll need to make a new home for it. We’re going to be using the File Explorer for this, so open up a new Explorer window. If you don’t have a shortcut to the Windows Explorer, search for “Windows Explorer” in your Start menu. Note that we’re not going to be using Internet Explorer.

1. First, you’ll need to find the SQLite binary. Right-click it and select Copy.

2. Find your Program Files directory. It might be pinned in your shortcuts panel, or you may have to look around for it. In general, it is at the root of your hard drive. You might have two Program Files directories. It doesn’t really matter which one you pick for this exercise.

3. There, make a new folder called “sqlite.” Open it, right-click, and then select Paste. Your SQLite binary should be copied into that folder.

Just like Python, Windows has a list of directories to check when you want to run something from the command line. You need to add the sqlite folder to your system path:

1. First, we need to copy the path. In your sqlite folder, right-click sqlite in your task bar and select Copy Address as Text.

2. On your Start menu, search for “path.” One of the items that comes up should be “Edit the system environment variables.” Select that item.

3. A dialog will pop up. Under the Advanced tab, select Environment Variables. Another dialog will pop up. Select PATH and click Edit.

4. A dialog will appear. Finally, we can edit the path. Do not erase the text currently in there. (If you accidentally erase your path, click Cancel so your changes aren’t saved.) Instead, at the end, type a semicolon (;) and then paste the path to the end of the text in the Variable value input. Click OK.

Testing SQLite

Open a command prompt, and type sqlite3. The following prompt should appear:

SQLite version 3.7.17 2013-05-20 00:56:22
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite>

Type '.exit' to return to your prompt (note the period!).

Creating a Database

Normally, someone working with databases would work directly with those databases. In our case, however, we’re going to use Python. If you want to interact directly with your database, try the Firefox add-on SQLite Manager. This add-on gives you a graphical interface to any SQLite database on your computer.

Making a Table

The data you put in a database is always held in a table. A database can have one table or a hundred. How many you decide to have depends on what you want to do with the database.

You can’t just throw any data into a database. You have to tell the database what sort of data it can expect. Also, as with variables, you should give the types of data names that make sense. Table 18.1 contains all the allowed data types in SQLite.

Image

TABLE 18.1 SQLite Data Types

Let’s create a table that will contain some user data. We want to store a username, the user’s real name, and an ID number. Create a new file called create_db.py. In it, enter the following code. This is the code we’re going to use to create a table in our database.

import sqlite3

conn = sqlite3.connect('mytest.db')
cursor = conn.cursor()
sql = '''create table students (
     name text,
     username text,
     id int)'''
cursor.execute(sql)
cursor.close()

Let’s talk about what we’re doing in this script, line by line, starting with our import statement:

import sqlite3

Naturally, we have to import the sqlite3 library if we want to use SQL. The next line might seem a little strange, though:

conn = sqlite3.connect('mytest.db')

This creates a connection to the database. Like opening a file to read it, we connect to a database to use it. In this case, the database doesn’t exist yet, so a new one is created. We can use either an absolute path (a path that tells the exact location of the file) or a relative path (one that figures out where the file is based on your current working directory).

Let’s move on to the next line:

cursor = conn.cursor()

Here, we’re creating a cursor. A cursor is what we use to move around the database, execute SQL statements, and get data. The next line is broken out onto several lines:

sql = '''create table students (
     name text,
     username text,
     id int)'''

These lines set up our SQL statement. We could have this string all on one line, but it’s easier to read when it’s broken out into several lines. This statement tells SQL to create a table called students that has columns: name, username, and id. Those columns will contain text, text, and integers, respectively.

Just making a string that contains an SQL statement isn’t going to create the table, though. In the next line, we use our cursor to execute the statement:

cursor.execute(sql)

Now that we’re done, we can use our cursor to close our connection to the database:

cursor.close()

Like with files, it’s a good idea to close the connection to a database when we no longer need it.

Go ahead and run the file and then look in the directory where you saved the script. You should see a new file there: mytest.db. If you don’t, double-check what script you ran and where it’s saved.

Adding Data

Now that we have a database and a table, let’s add some data to our database. In order to add some data to our table, we’ll need to use the insert command and some special formatting. Create a new file in the same directory as your database and then enter the following code:

import sqlite3

conn = sqlite3.connect('mytest.db')
cursor = conn.cursor()
print "Let's input some students!"
while True:
     name = raw_input('Student's name: ')
     username = raw_input('Student's username: ')
     id_num = raw_input('Student's id number: ')
     sql = ''' insert into students
                 (name, username, id)
                 values
                  (:st_name, :st_username, :id_num)'''
     cursor.execute(sql, {'st_name':name, 'st_username':username, 'id_num':id_num})
     conn.commit()
     cont = raw_input("Another student? ")
     if cont[0].lower() == 'n':
         break
  cursor.close()

Just like before, we start by importing the sqlite3 library, connecting to a database, and creating a cursor.

The next part involves a loop that allows a user to enter as many students as needed. We save the name, username, and ID number into their own variables.

Next, we set up the sql statement. This time, we create an insert statement. This will insert a new line into our table and fill it with the data we give it. (You don’t have to call the string sql. You can name it whatever you want, but it should be clear to another user that you’re setting up an sql statement.)

An insert statement needs three things: what table you want to insert data into, which columns are getting data, and, of course, the data itself. In this case, we’re inserting into the table students the name, username, and ID, and we’re giving it the data stored in name, username, and id_num.

Note that we have a bit of new formatting in our insert statement: a colon, followed by a string. This is called a named parameter. This tells the SQLite library that we have an item we’d like to insert here.

Just like before, we execute our statement, but this time we’ve added a dictionary of our named parameters and the items we would like inserted. Also, we have to perform an extra step: We commit. SQLite won’t save our new data until we tell it to. If we left this out, the data given to us by the user would be lost.

Finally, we see if the user wants to add another user. If the user doesn’t, we close the connection to the database.

Querying the Database

At this point, we should have some data saved in our database. It isn’t going to do us much good if we can’t get it out, though. In order to get the data out of the database, we need to make a query.

A query is a statement that tells a database what information we want from IT, and in what order we want the information. We could get all the ID numbers, or we could get all the information about a user whose name is Neuman. We could even get all the users whose names begin with a certain letter, or all of the unique names.

A query needs at least two things: which table you want to select from, and what you want to select. The most basic select query looks like this:

select * from tablename

This tells SQLite to select everything—that’s what the asterisk (*) stands for—from the table tablename. If we wanted to select all the students, we’d want to have a select statement that looks like this:

select * from students

Now that you know how to write a basic select statement, let’s write a short script that will print out all the students we have in our database. This will require some new functions from the SQLite library, so we’ll go over each of them.

For now, enter the following into a file called get_data.py. This file should be in the same directory as your database.

import sqlite3

conn = sqlite3.connect('mytest.db')
cursor = conn.cursor()
sql = "select * from students"
results = cursor.execute(sql)
all_students = results.fetchall()
for student in all_students:
   print student

The first few lines should look familiar by now. We’re importing our library, connecting to our database, and creating a cursor. We also set up our SQL statement.

This time, when we execute our sql statement, we want to save the results. The database is going to return all the students in our database. We store those results in the variable results.

Finally, we use the fetchall function to store all the results into a new variable, all_students, and then print the students. Let’s run the file, assuming we have a few students already entered:

$ python get_data.py
(u'Hannah', u'hurricane_hannah', 1)
(u'Jacob', u'hockey_boy', 2)

When you run fetchall, Python returns a list of tuples. Each tuple contains one row from the database. In this case, we wanted all the data from all the columns, so we used an asterisk (*) to tell the database to give us everything.

What if we only wanted the names in the database? Then, rather than the asterisk, we put the name of the column. Let’s get just the students’ names by changing our sql statement:

import sqlite3

conn = sqlite3.connect('mytest.db')
cursor = conn.cursor()
sql = "select name from students"
results = cursor.execute(sql)
all_students = results.fetchall()
for student in all_students:
   print student

Now, when we run the file, we get only the students’ names, rather than their names, IDs, and usernames:

$ python get_data.py
(u'Hannah',)
(u'Jacob',)

Using Databases in the Real World

After learning about databases, the cook from our restaurant example realizes that she should probably store her inventory in a database rather than in a text or JSON file. She already knows two things she wants to do with regard to her inventory program: search for ingredients, and print a list of ingredients where she has none in stock.

She decides to write a basic program to set up her inventory database. She’s going to need to create the database and the table for the inventory and then add some ingredients.

The first function she writes opens the database and returns a connection:

def open_database(db_name):
    conn = sqlite3.connect(db_name)
    return conn

The second function creates the table. She knows she’ll get an error if she tries to create a table that already exists, so she wraps it in a try/except block:

def create_table(cursor):
    sql = '''create table ingredients (
                title text,
                amount float,
                description text)
          '''
    try:
        cursor.execute(sql)
    except:
        pass

Next is the bulk of her program. She needs to enter her ingredients, one by one, and then add them to the database:

def add_ingredients(cursor):
    while True:
        ingredient = raw_input("Name of ingredient (q to quit): ")
        if ingredient.lower() != 'q':
            num = raw_input("Number in storage: ")
            description = raw_input("description: ")
            sql = '''insert into ingredients
                (title, amount, description)
                values
                (:title, :amount, :description)'''
            cursor.execute(sql,
                {"title":ingredient, "amount":num, "description":description})
            print "Added!"
        else:
            print "Okay, quitting."
            break

Finally, she writes the main function for her program, pulling all her functions together:

def main():
    conn = open_database('inventory.db')
    cursor = conn.cursor()
    create_table(cursor)
    add_ingredients(cursor)
    conn.commit()
    conn.close()

Once she’s done, her finished program looks like this:

import sqlite3

def open_database(db_name):
    conn = sqlite3.connect(db_name)
    return conn

def create_table(cursor):
    sql = '''create table ingredients (
                title text,
                amount float,
                description text)
          '''
    try:
        cursor.execute(sql)
    except:
        pass

def add_ingredients(cursor):
    while True:
        ingredient = raw_input("Name of ingredient (q to quit): ")
        if ingredient.lower() != 'q':
            num = raw_input("Number in storage: ")
            description = raw_input("description: ")
            sql = '''insert into ingredients
                (title, amount, description)
                values
                (:title, :amount, :description)'''
            cursor.execute(sql,
                {"title":ingredient, "amount":num, "description":description})
            print "Added!"
        else:
            print "Okay, quitting."
            break

def main():
    conn = open_database('inventory.db')
    cursor = conn.cursor()
    create_table(cursor)
    add_ingredients(cursor)
    conn.commit()
    conn.close()

if __name__ == '__main__':
    main()

This program doesn’t do quite everything she wants, but it gives her a place to build from. We’ll return to it in the next hour.

Summary

During this hour, you learned what a database is. You also installed (or found) SQLite on your computer. You learned how to make a new database using a Python script, and you learned how to store information in your database. Finally, you learned how to get information out of a database.

Q&A

Q. Who uses SQLite?

A. Lots of people! The fact that it’s quite small means that it can be embedded on systems that aren’t very powerful. Right now, SQLite is most used on smartphones and browsers. iOS and Android both use SQLite to store application data, and Chrome and Firefox use SQLite to store metadata.

Q. How do I decide when to use a database?

A. At first blush, using JSON or text files may seem easier than learning yet another new tool. Databases have a number of advantages, though. You can ask them specific questions without having to load all your data into memory. They’re much faster than working with a pile of JSON or text files. You can also establish relationships between different pieces of data in a way that would be nearly impossible with text files.

Files tend to be good for storing data that you know you’re not going to have to do anything dynamic with, and for when you know you’re going to need to load up the entire thing every time you use it. So, if you have a file with Shakespeare’s The Tempest in it, you’ll probably want to keep that in a file.

Workshop

The Workshop contains quiz questions and exercises to help you solidify your understanding of the material covered. Try to answer all questions before looking at the answers that follow.

Quiz

1. What types of data can SQLite store?

2. What is a blob?

3. What is a cursor?

Answers

1. SQLite can store null values, integers, floats, strings, and blobs.

2. A blob is a binary large object. An example would be a picture, a music file, or a Zip file.

3. A cursor is the object we use to move around our database and make queries.

Exercise

You have the basic inventory program that the cook wrote. Write a function that allows her to search through the database for a particular item. You’ll need to write a select query in order to do this. If you do find the item, you should print out how many of that item you have. If you do not find that item, you should print out a message stating that the item isn’t in the inventory.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.216.47.169