Saving data to a database

For this example, we are going to work with an in-memory database, which will make things simpler for us. In the source code of the book, I have left a couple of comments to show you how to generate a SQLite file, so I hope you'll explore that option as well.

You can find a free database browser for SQLite at sqlitebrowser.org. If you are not satisfied with it, you will be able to find a wide range of tools, some free, some not free, that you can use to access and manipulate a database file.

Before we dive into the code, allow me to briefly introduce the concept of a relational database.

A relational database is a database that allows you to save data following the relational model, invented in 1969 by Edgar F. Codd. In this model, data is stored in one or more tables. Each table has rows (also known as records, or tuples), each of which represents an entry in the table. Tables also have columns (also known as attributes), each of which represents an attribute of the records. Each record is identified through a unique key, more commonly known as the primary key, which is the union of one or more columns in the table. To give you an example: imagine a table called Users, with columns id, username, password, name, and surname. Such a table would be perfect to contain users of our system. Each row would represent a different user. For example, a row with the values 3, gianchub, my_wonderful_pwd, Fabrizio, and Romano, would represent my user in the system.

The reason why the model is called relational is because you can establish relations between tables. For example, if you added a table called PhoneNumbers to our fictitious database, you could insert phone numbers into it, and then, through a relation, establish which phone number belongs to which user.

In order to query a relational database, we need a special language. The main standard is called SQL, which stands for Structured Query Language. It is born out of something called relational algebra, which is a very nice family of algebras used to model data stored according to the relational model, and performing queries on it. The most common operations you can perform usually involve filtering on the rows or columns, joining tables, aggregating the results according to some criteria, and so on. To give you an example in English, a query on our imaginary database could be: Fetch all users (username, name, surname) whose username starts with "m", who have at most one phone number. In this query, we are asking for a subset of the columns in the User table. We are filtering on users by taking only those whose username starts with the letter m, and even further, only those who have at most one phone number.

Back in the days when I was a student in Padova, I spent a whole semester learning both the relational algebra semantics, and the standard SQL (amongst other things). If it wasn't for a major bicycle accident I had the day of the exam, I would say that this was one of the most fun exams I ever had to prepare.

Now, each database comes with its own flavor of SQL. They all respect the standard to some extent, but none fully does, and they are all different from one another in some respects. This poses an issue in modern software development. If our application contains SQL code, it is quite likely that if we decided to use a different database engine, or maybe a different version of the same engine, we would find our SQL code needs amending.

This can be quite painful, especially since SQL queries can become very, very complicated quite quickly. In order to alleviate this pain a little, computer scientists (bless them) have created code that maps objects of a particular language to tables of a relational database. Unsurprisingly, the name of such tools is Object-Relational Mapping (ORMs).

In modern application development, you would normally start interacting with a database by using an ORM, and should you find yourself in a situation where you can't perform a query you need to perform, through the ORM, you would then resort to using SQL directly. This is a good compromise between having no SQL at all, and using no ORM, which ultimately means specializing the code that interacts with the database, with the aforementioned disadvantages.

In this section, I'd like to show an example that leverages SQLAlchemy, the most popular Python ORM. We are going to define two models (Person and Address) which map to a table each, and then we're going to populate the database and perform a few queries on it.

Let's start with the model declarations:

# persistence/alchemy_models.py
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import (
    Column, Integer, String, ForeignKey, create_engine)
from sqlalchemy.orm import relationship

At the beginning, we import some functions and types. The first thing we need to do then is to create an engine. This engine tells SQLAlchemy about the type of database we have chosen for our example:

# persistence/alchemy_models.py
engine = create_engine('sqlite:///:memory:')
Base = declarative_base()

class Person(Base):
    __tablename__ = 'person'

    id = Column(Integer, primary_key=True)
    name = Column(String)
    age = Column(Integer)

    addresses = relationship(
        'Address',
        back_populates='person',
        order_by='Address.email',
        cascade='all, delete-orphan'
    )

    def __repr__(self):
        return f'{self.name}(id={self.id})'

class Address(Base):
    __tablename__ = 'address'

    id = Column(Integer, primary_key=True)
    email = Column(String)
    person_id = Column(ForeignKey('person.id'))
    person = relationship('Person', back_populates='addresses')

    def __str__(self):
        return self.email
    __repr__ = __str__

Base.metadata.create_all(engine)

Each model then inherits from the Base table, which in this example consists of the mere default, returned by declarative_base(). We define Person, which maps to a table called person, and exposes the attributes id, name, and age. We also declare a relationship with the Address model, by stating that accessing the addresses attribute will fetch all the entries in the address table that are related to the particular Person instance we're dealing with. The cascade option affects how creation and deletion work, but it is a more advanced concept, so I'd suggest you glide on it for now and maybe investigate more later on.

The last thing we declare is the __repr__ method, which provides us with the official string representation of an object. This is supposed to be a representation that can be used to completely reconstruct the object, but in this example, I simply use it to provide something in output. Python redirects repr(obj) to a call to obj.__repr__().

We also declare the Address model, which will contain email addresses, and a reference to the person they belong to. You can see the person_id and person attributes are both about setting a relation between the Address and Person instances. Note how I declared the __str__ method on Address, and then assigned an alias to it, called __repr__. This means that calling both repr and str on Address objects will ultimately result in calling the __str__ method. This is quite a common technique in Python, so I took the opportunity to show it to you here.

On the last line, we tell the engine to create tables in the database according to our models.

A deeper understanding of this code would require much more space than I can afford, so I encourage you to read up on database management systems (DBMS), SQL, Relational Algebra, and SQLAlchemy.

Now that we have our models, let's use them to persist some data!

Let's take a look at the following example:

# persistence/alchemy.py
from alchemy_models import Person, Address, engine
from sqlalchemy.orm import sessionmaker

Session = sessionmaker(bind=engine)
session = Session()

First we create session, which is the object we use to manage the database. Next, we proceed by creating two people:

anakin = Person(name='Anakin Skywalker', age=32)
obi1 = Person(name='Obi-Wan Kenobi', age=40)

We then add email addresses to both of them, using two different techniques. One assigns them to a list, and the other one simply appends them:

obi1.addresses = [
    Address(email='[email protected]'),
    Address(email='[email protected]'),
]

anakin.addresses.append(Address(email='[email protected]'))
anakin.addresses.append(Address(email='[email protected]'))
anakin.addresses.append(Address(email='[email protected]'))

We haven't touched the database yet. It's only when we use the session object that something actually happens in it:

session.add(anakin)
session.add(obi1)
session.commit()

Adding the two Person instances is enough to also add their addresses (this is thanks to the cascading effect). Calling commit is what actually tells SQLAlchemy to commit the transaction and save the data in the database. A transaction is an operation that provides something like a sandbox, but in a database context. As long as the transaction hasn't been committed, we can roll back any modification we have done to the database, and by so doing, revert to the state we were before starting the transaction itself. SQLAlchemy offers more complex and granular ways to deal with transactions, which you can study in its official documentation, as it is quite an advanced topic. We now query for all the people whose name starts with Obi by using like, which hooks to the LIKE operator in SQL:

obi1 = session.query(Person).filter(
    Person.name.like('Obi%')
).first()
print(obi1, obi1.addresses)

We take the first result of that query (we know we only have Obi-Wan anyway), and print it. We then fetch anakin, by using an exact match on his name (just to show you a different way of filtering):

anakin = session.query(Person).filter(
    Person.name=='Anakin Skywalker'
).first()
print(anakin, anakin.addresses)

We then capture Anakin's ID, and delete the anakin object from the global frame:

anakin_id = anakin.id
del anakin

The reason we do this is because I want to show you how to fetch an object by its ID. Before we do that, we write the display_info function, which we will use to display the full content of the database (fetched starting from the addresses, in order to demonstrate how to fetch objects by using a relation attribute in SQLAlchemy):

def display_info():
    # get all addresses first
    addresses = session.query(Address).all()

    # display results
    for address in addresses:
        print(f'{address.person.name} <{address.email}>')

    # display how many objects we have in total
    print('people: {}, addresses: {}'.format(
        session.query(Person).count(),
        session.query(Address).count())
    )

The display_info function prints all the addresses, along with the respective person's name, and, at the end, produces a final piece of information regarding the number of objects in the database. We call the function, then we fetch and delete anakin (think about Darth Vader and you won't be sad about deleting him), and then we display the info again, to verify he's actually disappeared from the database:

display_info()

anakin = session.query(Person).get(anakin_id)
session.delete(anakin)
session.commit()

display_info()

The output of all these snippets run together is the following (for your convenience, I have separated the output into four blocks, to reflect the four blocks of code that actually produce that output):

$ python alchemy.py
Obi-Wan Kenobi(id=2) [[email protected], [email protected]]

Anakin Skywalker(id=1) [[email protected], [email protected], [email protected]]

Anakin Skywalker <[email protected]>
Anakin Skywalker <[email protected]>
Anakin Skywalker <[email protected]>
Obi-Wan Kenobi <[email protected]>
Obi-Wan Kenobi <[email protected]>
people: 2, addresses: 5

Obi-Wan Kenobi <[email protected]>
Obi-Wan Kenobi <[email protected]>
people: 1, addresses: 2

As you can see from the last two blocks, deleting anakin has deleted one Person object, and the three addresses associated with it. Again, this is due to the fact that cascading took place when we deleted anakin.

This concludes our brief introduction to data persistence. It is a vast and, at times, complex domain, which I encourage you to explore learning as much theory as possible. Lack of knowledge or proper understanding, when it comes to database systems, can really bite.

Table of Contents for Saving data to a database

Create new playlist

Sign In

Sign Up

Table of Contents for
Saving data to a database