We can give pandas a database connection such as the one in the previous example or a SQLAlchemy connection. We will cover the latter in the later sections of this chapter. We will load the statsmodels sunactivity data, just like in the previous chapter, Chapter 7, Signal Processing and Time Series:
DataFrame
:rows = [tuple(x) for x in df.values]
Contrary to the previous example, create a table without specifying data types:
con.execute("CREATE TABLE sunspots(year, sunactivity)")
executemany()
method executes multiple statements; in this case, we will be inserting records from a list of tuples. Insert all the rows into the table and show the row count as follows:con.executemany("INSERT INTO sunspots(year, sunactivity) VALUES (?, ?)", rows) c.execute("SELECT COUNT(*) FROM sunspots") print c.fetchone()
The number of rows in the table is printed as follows:
(309,)
rowcount
attribute of the result of an execute()
call gives the number of affected rows. This attribute is somewhat quirky and depends on your SQLite version. A SQL query, as shown in the previous code snippet, on the other hand is unambiguous. Delete the records where the number of events is more than 20:print "Deleted", con.execute("DELETE FROM sunspots where sunactivity > 20").rowcount, "rows"
The following should be printed:
Deleted 217 rows
DataFrame
with the read_sql()
function. Select the records until 1732 as follows:print read_sql("SELECT * FROM sunspots where year < 1732", con)
The end result is the following pandas DataFrame
:
year sunactivity 0 1700 5 1 1701 11 2 1702 16 3 1707 20 4 1708 10 5 1709 8 6 1710 3 7 1711 0 8 1712 0 9 1713 2 10 1714 11 11 1723 11 [12 rows x 2 columns]
Refer to the panda_access.py
file in this book's code bundle for the following code:
import statsmodels.api as sm from pandas.io.sql import read_sql import sqlite3 with sqlite3.connect(":memory:") as con: c = con.cursor() data_loader = sm.datasets.sunspots.load_pandas() df = data_loader.data rows = [tuple(x) for x in df.values] con.execute("CREATE TABLE sunspots(year, sunactivity)") con.executemany("INSERT INTO sunspots(year, sunactivity) VALUES (?, ?)", rows) c.execute("SELECT COUNT(*) FROM sunspots") print c.fetchone() print "Deleted", con.execute("DELETE FROM sunspots where sunactivity > 20").rowcount, "rows" print read_sql("SELECT * FROM sunspots where year < 1732", con) con.execute("DROP TABLE sunspots") c.close()
3.144.30.62