Dataset – databases for lazy people

Dataset is a Python library, which is basically a wrapper around SQLAlchemy. It claims to be so easy to use that even lazy people like it.

Install dataset as follows:

$ sudo pip install dataset
$ pip freeze|grep dataset
dataset==0.5.4

Create a SQLite in-memory database and connect to it:

import dataset
db = dataset.connect('sqlite:///:memory:')

Create a table called books:

table = db["books"]

Actually, the table in the database isn't created yet, since we haven't specified any columns. We only created a related object. The table schema is created automatically from calls to the insert() method. Give the insert() method dictionaries with book titles:

table.insert(dict(title="NumPy Beginner's Guide", author='Ivan Idris'))
table.insert(dict(title="NumPy Cookbook", author='Ivan Idris'))
table.insert(dict(title="Learning NumPy", author='Ivan Idris'))

These are all excellent books, of course! The read_sql() pandas function can query this table too:

print read_sql('SELECT * FROM books', db.executable.raw_connection())

The following is the output:

   id      author                   title
0   1  Ivan Idris  NumPy Beginner's Guide
1   2  Ivan Idris          NumPy Cookbook
2   3  Ivan Idris          Learning NumPy

[3 rows x 3 columns]

Load the sunspots data and show the first five rows as follows:

write_frame(df, "sunspots", db.executable.raw_connection())
table = db['sunspots']

for row in table.find(_limit=5):
   print row

The following will be printed:

OrderedDict([(u'YEAR', 1700.0), (u'SUNACTIVITY', 5.0)])
OrderedDict([(u'YEAR', 1701.0), (u'SUNACTIVITY', 11.0)])
OrderedDict([(u'YEAR', 1702.0), (u'SUNACTIVITY', 16.0)])
OrderedDict([(u'YEAR', 1703.0), (u'SUNACTIVITY', 23.0)])
OrderedDict([(u'YEAR', 1704.0), (u'SUNACTIVITY', 36.0)])

We can easily show the tables in the database with the following line:

print "Tables", db.tables

The following is the output of the preceding code:

Tables [u'books', 'sunspots']

The following is the content of the dataset_demo.py file in this book's code bundle:

import dataset
from pandas.io.sql import read_sql
from pandas.io.sql import write_frame
import statsmodels.api as sm

db = dataset.connect('sqlite:///:memory:')
table = db["books"]
table.insert(dict(title="NumPy Beginner's Guide", author='Ivan Idris'))
table.insert(dict(title="NumPy Cookbook", author='Ivan Idris'))
table.insert(dict(title="Learning NumPy", author='Ivan Idris'))
print read_sql('SELECT * FROM books', db.executable.raw_connection()) 

data_loader = sm.datasets.sunspots.load_pandas()
df = data_loader.data
write_frame(df, "sunspots", db.executable.raw_connection()) 
table = db['sunspots']

for row in table.find(_limit=5):
   print row

print "Tables", db.tables
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.15.229.111