Dataset is a Python library, which is basically a wrapper around SQLAlchemy. It claims to be so easy to use that even lazy people like it.
Install dataset
as follows:
$ sudo pip install dataset $ pip freeze|grep dataset dataset==0.5.4
Create a SQLite in-memory database and connect to it:
import dataset db = dataset.connect('sqlite:///:memory:')
Create a table called books
:
table = db["books"]
Actually, the table in the database isn't created yet, since we haven't specified any columns. We only created a related object. The table schema is created automatically from calls to the insert()
method. Give the insert()
method dictionaries with book titles:
table.insert(dict(title="NumPy Beginner's Guide", author='Ivan Idris')) table.insert(dict(title="NumPy Cookbook", author='Ivan Idris')) table.insert(dict(title="Learning NumPy", author='Ivan Idris'))
These are all excellent books, of course! The read_sql()
pandas function can query this table too:
print read_sql('SELECT * FROM books', db.executable.raw_connection())
The following is the output:
id author title 0 1 Ivan Idris NumPy Beginner's Guide 1 2 Ivan Idris NumPy Cookbook 2 3 Ivan Idris Learning NumPy [3 rows x 3 columns]
Load the sunspots data and show the first five rows as follows:
write_frame(df, "sunspots", db.executable.raw_connection()) table = db['sunspots'] for row in table.find(_limit=5): print row
The following will be printed:
OrderedDict([(u'YEAR', 1700.0), (u'SUNACTIVITY', 5.0)]) OrderedDict([(u'YEAR', 1701.0), (u'SUNACTIVITY', 11.0)]) OrderedDict([(u'YEAR', 1702.0), (u'SUNACTIVITY', 16.0)]) OrderedDict([(u'YEAR', 1703.0), (u'SUNACTIVITY', 23.0)]) OrderedDict([(u'YEAR', 1704.0), (u'SUNACTIVITY', 36.0)])
We can easily show the tables in the database with the following line:
print "Tables", db.tables
The following is the output of the preceding code:
Tables [u'books', 'sunspots']
The following is the content of the dataset_demo.py
file in this book's code bundle:
import dataset from pandas.io.sql import read_sql from pandas.io.sql import write_frame import statsmodels.api as sm db = dataset.connect('sqlite:///:memory:') table = db["books"] table.insert(dict(title="NumPy Beginner's Guide", author='Ivan Idris')) table.insert(dict(title="NumPy Cookbook", author='Ivan Idris')) table.insert(dict(title="Learning NumPy", author='Ivan Idris')) print read_sql('SELECT * FROM books', db.executable.raw_connection()) data_loader = sm.datasets.sunspots.load_pandas() df = data_loader.data write_frame(df, "sunspots", db.executable.raw_connection()) table = db['sunspots'] for row in table.find(_limit=5): print row print "Tables", db.tables
3.15.229.111