Relational databases

Relational databases have tables that can be written and accessed in many different ways without having the need to reorganize the database structure. They usually use Structured Query Language (SQL). The most widely used databases are Microsoft SQL Server, PostgreSQL, MySQL, and Oracle.

Python has many libraries capable of using any of these databases. We will use PostGresSQL as an example. The PostGresSQL library, Psycopg2, is used by Python to handle any SQL queries:

  1. We will use the GOOG data prices to create the database for GOOG data:
goog_data.head(10)
High Low Open Close Volume Adj Close
Date 2014-01-02 555.263550 550.549194 554.125916 552.963501 3666400.0 552.963501
2014-01-03 554.856201 548.894958 553.897461 548.929749 3355000.0 548.929749
2014-01-06 555.814941 549.645081 552.908875 555.049927 3561600.0 555.049927
2014-01-07 566.162659 556.957520 558.865112 565.750366 5138400.0 565.750366
2014-01-08 569.953003 562.983337 569.297241 566.927673 4514100.0 566.927673
2014-01-09 568.413025 559.143311 568.025513 561.468201 4196000.0 561.468201
2014-01-10 565.859619 557.499023 565.859619 561.438354 4314700.0 561.438354
2014-01-13 569.749329 554.975403 559.595398 557.861633 4869100.0 557.861633
2014-01-14 571.781128 560.400146 565.298279 570.986267 4997400.0 570.986267
2014-01-15 573.768188 568.199402 572.769714 570.598816 3925700.0 570.598816
  1. To create a table in SQL, we will use the following command. You will need to install PostGresSQL on your machine. Then, you will need to insert the following content:
CREATE TABLE "GOOG"
(
dt timestamp without time zone NOT NULL,
high numeric NOT NULL,
low numeric NOT NULL,
open numeric NOT NULL,
close numeric NOT NULL,
volume numeric NOT NULL,
adj_close numeric NOT NULL
CONSTRAINT
"GOOG_pkey" PRIMARY KEY (dt)
);

This command will create a SQL table named GOOG. The primary key of this table will be the timestamp, dt.

  1. As an example, we will run the following query to get the GOOG data from 2016-11-08 to 2016-11-09:
SQL = '''SELECT  
dt,high,low,open,close,volume, adj_close
FROM "GOOG"
WHERE dt BETWEEN '2016-11-08' AND '2016-11-09'
ORDER BY dt
LIMIT 100;'''

The Python code will be the following:

import psycopg2
conn = psycopg2.connect(database='name_of_your_database') # set the appropriate credentials
cursor = conn.cursor()
def query_ticks():
cursor.execute(SQL)
data = cursor.fetchall()
return data

The query_ticks function will return the GOOG data.

The main issue with a relational database is speed. They are not made to work with large amounts of data indexed by time. To speed up, we will need to use non-relational databases.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.141.244.201