Importing required libraries

The first step is to import all the required libraries. This can be done as follows:

# Import the packages
# Standard library
import math
import os
from datetime import datetime  # pandas.datetime was deprecated in pandas 1.0 and later removed
# Data handling
import numpy as np
import pandas as pd
from pandas import DataFrame, read_csv
# Plotting
import matplotlib.pyplot as plt
from matplotlib import pyplot, rcParams, style
from matplotlib.colors import ListedColormap
# Modeling and statistics
import sklearn
from sklearn import datasets, linear_model, neighbors, preprocessing
from sklearn.cluster import KMeans
from sklearn.metrics import mean_squared_error
from scipy.stats import linregress
# Plot styling
style.use('ggplot')
rcParams['figure.figsize'] = 15, 6

Now let's import the Bitcoin data into our notebook. Download the Bitcoin data and place it in the same directory as the notebook. You can find the current working directory using the following command:

os.getcwd()

This returns the directory in which the notebook looks for files that are given by name only. The output should look something like this:

Figure 12.1: Output of the getcwd command

Once you have the current path, you can place the CSV file in that directory. The Bitcoin data is available on Kaggle, or you can take it from the GitHub repository at CH12/Bitcoin/bitcoin.csv. Then read the file into a DataFrame:

df=pd.read_csv("bitcoin.csv", index_col=None)
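If read_csv raises a FileNotFoundError, the file is most likely not where the notebook expects it. The following is a minimal sketch of a check, assuming the file is named bitcoin.csv as above; the variable names csv_name and csv_path are only illustrative:

```python
import os

# File name used in this chapter.
csv_name = "bitcoin.csv"

# Full path that a bare file name resolves to from the notebook.
csv_path = os.path.join(os.getcwd(), csv_name)

# Report whether the file is in place.
if os.path.isfile(csv_path):
    print("Found:", csv_path)
else:
    print("Not found, expected at:", csv_path)
```

If the file is reported as missing, move it to the printed location or pass the full path to read_csv instead.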

Once the CSV file is imported correctly, let's look at the first 10 entries to see how our data looks:

df.head(10)

The output looks like this:

Figure 12.2: First 10 entries in the Bitcoin data

Figure 12.2 shows the first 10 entries from the CSV file. There are eight columns, excluding the index. Each of them has a specific meaning:

  • Timestamp: Time of the entry in Unix time (seconds since January 1, 1970, UTC)
  • Open: Bitcoin price in currency units at the opening of the time period
  • High: Highest Bitcoin price in currency units during the time period
  • Low: Lowest Bitcoin price in currency units during the time period
  • Close: Bitcoin price in currency units at the closing of the time period
  • Volume_(BTC): Volume of BTC transacted in the time period
  • Volume_(Currency): Volume of currency transacted in the time period
  • Weighted_Price: Volume-weighted average price (VWAP) over the time period
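Because Timestamp is stored as Unix time, the raw values are hard to read. As a minimal sketch, assuming the values are in seconds (the usual Unix convention), pd.to_datetime can turn them into readable dates. The two sample rows and the Datetime column name are made up for illustration; only the Timestamp column name comes from the data:

```python
import pandas as pd

# Made-up sample rows; only the 'Timestamp' column name comes from the dataset.
sample = pd.DataFrame({"Timestamp": [1325376000, 1325376060]})

# Interpret the integers as seconds since the Unix epoch (assumption: seconds, UTC).
sample["Datetime"] = pd.to_datetime(sample["Timestamp"], unit="s")

print(sample)
```

The same call could be applied to df['Timestamp'] once the CSV file is loaded.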

Now let's store the price and timestamp columns in their own variables and combine them into a new DataFrame:

price = df['Weighted_Price']
time = df['Timestamp']
df2 = DataFrame({'price': price, 'time': time})

To get summary statistics for a column, use the describe() function. For a numeric column it reports the count, mean, standard deviation, minimum, quartiles, and maximum:

#descriptive statistics
df2.price.describe()

The output should look like this:

Figure 12.3: Describing price to get statistics

In the same way, we can get the statistics for all the numeric columns. 
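As a rough sketch of this, calling describe() on a whole DataFrame summarizes every numeric column at once and skips non-numeric ones by default. The values below are made up for illustration, and the Label column is a hypothetical non-numeric column that is not part of the Bitcoin data:

```python
import pandas as pd

# Made-up values; 'Open' and 'Close' mirror column names from the Bitcoin data,
# while 'Label' is a hypothetical text column added to show it is skipped.
sample = pd.DataFrame({
    "Open": [4.0, 4.5, 5.0],
    "Close": [4.2, 4.8, 5.1],
    "Label": ["a", "b", "c"],
})

# describe() on a DataFrame returns one column of statistics per numeric column.
stats = sample.describe()

print(stats)
```

On the real data, df.describe() would produce the same kind of table with one column per numeric field.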
