Pandas dataframes were originally created to operate on time series data, and luckily for us, because differencing a dataset is such a common operation in time series, it's conveniently built in as the diff() method. As a matter of good coding practice, however, we will wrap a function around our first-order differencing operation. Note that the first row has no previous row to subtract, so differencing leaves it empty (NaN); we will fill any such gaps with 0. The following code illustrates this technique:
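def diff_data(df):
    # First-order difference: each row minus the row before it
    df_diffed = df.diff()
    # The first row has no predecessor, so diff() leaves NaN there; fill with 0
    df_diffed.fillna(0, inplace=True)
    return df_diffed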
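As a quick check of what this function returns, here is a minimal sketch on a made-up frame. The column name close, the timestamps, and the prices are assumptions chosen for illustration; they are not values from the bitcoin dataset.

```python
import pandas as pd


def diff_data(df):
    # Same function as in the text, repeated so this snippet runs on its own
    df_diffed = df.diff()
    df_diffed.fillna(0, inplace=True)
    return df_diffed


# Hypothetical minute-level prices (illustrative values only)
prices = pd.DataFrame(
    {"close": [100.0, 102.5, 101.0, 105.0]},
    index=pd.date_range("2018-01-01 00:00", periods=4, freq="min"),
)

flows = diff_data(prices)
print(flows["close"].tolist())  # [0.0, 2.5, -1.5, 4.0]
```

The first value is 0 only because of the fill; every other value is the change from the previous minute. The original frame is left untouched, since diff() returns a new dataframe.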
By differencing the dataset, we've turned a stock problem (predicting the price level) into a flow problem (predicting the change in price from one step to the next). In the case of bitcoin, the flow can be quite large because the value of a bitcoin can change a great deal from one minute to the next. We will address this by scaling the dataset.