A quick demo of Python's capabilities

To start, let me arouse your interest by showing you some of Python's analytical and graphical capabilities. I explain the code only briefly in this section; you will learn more about Python programming in the rest of this chapter. The following code imports the libraries required for this demonstration:

import numpy as np 
import pandas as pd 
import matplotlib.pyplot as plt

Please note that Python is indentation-sensitive: leading whitespace defines code blocks, so every space at the start of a line counts. Be especially careful with commands that span multiple lines. Just keep the default indent of a new line created by VS; if you add or delete a leading space, the code might not work.
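
To make the point about indentation concrete, here is a minimal sketch. The function describe_cars is hypothetical and is not part of the demo code. The indented lines form the bodies of the function and of the if and else blocks. The second part compiles a deliberately mis-indented snippet to show the kind of error Python raises:

```python
def describe_cars(number_cars_owned):
    # Indented lines belong to the function body
    if number_cars_owned == 0:
        # A deeper indent: this line belongs to the if block
        label = "no car"
    else:
        label = "car owner"
    return label

print(describe_cars(0))
print(describe_cars(2))

# A snippet with one unexpected leading space on its second line
bad_code = "x = 1\n y = 2\n"
try:
    compile(bad_code, "<demo>", "exec")
    indent_error = None
except IndentationError as err:
    indent_error = type(err).__name__
print(indent_error)
```

The single extra space before y is enough for Python to reject the snippet before running any of it.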

Then we need some data. I am using the same data from the AdventureWorksDW2014 demo database and the dbo.vTargetMail view as in the R chapters, Chapter 13, Supporting R in SQL Server, and Chapter 14, Data Exploration and Predictive Modeling with R, of this book. The following code reads this data from the CSV file:

TM = pd.read_csv(r"C:\SQL2017DevGuide\Chapter15_TM.csv")

Now I can do a quick cross-tabulation of the NumberCarsOwned variable by the TotalChildren variable with the following code:

obb = pd.crosstab(TM.NumberCarsOwned, TM.TotalChildren) 
obb 

And here are the first results, a pivot table of the previously mentioned variables:

    TotalChildren       0     1     2     3     4    5
    NumberCarsOwned                                   
    0                 990  1668   602   419   449  110
    1                1747  1523   967   290   286   70
    2                1752   162  1876  1047  1064  556
    3                 384   130   182   157   339  453
    4                 292   136   152   281   165  235
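
If you do not have the CSV file at hand, you can try the same kind of cross-tabulation on a small data frame. The following sketch uses six made-up rows, not the AdventureWorksDW2014 data; only the two column names are taken from the demo, and the names sample and small_obb are hypothetical:

```python
import pandas as pd

# Hypothetical sample rows; the values are invented for illustration
sample = pd.DataFrame({
    "NumberCarsOwned": [0, 0, 1, 1, 2, 2],
    "TotalChildren":   [0, 1, 0, 0, 2, 2],
})

# Rows are NumberCarsOwned values, columns are TotalChildren values,
# and each cell counts the rows with that combination
small_obb = pd.crosstab(sample.NumberCarsOwned, sample.TotalChildren)
print(small_obb)
```

Combinations that never occur in the data, such as no cars and two children here, appear in the result with a count of zero.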
  

Now, let me show you the results of the pivot table in a graph. I need just the following two lines:

obb.plot(kind = 'bar') 
plt.show() 

You can see the graph in the following figure:

NumberCarsOwned and TotalChildren pivot table

It is quite simple to create even more complex graphs. The following code shows the distribution of the Age variable in histograms and with a kernel density plot:

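# density = True normalizes the bars to a total area of 1
# (older Matplotlib versions used normed = True instead)
(TM['Age'] - 20).hist(bins = 25, density = True,
                      color = 'lightblue')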
(TM['Age'] - 20).plot(kind='kde', style='r--', xlim = [0, 80]) 
plt.show() 

You can see the results in the following figure. Note that in the code, I subtracted 20 from the actual age, to get a slightly younger population than exists in the demo database:

Distribution of age
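
Normalizing the histogram rescales the bars so that their total area is 1, which puts the histogram on the same scale as the kernel density curve. Here is a minimal sketch of that rescaling with NumPy alone. The ages are randomly generated stand-ins, which is an assumption for illustration; they are not the demo data:

```python
import numpy as np

# Made-up ages standing in for TM['Age'] - 20; not the demo data
rng = np.random.default_rng(0)
ages = rng.integers(low=10, high=70, size=1000)

# density=True returns bar heights scaled so the total bar area is 1
heights, edges = np.histogram(ages, bins=25, density=True)
total_area = float(np.sum(heights * np.diff(edges)))
print(round(total_area, 6))
```

Without the normalization, the bar heights would be raw counts, and the density curve would look nearly flat next to them.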

I hope that you are interested in learning Python after this brief introduction of its capabilities.
