How it works...

Steps 1 and 2 of the How to do it… section import the modules we'll use later and read the data from the CSV file. The data is transformed into a list to allow us to iterate through it several times, as that's necessary in step 3.

Step 3 prepares the data as two arrays and then plots them with .scatter. Like other matplotlib plotting methods, .scatter takes one array of X values and one array of Y values, and both must be the same size. The values are read from the file as strings, so they are converted to float to make sure they are treated as numbers.
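As a rough illustration of the data preparation described above, the following sketch reads two-column rows and builds the two arrays. It is not the recipe's exact code: the in-memory sample data, the column meaning (minutes, dollars), and the names load_points, data_x, and data_y are assumptions made for this example.

```python
import csv
import io

# Hypothetical stand-in for the CSV file: minutes on site, dollars spent.
SAMPLE_CSV = "5.5,3.0\n8.0,9.5\n25.0,60.0\n"


def load_points(fp):
    """Read two-column rows and return two parallel lists of floats."""
    # list() stores the rows so they can be traversed twice below;
    # a bare csv.reader is exhausted after a single pass.
    data = list(csv.reader(fp))
    # csv.reader yields strings, so each value is converted to float.
    data_x = [float(x) for x, y in data]
    data_y = [float(y) for x, y in data]
    return data_x, data_y


data_x, data_y = load_points(io.StringIO(SAMPLE_CSV))
print(data_x)
print(data_y)
```

The two lists have the same length by construction, so they could be passed directly as the X and Y arguments of .scatter.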

Step 4 refines how the data is presented on each axis. The same operation is performed twice: a function is created that defines how the values on that axis should be displayed (in dollars or in minutes). The function accepts as input the value to display and its position. Typically, the position is ignored. The axis formatter is then replaced with .set_major_formatter. Notice that each axis is reached through .gca (get current axes).

A label is added to the axes with .xlabel and .ylabel.
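The formatting and labeling described above could look roughly like the sketch below. The function names, the exact label text, the sample points, and the output formats are assumptions for illustration. The non-interactive Agg backend is selected only so that the example runs without opening a window.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display needed for this sketch
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter


def format_minutes(value, pos):
    # pos is the tick position; it is ignored here.
    return "{}m".format(int(value))


def format_dollars(value, pos):
    return "${:.0f}".format(value)


# Hypothetical sample points (minutes on site, dollars spent).
plt.scatter([5.0, 8.0, 25.0], [3.0, 9.5, 60.0])

# .gca returns the current axes; each axis gets its own formatter.
axes = plt.gca()
axes.xaxis.set_major_formatter(FuncFormatter(format_minutes))
axes.yaxis.set_major_formatter(FuncFormatter(format_dollars))

plt.xlabel("Time on website")
plt.ylabel("Spending")
```

Wrapping each function in FuncFormatter keeps the display logic in plain Python functions, which can be checked on their own without drawing anything.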

Finally, step 5 displays the graph in a new window. Analyzing the result, we can say that there seem to be two kinds of users: those who spend less than 10 minutes and never spend more than $10, and those who spend more time and also have a higher chance of spending up to $100.

Note that the data presented is synthetic, and it has been generated with the result in mind. Real-life data will probably look more spread out.