Drawing scatter plots with colored markers

If you have two variables and want to see whether they are correlated, a scatter plot is a good way to spot patterns.

This type of plot is also a useful starting point for more advanced visualizations of multidimensional data (for example, a scatter plot matrix).

Getting ready

Scatter plots display values for two sets of data. The data is drawn as a collection of points that are not connected by lines. The coordinates of each point are determined by the values of the two variables. One variable is controlled (the independent variable), while the other is measured (the dependent variable) and is often plotted on the y axis.

How to do it...

Here's a code sample that draws two scatter plots: one with uncorrelated data and the other with a strong positive correlation:

import matplotlib.pyplot as plt
import numpy as np

# generate x values
x = np.random.randn(1000)

# random measurements, no correlation
y1 = np.random.randn(len(x))

# strong correlation
y2 = 1.2 + np.exp(x)

ax1 = plt.subplot(121)
plt.scatter(x, y1, color='indigo', alpha=0.3, edgecolors='white', label='no correl')
plt.xlabel('no correlation')
plt.grid(True)
plt.legend()

ax2 = plt.subplot(122, sharey=ax1, sharex=ax1)
plt.scatter(x, y2, color='green', alpha=0.3, edgecolors='grey', label='correl')
plt.xlabel('strong correlation')
plt.grid(True)
plt.legend()

plt.show()

Here, we also pass several optional parameters to scatter(): color sets the color of the markers, alpha sets their transparency, edgecolors sets the color of the marker edge, and label provides the text for the legend box. The marker parameter, which is not set in this example, selects the point marker (the default is a circle).
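As a minimal sketch of the marker parameter mentioned above, the following draws triangles instead of the default circles. The data, the seed, the marker size s, and the label text are illustrative assumptions and are not part of the recipe:

```python
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data only (assumption): 200 uncorrelated points from a seeded generator
rng = np.random.default_rng(42)
x = rng.standard_normal(200)
y = rng.standard_normal(200)

fig, ax = plt.subplots()

# marker='^' draws upward-pointing triangles instead of circles;
# s sets the marker area in points squared
points = ax.scatter(x, y, marker='^', s=40, color='indigo',
                    alpha=0.3, edgecolors='white', label='triangles')
ax.grid(True)
ax.legend()

# Call plt.show() here to display the figure interactively
```

Other marker codes, such as 's' for squares or 'x' for crosses, can be passed in the same way.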

These are the plots we get:

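[Figure: two side-by-side scatter plots, the left showing the uncorrelated data and the right showing the strongly correlated data]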

How it works...

A scatter plot is often used to identify a potential association between two variables, and it is often drawn before fitting a regression function. It gives a good visual picture of the relationship, particularly when that relationship is nonlinear. matplotlib provides the scatter() function to draw a scatter plot of x versus y, where x and y are one-dimensional arrays of the same length.
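To back up the visual impression with a number, one option is to compute the Pearson correlation coefficient and a straight-line fit with NumPy. This is only a sketch that regenerates data of the same shape as in the recipe; the seeded generator and the choice of a degree-1 fit are assumptions made for illustration:

```python
import numpy as np

# Same kind of data as in the recipe, but seeded so the run is reproducible (assumption)
rng = np.random.default_rng(0)
x = rng.standard_normal(1000)
y1 = rng.standard_normal(len(x))  # no correlation
y2 = 1.2 + np.exp(x)              # strong, nonlinear dependence on x

# np.corrcoef returns a 2x2 correlation matrix; the off-diagonal
# entry is the Pearson coefficient between the two inputs
r1 = np.corrcoef(x, y1)[0, 1]
r2 = np.corrcoef(x, y2)[0, 1]
print(f"r(x, y1) = {r1:.2f}")
print(f"r(x, y2) = {r2:.2f}")

# Least-squares straight line through (x, y2); polyfit returns
# the coefficients from the highest degree down
slope, intercept = np.polyfit(x, y2, 1)
print(f"linear fit: y2 ~ {slope:.2f} * x + {intercept:.2f}")
```

Because y2 is an exponential rather than a linear function of x, the Pearson coefficient stays below 1 even though y2 is fully determined by x. The scatter plot shows this curvature directly, which the single number does not.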
