Scatterplots

Until now we have examined how categorical membership (nominal attributes) relates to frequencies or means. It is also useful to look at relationships between numerical attributes. We will rely on scatterplots for this purpose. This will require a little scripting again, as we will examine the relationships between proportions. Let me first introduce the function proportions(), which will generate the proportions for us, for all of our nominal attributes. This function takes one argument, n, the number of roulette draws (100 by default), and calls our attributes() function to build the data frame DF. We could instead have written it to take as an argument a data frame holding the numbers we have previously drawn and their attributes.

The body of the function computes and returns the transpose of the means of each nominal attribute:

1    proportions = function(n = 100) {
2       DF=attributes(n)
3       return(data.frame(t(colMeans(DF[3:ncol(DF)]))))
4    }

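The body of this function calls our attributes() function and passes the number of roulette draws to it (line 2). It then returns a one-row data frame which contains the transposed means of all the columns, except columns 1 and 2 (which are not of interest here). Because the mean of a TRUE/FALSE (or 1/0) column is the share of TRUE values, these means are the proportions we are after.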
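To see why column means yield proportions, here is a minimal self-contained sketch. The data frame toy is a hypothetical stand-in for the output of our attributes() function; its column names (number, color, isRed, isOdd) and the assumption that the attributes are stored as TRUE/FALSE values are mine, chosen for illustration only.

```r
# Hypothetical stand-in for the output of attributes(): two descriptive
# columns followed by logical (TRUE/FALSE) nominal attributes.
toy <- data.frame(
  number = c(1, 2, 3, 12),
  color  = c("red", "black", "red", "red"),
  isRed  = c(TRUE, FALSE, TRUE, TRUE),
  isOdd  = c(TRUE, FALSE, TRUE, FALSE)
)

# colMeans() counts TRUE as 1 and FALSE as 0, so the mean of a logical
# column is the proportion of TRUE values in that column.
props <- colMeans(toy[3:ncol(toy)])

# t() turns the named vector into a one-row matrix; data.frame() then
# gives a single row with one column per attribute.
props.df <- data.frame(t(props))
print(props.df)
```

Three of the four toy numbers are red and two are odd, so the single row holds 0.75 for isRed and 0.5 for isOdd.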

Our next function, multisample(), returns a data frame containing the proportions of each attribute for each of k samples (one sample per row) of n numbers drawn. By default, it draws 100 samples of 100 numbers. After starting the function declaration on line 1, we set the seed to the provided value, or to the default value of 3 (line 2). We then create a one-row data frame containing the values returned by a first call to the proportions() function (line 3). In the following loop, we iteratively append the rows returned by the k-1 remaining calls to proportions() (lines 4 to 7). The range of the loop is written seq_len(k-1) because R reads 1:k-1 as (1:k)-1, that is, 0 to k-1, which would run the loop k times and return k+1 rows. Finally, we return the resulting data frame (line 8), and close the function code block (line 9).

1    multisample = function(n=100,k=100, Seed=3){
2       set.seed(Seed)
3       ColMeans.df=proportions(n)
4       for (i in seq_len(k-1)){
5          ColMeans.df=rbind(ColMeans.df,
6             proportions(n))
7       }
8       return(ColMeans.df)
9    }
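As a side note, the same result can be obtained without growing the data frame inside a loop. The following is only a sketch under stated assumptions: toy_proportions() and toy_multisample() are hypothetical names, toy_proportions() is a simplified stand-in for proportions() (it draws from the numbers 0 to 36 and computes just two attributes, isOdd and isHigh), and neither is the function defined above.

```r
# Hypothetical stand-in for proportions(): draws n numbers from 0 to 36
# and returns a one-row data frame with the proportion of odd numbers
# and the proportion of numbers above 18.
toy_proportions <- function(n = 100) {
  draws <- sample(0:36, n, replace = TRUE)
  data.frame(isOdd = mean(draws %% 2 == 1), isHigh = mean(draws > 18))
}

# Collect k one-row data frames in a list, then bind them in one step.
toy_multisample <- function(n = 100, k = 100, Seed = 3) {
  set.seed(Seed)
  rows <- lapply(seq_len(k), function(i) toy_proportions(n))
  do.call(rbind, rows)
}

samples.toy <- toy_multisample(n = 50, k = 10)
dim(samples.toy)   # 10 rows (one per sample) and 2 columns
```

Binding once at the end makes the number of rows easy to reason about: seq_len(k) has exactly k elements, so the result has exactly k rows.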

We are now able to examine the relationship between proportions of numbers using scatterplots. Scatterplots display each observation on a plane by plotting the values of two attributes. On line 1, we first create a data frame of proportions using the default arguments of the multisample() function. This will not take too long to compute. Line 2 makes sure the plotting area holds a single graph. Having a look at the roulette grid, one can see that 10 out of the 18 red numbers are odd. Will we be able to spot this relationship from the random drawings? We will investigate this visually. We plot the proportions of odd numbers against the proportions of red numbers using a scatterplot (lines 3 to 6). The main argument sets the title of the graph (line 4). The xlab argument sets the title of the horizontal axis (line 5). The ylab argument sets the title of the vertical axis (line 6). We also add a line showing the direction of the relationship using the abline() function on line 7.

Here, abline() takes a fitted linear model as its argument and draws the line defined by its coefficients. The model predicts the attribute on the vertical axis (isRed) from the attribute on the horizontal axis (isOdd). I will discuss the lm() function, which provides such coefficients, in the chapter about regression:

1    samples = multisample()
2    par(mfrow=c(1,1))
3    plot(samples$isOdd,samples$isRed,
4       main = "Relationship between attributes Red and Odd",
5       xlab = "Proportion of Odd numbers",
6       ylab = "Proportion of Red numbers")
7    abline(lm(samples$isRed~samples$isOdd))

The output is provided below:

A scatterplot showing the relationship between the proportion of Red and Odd numbers

The graph depicts the values of our two attributes (the proportion of odd numbers, samples$isOdd, on the x axis and the proportion of red numbers, samples$isRed, on the y axis) for each of our samples. The line represents the linear relationship between these two proportions: the higher the proportion of odd numbers, the higher the proportion of red numbers. Again, this can be expected from the betting grid, where more than half of the red numbers are odd.
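The visual impression can be complemented with a few numbers. The sketch below is not part of the functions above: it assumes a single-zero wheel (numbers 0 to 36, each equally likely), uses the standard set of red numbers, and builds its own hypothetical data frame sim instead of calling multisample(). It counts the odd red numbers, derives the correlation we should expect between the two proportions, and compares it with the correlation and slope obtained from simulated samples.

```r
# Red numbers on a standard roulette grid.
red <- c(1, 3, 5, 7, 9, 12, 14, 16, 18, 19, 21, 23, 25, 27, 30, 32, 34, 36)

# How many red numbers are odd?
sum(red %% 2 == 1)   # 10 of the 18 red numbers

# Expected correlation between "red" and "odd", assuming 37 equally
# likely numbers (0 to 36): covariance divided by the product of the
# standard deviations of the two indicators.
p_red  <- 18 / 37
p_odd  <- 18 / 37
p_both <- 10 / 37
rho <- (p_both - p_red * p_odd) /
  sqrt(p_red * (1 - p_red) * p_odd * (1 - p_odd))
rho   # about 0.13, a weak positive relationship

# Hypothetical stand-in for multisample(): k samples of n draws each,
# one row of proportions per sample.
set.seed(3)
k <- 100
n <- 100
sim <- as.data.frame(t(replicate(k, {
  draws <- sample(0:36, n, replace = TRUE)
  c(isOdd = mean(draws %% 2 == 1), isRed = mean(draws %in% red))
})))

# Observed correlation across the simulated samples.
r <- cor(sim$isOdd, sim$isRed)

# Intercept and slope of the line, with isRed on the vertical axis.
fit <- lm(isRed ~ isOdd, data = sim)
coef(fit)
```

Since the expected correlation is weak, the value of r obtained from only 100 samples can differ noticeably from rho; increasing k brings the two closer together.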
