How to do it...

The house_prices dataset that we will use here has the house price as its dependent variable. The independent variables are the size, the number of bathrooms, the number of bedrooms, the number of entrances, the size of the balcony, and the size of the entrance. We will look for the transformations of the dependent and independent variables that yield the best-fitting model.

  1. First, we load the library and the data. We then split the data into x (the independent variables) and y (the dependent variable), as shown in the following example:
library("acepack")
data = read.csv("./house_prices.csv")
x = data[,2:7]   # columns 2 to 7: the six independent variables
y = data[,1]     # column 1: the house price (dependent variable)
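
Positional indexing such as data[,2:7] depends on the column order of the CSV file. As a minimal sketch, the split can also be done by column name. The column names below are assumed from the model formula used in this recipe, and the toy data frame is invented so that the snippet runs without house_prices.csv:

```r
# Toy stand-in for house_prices.csv; all values are invented for illustration.
set.seed(1)
toy <- data.frame(
  Property_price   = runif(10, 100, 500),
  size             = runif(10, 40, 200),
  number.bathrooms = sample(1:3, 10, replace = TRUE),
  number.bedrooms  = sample(1:5, 10, replace = TRUE),
  number.entrances = sample(1:2, 10, replace = TRUE),
  size_balcony     = runif(10, 0, 20),
  size_entrance    = runif(10, 2, 10)
)

response   <- "Property_price"
predictors <- setdiff(names(toy), response)  # every column except the response

x_toy <- toy[, predictors]  # independent variables, selected by name
y_toy <- toy[[response]]    # dependent variable as a numeric vector

print(predictors)
```

Selecting by name keeps the split correct even if the columns of the file are reordered.
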
  2. Next, we fit an ordinary least squares model, because we want a baseline R² to compare against. We get an R² of 0.86, as shown in the following example:
lm_model = lm(Property_price ~ size + number.bathrooms + number.bedrooms + number.entrances + size_balcony + size_entrance, data = data)
summary(lm_model)

The preceding command displays the following output: 
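
The R² can also be read directly from the summary object instead of the printed output, which is convenient when comparing it with other models. The following is a minimal sketch that uses the built-in mtcars dataset as a stand-in for the house price data (the variables mpg, wt, and hp are not part of this recipe):

```r
# mtcars ships with R and stands in for the house price data here.
fit <- lm(mpg ~ wt + hp, data = mtcars)
fit_summary <- summary(fit)

r2     <- fit_summary$r.squared      # plain R-squared
adj_r2 <- fit_summary$adj.r.squared  # adjusted for the number of predictors

# For a linear model with an intercept, R-squared equals the squared
# correlation between the fitted and the observed values.
r2_check <- cor(fitted(fit), mtcars$mpg)^2

print(c(r2 = r2, adj_r2 = adj_r2, r2_check = r2_check))
```
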

  3. We now call the ace function from the acepack package. It needs two arguments: the independent variables and the dependent one. The R² increases slightly (by almost 3%) with respect to the linear model, as shown in the following example:
ace_model = ace(x,y)
ace_model$rsq

The following screenshot shows the best R² obtained via acepack:
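
To see the effect of ace on data where a helpful transformation is known in advance, the following sketch uses simulated data rather than the house prices. The response is exponential in the predictor by construction, so a straight-line fit is poor. The names x_sim, y_sim, and ace_sim are hypothetical, and the guard only exists so that the script still runs where acepack is not installed:

```r
set.seed(42)
n <- 200
x_sim <- runif(n, 0, 3)
y_sim <- exp(2 * x_sim + rnorm(n, sd = 0.3))  # nonlinear in x by construction

# Baseline: R-squared of a straight-line fit on the raw variables.
r2_linear <- summary(lm(y_sim ~ x_sim))$r.squared

if (requireNamespace("acepack", quietly = TRUE)) {
  ace_sim <- acepack::ace(x_sim, y_sim)
  r2_ace  <- ace_sim$rsq  # R-squared after ace has transformed both sides
  print(c(linear = r2_linear, ace = r2_ace))
} else {
  message("acepack is not installed; only the linear baseline was computed")
}
```

On data like this, the gap between the two values is expected to be much larger than for the house prices, where the relationships are already close to linear.
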

  4. Let's check what the transformed variables look like (the untransformed variable versus the transformed variable) for two cases. The size of the entrance was transformed, but size_balcony was left almost untransformed:
par(mfrow=c(1,2)) 
plot(ace_model$x[1,],ace_model$tx[,1],xlab="untransformed size_entrance",ylab="transformed size_entrance")
plot(ace_model$x[5,],ace_model$tx[,5],xlab="untransformed size_balcony",ylab="transformed size_balcony")

The preceding commands display the following transformed versus untransformed plots for the entrance and the balcony:

  5. We can now compare how size_entrance relates to the dependent variable, both for the original variable and for the transformed one, in the following way:
plot(ace_model$x[1,], ace_model$ty) 
plot(ace_model$tx[,1], ace_model$ty)

The following screenshot shows the original relationship (on the left) and the transformed relationship (on the right, with both y and x transformed):
