Neural networks for prediction

Using neural networks for prediction requires the dependent (target/output) variable to be numeric, while the input (independent/feature) variables may be of any type. From the ArtPiece dataset, we are going to predict the current auction average price based on all the available parameters. Before applying a neural-network-based model, it is important to preprocess the data by excluding missing values and applying any required transformations; hence, let's preprocess the data:

library(neuralnet)

art <- read.csv("ArtPiece_1.csv")
str(art)

#data conversion for categorical features
art$Art.Auction.House <- as.factor(art$Art.Auction.House)
art$IsGood.Purchase <- as.factor(art$IsGood.Purchase)
art$Art.Category <- as.factor(art$Art.Category)
art$Prominent.Color <- as.factor(art$Prominent.Color)
art$Brush <- as.factor(art$Brush)
art$Brush.Size <- as.factor(art$Brush.Size)
art$Brush.Finesse <- as.factor(art$Brush.Finesse)
art$Art.Nationality <- as.factor(art$Art.Nationality)
art$Top.3.artists <- as.factor(art$Top.3.artists)
art$GoodArt.check <- as.factor(art$GoodArt.check)
art$AuctionHouseGuarantee <- as.factor(art$AuctionHouseGuarantee)
art$Is.It.Online.Sale <- as.factor(art$Is.It.Online.Sale)

#data conversion for numeric features
art$Critic.Ratings <- as.numeric(art$Critic.Ratings)
art$Acq.Cost <- as.numeric(art$Acq.Cost)
art$CurrentAuctionAveragePrice <- as.numeric(art$CurrentAuctionAveragePrice)
art$CollectorsAverageprice <- as.numeric(art$CollectorsAverageprice)
art$Min.Guarantee.Cost <- as.numeric(art$Min.Guarantee.Cost)

#removing NA/missing values from the data; note that apply() coerces the
#data frame to a character matrix, so the columns are re-converted to
#numeric after the train/test split below
fun1 <- function(x){
  ifelse(x == "#VALUE!", NA, x)
}
art <- as.data.frame(apply(art, 2, fun1))
art <- na.omit(art)

#keeping only relevant variables for prediction
art <- art[, c("Art.Auction.House", "IsGood.Purchase", "Art.Category",
               "Prominent.Color", "Brush", "Brush.Size", "Brush.Finesse",
               "Art.Nationality", "Top.3.artists", "GoodArt.check",
               "AuctionHouseGuarantee", "Is.It.Online.Sale", "Critic.Ratings",
               "Acq.Cost", "CurrentAuctionAveragePrice", "CollectorsAverageprice",
               "Min.Guarantee.Cost")]

#creating dummy variables for the categorical variables
library(dummy)
art_dummy <- dummy(art[, c("Art.Auction.House", "IsGood.Purchase", "Art.Category",
                           "Prominent.Color", "Brush", "Brush.Size", "Brush.Finesse",
                           "Art.Nationality", "Top.3.artists", "GoodArt.check",
                           "AuctionHouseGuarantee", "Is.It.Online.Sale")], int = F)
art_num <- art[, c("Critic.Ratings", "Acq.Cost", "CurrentAuctionAveragePrice",
                   "CollectorsAverageprice", "Min.Guarantee.Cost")]
art <- cbind(art_num, art_dummy)

## 70% of the sample size
smp_size <- floor(0.70 * nrow(art))

## set the seed to make the partition reproducible
set.seed(123)
train_ind <- sample(seq_len(nrow(art)), size = smp_size)
train <- art[train_ind, ]
test <- art[-train_ind, ]

#convert every column (including the dummy-coded factors) back to numeric
fun2 <- function(x){
  as.numeric(x)
}
train <- as.data.frame(apply(train, 2, fun2))
test <- as.data.frame(apply(test, 2, fun2))

In the training dataset, there are 50,867 observations and 17 variables, and in the test dataset, there are 21,801 observations and 17 variables. The current auction average price is the dependent variable for prediction; for this first model we use only the four other numeric variables as features:

> fit <- neuralnet(formula = CurrentAuctionAveragePrice ~ Critic.Ratings + Acq.Cost + CollectorsAverageprice + Min.Guarantee.Cost, data = train, hidden = 15, err.fct = "sse", linear.output = F)
> fit
Call: neuralnet(formula = CurrentAuctionAveragePrice ~ Critic.Ratings + Acq.Cost + CollectorsAverageprice + Min.Guarantee.Cost, data = train, hidden = 15, err.fct = "sse", linear.output = F)

1 repetition was calculated.

           Error  Reached Threshold  Steps
1 54179625353167     0.004727494957     23

A summary of the main results of the model is provided by result.matrix. A snapshot of the result.matrix is given as follows:

> fit$result.matrix
                                                       1
error                         54179625353167.000000000000
reached.threshold                          0.004727494957
steps                                     23.000000000000
Intercept.to.1layhid1                     -0.100084491816
Critic.Ratings.to.1layhid1                 0.686332945444
Acq.Cost.to.1layhid1                       0.196864454378
CollectorsAverageprice.to.1layhid1        -0.793174429352
Min.Guarantee.Cost.to.1layhid1             0.528046199494
Intercept.to.1layhid2                      0.973616842194
Critic.Ratings.to.1layhid2                 0.839826678316
Acq.Cost.to.1layhid2                       0.077798897157
CollectorsAverageprice.to.1layhid2         0.988149246218
Min.Guarantee.Cost.to.1layhid2            -0.385031389636
Intercept.to.1layhid3                     -0.008367359937
Critic.Ratings.to.1layhid3                -1.409715725621
Acq.Cost.to.1layhid3                      -0.384200569485
CollectorsAverageprice.to.1layhid3        -1.019243809714
Min.Guarantee.Cost.to.1layhid3             0.699876747202
Intercept.to.1layhid4                      2.085203047278
Critic.Ratings.to.1layhid4                 0.406934874266
Acq.Cost.to.1layhid4                       1.121189503896
CollectorsAverageprice.to.1layhid4         1.405748076570
Min.Guarantee.Cost.to.1layhid4            -1.043884892202
Intercept.to.1layhid5                      0.862634752109
Critic.Ratings.to.1layhid5                 0.814364667751
Acq.Cost.to.1layhid5                       0.502879862694

If the error function is equal to the negative log-likelihood function, the error refers to the likelihood, as it is used to calculate the Akaike Information Criterion (AIC). We can store the covariate data alongside the overall model error in a matrix:

> output <- cbind(fit$covariate, fit$result.matrix[[1]])
> head(output)
      [,1]  [,2]  [,3]  [,4]           [,5]
[1,] 14953 49000 10727  5775 54179625353167
[2,] 35735 38850  9494 12418 54179625353167
[3,] 34751 43750  8738  9611 54179625353167
[4,] 31599 41615  5955  4158 54179625353167
[5,] 10437 34755  8390  4697 54179625353167
[6,] 13177 54670 13024 11921 54179625353167

To compare the results of a neural network model, we can vary different tuning factors, such as the algorithm, the hidden-layer configuration, and the learning rate. As an example, only four numeric features were used to generate the prediction above; we could instead have used all 91 features to predict the current auction average price variable. We can also use a different algorithm from the nnet library, as follows:

> library(nnet)
> fit <- nnet(CurrentAuctionAveragePrice ~ Critic.Ratings + Acq.Cost +
+             CollectorsAverageprice + Min.Guarantee.Cost, data = train,
+             size = 100)
# weights: 601
initial value 108359809492660.125000
final value 108359250706334.000000
converged
> fit
a 4-100-1 network with 601 weights
inputs: Critic.Ratings Acq.Cost CollectorsAverageprice Min.Guarantee.Cost
output(s): CurrentAuctionAveragePrice
options were -
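A practical note on using all the available features: older versions of neuralnet do not expand the y ~ . shorthand, so the formula is usually built explicitly from the column names. The following is a minimal sketch; the toy data frame stands in for the dummy-coded train set, and its column names are illustrative:

```r
# build a formula over every column except the target; the toy data frame
# below is a stand-in for the dummy-coded `train` set from this recipe
toy <- data.frame(CurrentAuctionAveragePrice = rnorm(10),
                  Critic.Ratings = rnorm(10),
                  Acq.Cost = rnorm(10))
predictors <- setdiff(names(toy), "CurrentAuctionAveragePrice")
f <- as.formula(paste("CurrentAuctionAveragePrice ~",
                      paste(predictors, collapse = " + ")))
f
```

The resulting formula object can then be passed as the formula argument of neuralnet() or nnet() in place of the hand-written one.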

Both libraries provide comparable results here; there is no practical difference in the model output. To tune the results further, however, it is important to look at the model tuning parameters, such as the learning rate, the number of hidden neurons, and so on. The following graph shows the neural network architecture:

(Figure: architecture of the fitted neural network)
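As a sketch of such tuning, the loop below refits a neuralnet model with different hidden-layer architectures and compares the resulting training errors. The toy data, column names, and architectures are illustrative assumptions, not taken from the ArtPiece dataset (note that neuralnet's learningrate argument applies only when algorithm = "backprop"; the default rprop+ algorithm ignores it):

```r
library(neuralnet)

# toy regression data standing in for the art dataset
set.seed(123)
toy <- data.frame(x1 = runif(300), x2 = runif(300))
toy$y <- toy$x1^2 + toy$x2 + rnorm(300, sd = 0.05)

# try a single small layer, a wider layer, and two stacked layers
errors <- sapply(list(3, 5, c(4, 2)), function(h) {
  fit <- neuralnet(y ~ x1 + x2, data = toy, hidden = h,
                   linear.output = TRUE)
  fit$result.matrix["error", 1]  # training SSE for this architecture
})
errors
```

Comparing training errors alone favors larger networks; in practice the comparison should be made on held-out data, as in the train/test split used earlier in this recipe.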

Predictions for unseen data points can be generated using the compute function available in the neuralnet library and the predict function available in the nnet library.
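The following is a minimal, self-contained sketch of scoring unseen rows with both functions; the toy data frames stand in for the art train/test sets, and the variable names are illustrative. Note that compute() expects only the covariate columns, in the order used at training time, while predict() accepts the full data frame:

```r
library(neuralnet)
library(nnet)

# toy regression data standing in for the art train/test sets
set.seed(123)
toy <- data.frame(x1 = runif(200), x2 = runif(200))
toy$y <- 2 * toy$x1 + 3 * toy$x2 + rnorm(200, sd = 0.05)
train_toy <- toy[1:150, ]
test_toy  <- toy[151:200, ]

# neuralnet: compute() scores only the covariate columns
nn_fit  <- neuralnet(y ~ x1 + x2, data = train_toy, hidden = 3,
                     linear.output = TRUE)
nn_pred <- compute(nn_fit, test_toy[, c("x1", "x2")])$net.result

# nnet: predict() takes the full data frame; linout = TRUE for regression
nnet_fit  <- nnet(y ~ x1 + x2, data = train_toy, size = 3,
                  linout = TRUE, trace = FALSE)
nnet_pred <- predict(nnet_fit, newdata = test_toy)

# compare the two models on held-out root-mean-squared error
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))
rmse(test_toy$y, nn_pred)
rmse(test_toy$y, nnet_pred)
```

The same pattern applies to the art models fitted above: pass test with only the four feature columns to compute(), and the full test data frame to predict().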
