We now need to create a data frame that includes the criterion attribute (quality),
the length of the reviews, and the term-document matrix:
DF = as.data.frame(cbind(quality, lengths, SparseRemoved))
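To see what this call produces, here is a minimal sketch on toy data. The values below are hypothetical stand-ins, not the review data; the sketch assumes that quality and lengths are numeric vectors with one entry per review and that SparseRemoved is an ordinary matrix with one row per review and one named column per term.

```r
# Toy stand-ins (hypothetical values) for the three objects used in the text
quality = c(1, 0, 1, 0)            # criterion attribute, one value per review
lengths = c(120, 85, 240, 60)      # review lengths
SparseRemoved = matrix(c(1, 0, 2, 0,
                         0, 1, 0, 3),
                       nrow = 4,
                       dimnames = list(NULL, c("good", "bad")))  # term counts

# cbind() joins the vectors and the matrix column-wise;
# as.data.frame() then turns each column into a variable
DF = as.data.frame(cbind(quality, lengths, SparseRemoved))
str(DF)
```

The two vectors keep their own names as column names, and each term column of the matrix becomes a separate predictor in the data frame.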
Let's now create our training and testing datasets:
set.seed(123)
train = sample(1:2000, 1000)
TrainDF = DF[train,]
TestDF = DF[-train,]
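A quick sanity check of the split can be sketched as follows. The DF built here is a hypothetical placeholder with 2,000 rows (the size implied by sample(1:2000, 1000)) and made-up column values; with the real DF, only the last three lines would be needed.

```r
# Hypothetical placeholder for DF: 2000 rows, values invented for illustration
DF = data.frame(quality = rep(c(0, 1), each = 1000), lengths = 1:2000)

set.seed(123)                      # makes the random split reproducible
train = sample(1:2000, 1000)       # 1000 row indices drawn without replacement
TrainDF = DF[train, ]
TestDF = DF[-train, ]              # negative indexing keeps the remaining rows

# Both sets should have 1000 rows and share no row
nrow(TrainDF)
nrow(TestDF)
length(intersect(rownames(TrainDF), rownames(TestDF)))
```

Because sample() draws without replacement by default, the training indices are unique, so the negative index leaves exactly the other half of the rows for testing.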