Writing the R code for Apriori

The Apriori algorithm, as explained earlier, finds relationships or patterns inherent in a dataset. To run it, we will use the arules package in R/RStudio. The code reads the downloaded dataset (called cms2016_2.csv in the example) and runs the Apriori algorithm to find association rules.
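As a quick reminder of what the rule metrics mean, the following base R sketch computes support, confidence, and lift for a single rule by hand. The transactions and item names are hypothetical and are not taken from the CMS file; the support() helper is defined here only for illustration and is not part of arules.

```r
# Toy transactions (hypothetical data, used only for illustration)
transactions <- list(
  c("bread", "milk"),
  c("bread", "butter"),
  c("bread", "milk", "butter"),
  c("milk"),
  c("bread", "milk")
)

# Fraction of transactions that contain every item in `items`
support <- function(items) {
  mean(vapply(transactions, function(tr) all(items %in% tr), logical(1)))
}

# Evaluate the rule {bread} => {milk}
lhs <- "bread"
rhs <- "milk"

rule_support    <- support(c(lhs, rhs))            # share of transactions with both items
rule_confidence <- rule_support / support(lhs)     # share of lhs transactions that also contain rhs
rule_lift       <- rule_confidence / support(rhs)  # confidence relative to the baseline rate of rhs

print(c(support = rule_support, confidence = rule_confidence, lift = rule_lift))
```

The supp and conf arguments passed to apriori() are minimum thresholds on the first two of these quantities, so only rules that meet both are returned.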

Create a new R file in RStudio and enter the following code. Make sure that you change the path to the CSV file so that it points to the directory where you stored the download:

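library(data.table)
library(arules)

# Read the CMS payments file; change the path to where you saved the data
cms <- fread("~/cms2016_2.csv")

# Convert the text columns to upper case so that values compare consistently
cols <- c("category", "city", "company", "firstName", "lastName", "paymentNature", "product")
cms[, (cols) := lapply(.SD, toupper), .SDcols = cols]

cms[, payment := as.numeric(payment)]

# Bin each payment into its quartile (0-25, 25-50, 50-75, 75-100)
quantile_values <- quantile(cms$payment, seq(0, 1, .25))
interval_values <- findInterval(cms$payment, quantile_values, rightmost.closed = TRUE)

cms[, quantileVal := factor(interval_values, labels = c("0-25", "25-50", "50-75", "75-100"))]

# apriori() expects categorical data, so convert the rule columns to factors
rules_cols <- c("category", "city", "company", "paymentNature", "product", "quantileVal")
cms[, (rules_cols) := lapply(.SD, factor), .SDcols = rules_cols]

cms_factor <- cms[, .(category, city, company, paymentNature, product, quantileVal)]

# Item labels for the payment quartiles (defined here but not passed to apriori() below)
rhsVal <- paste0("quantileVal", "=", c("0-25", "25-50", "50-75", "75-100"))

# Mine rules with at least 3 items, support >= 0.001 and confidence >= 0.25
cms_rules <- apriori(cms_factor, parameter = list(supp = 0.001, conf = 0.25, target = "rules", minlen = 3))

# Convert the rules to a data.table, split each rule into LHS and RHS, and round the metrics
cms_rules_dt <- data.table(as(cms_rules, "data.frame"))
cms_rules_dt[, c("LHS", "RHS") := tstrsplit(rules, "=>", fixed = TRUE)]
num_cols <- c("support", "confidence", "lift")
cms_rules_dt[, (num_cols) := lapply(.SD, function(x) { round(x, 2) }), .SDcols = num_cols]

# Save the rules and the input data to disk
saveRDS(cms_rules_dt, "cms_rules_dt.rds")
saveRDS(cms_factor, "cms_factor_dt.rds")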
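
The quartile binning step in the script can be tried in isolation. The sketch below applies the same quantile() and findInterval() calls to a small vector of hypothetical payment amounts (not values from the CMS file). Passing levels = 1:4 explicitly is an addition that is not in the script above.

```r
# Hypothetical payment amounts, used only to illustrate the binning step
payment <- c(5, 12, 20, 35, 50, 80, 120, 400)

# Quartile boundaries: minimum, 25th, 50th and 75th percentiles, maximum
breaks <- quantile(payment, seq(0, 1, 0.25))

# findInterval() returns the index of the interval that each value falls into.
# rightmost.closed = TRUE keeps the maximum value in bin 4 instead of bin 5.
bin <- findInterval(payment, breaks, rightmost.closed = TRUE)

# Map the bin indices 1..4 to the quartile labels used in the script
quartile <- factor(bin, levels = 1:4, labels = c("0-25", "25-50", "50-75", "75-100"))

print(data.frame(payment, quartile))
```

Specifying the levels keeps all four labels available even if one bin turns out to be empty, which can happen when many payments share the same value and two quantile boundaries coincide.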