Compared with other classification algorithms, the model learned by a rule-based classifier is a set of IF-THEN rules. The rule set can be transformed from a decision tree or generated directly by the following algorithm. An IF-THEN rule has the following format:
IF condition_holds_true THEN make_a_conclusion
An alternative format is as follows:
For a given instance or record in the source dataset, if the rule antecedent holds true, the rule is said to cover the instance, and the rule is satisfied.
Given a rule R, its coverage and accuracy are defined as follows:
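These definitions are commonly written as follows (standard notation, assumed here: $n_{\mathrm{covers}}$ is the number of tuples covered by $R$, $n_{\mathrm{correct}}$ the number of covered tuples that are correctly classified, and $|D|$ the number of tuples in the dataset $D$):

```latex
\mathrm{coverage}(R) = \frac{n_{\mathrm{covers}}}{|D|},
\qquad
\mathrm{accuracy}(R) = \frac{n_{\mathrm{correct}}}{n_{\mathrm{covers}}}
```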
It is convenient to transform a decision tree into a set of decision rules for further processing. For every path from the root to a leaf in the decision tree, one rule can be written: the left-hand side of the rule, the rule antecedent, is built by combining the labels of the nodes and the labels of the arcs along the path, and the rule consequent is given by the leaf node. One example of extracting classification rules from a decision tree is illustrated in the following diagram:
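This path-to-rule transformation can be sketched in a few lines of R. The tree encoding and the example tree below are illustrative assumptions, not the chapter's diagram:

```r
# Illustrative sketch: an internal node is list(attr, branches); each
# branch name is an arc label; a leaf is a class label string. The
# tree below is a hypothetical example, not the chapter's diagram.
tree <- list(attr = "outlook", branches = list(
  sunny    = list(attr = "humidity", branches = list(
    high   = "no",
    normal = "yes")),
  overcast = "yes",
  rainy    = "no"))

# Depth-first walk: one IF-THEN rule per root-to-leaf path.
TreeToRules <- function(node, conds = character(0)) {
  if (is.character(node)) {  # leaf node: emit the finished rule
    return(sprintf("IF %s THEN class = %s",
                   paste(conds, collapse = " AND "), node))
  }
  unlist(lapply(names(node$branches), function(arc) {
    TreeToRules(node$branches[[arc]],
                c(conds, sprintf("%s = %s", node$attr, arc)))
  }))
}

tree.rules <- TreeToRules(tree)
cat(tree.rules, sep = "\n")
```

Each recursive call extends the antecedent with one `node = arc` condition, so the emitted rules correspond exactly to the root-to-leaf paths.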
One important question is the pruning of the resulting rule set.
Rules are learned sequentially, one at a time. Here is the pseudocode of the algorithm to build a rule-based classifier. The LearnOneRule function is designed with a greedy strategy. Its target is to cover as many as possible of the positive instances in the source dataset and, at the same time, none or as few as possible of the negative instances. All the instances in the source dataset that belong to a specific class are defined as positive, and those that belong to other classes are considered negative. An initial rule r is generated and is refined repeatedly until the stopping condition is met.
The pseudocode of the generic sequential covering algorithm is as follows. The input parameters include the dataset with class-labeled tuples and the attribute set with all of the attributes' possible values. The output is a set of IF-THEN rules:
Repeated Incremental Pruning to Produce Error Reduction (RIPPER) is a direct rule-based classifier, in which the rule set is relatively convenient to interpret, and it is among the most practical choices for class-imbalance problems.
To grow a rule, the algorithm starts from an empty rule and adds conjuncts that maximize or improve an information gain measure, FOIL's gain. Growing stops once the rule no longer covers any negative examples. The resulting rule is pruned immediately with incremental reduced-error pruning: any final sequence of conditions is removed if doing so maximizes the pruning measure v, which is calculated as follows:
The sequential covering algorithm is used to build the rule set; the new description length (DL) is computed each time a new rule is added to the rule set. The rule set is then optimized.
Here, p is the number of positive examples covered by this rule and n is the number of negative examples covered by this rule. P denotes the total number of positive examples of this class, and N the total number of negative examples.
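Using these four quantities, the pruning metric is commonly written as follows (the exact form used in this chapter's figure is an assumption on my part: RIPPER's metric uses only p and n, while the metric of its predecessor IREP uses all four):

```latex
v_{\mathrm{RIPPER}} = \frac{p - n}{p + n},
\qquad
v_{\mathrm{IREP}} = \frac{p + (N - n)}{P + N}
```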
The pseudocode of the RIPPER algorithm is as follows:
The R code for the rule-based classification is listed as follows:
SequentialCovering <- function(data, x, classes) {
  rule.set <- NULL

  classes.size <- GetCount(classes)
  idx <- 0
  while (idx < classes.size) {
    idx <- idx + 1
    one.class <- GetAt(classes, idx)
    repeat {
      one.rule <- LearnOneRule(data, x, one.class)
      data <- FilterData(data, one.rule)
      rule.set <- AddRule(rule.set, one.rule)
      if (CheckTermination(data, x, classes, rule.set)) {
        break
      }
    }
  }
  return(rule.set)
}
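To make the skeleton concrete, here is a small, self-contained sketch in which the helper functions are given toy implementations and a rule is a named list of attribute = value conjuncts. The dataset, the helper bodies, and the hard-coded class column `play` are illustrative assumptions, not the chapter's actual implementation:

```r
# A minimal, runnable sketch of sequential covering on a toy dataset.
toy <- data.frame(
  outlook  = c("sunny", "sunny", "overcast", "rainy", "rainy", "overcast"),
  humidity = c("high", "normal", "high", "high", "normal", "normal"),
  play     = c("no", "yes", "yes", "no", "no", "yes"),
  stringsAsFactors = FALSE)

# A rule is a named list of attribute = value conjuncts.
Covers <- function(rule, data) {
  ok <- rep(TRUE, nrow(data))
  for (a in names(rule)) ok <- ok & data[[a]] == rule[[a]]
  ok
}

# Greedily grow one rule for `class`, adding the conjunct that
# maximizes accuracy on the remaining data (mirrors LearnOneRule).
LearnOneRule <- function(data, attrs, class) {
  rule <- list()
  repeat {
    best <- NULL
    best.acc <- -1
    for (a in setdiff(attrs, names(rule))) {
      for (v in unique(data[[a]])) {
        cand <- c(rule, setNames(list(v), a))
        m <- Covers(cand, data)
        if (any(m)) {
          acc <- mean(data$play[m] == class)
          if (acc > best.acc) { best <- cand; best.acc <- acc }
        }
      }
    }
    if (is.null(best)) break
    rule <- best
    if (best.acc == 1) break  # the rule covers no negative instances
  }
  rule
}

SequentialCovering <- function(data, attrs, classes) {
  rule.set <- list()
  for (cls in classes) {
    while (any(data$play == cls)) {  # positive instances remain
      r <- LearnOneRule(data, attrs, cls)
      rule.set[[length(rule.set) + 1]] <- list(antecedent = r, consequent = cls)
      data <- data[!Covers(r, data), , drop = FALSE]  # drop covered rows
    }
  }
  rule.set
}

rules <- SequentialCovering(toy, c("outlook", "humidity"), c("yes", "no"))
```

Each pass grows one rule greedily, removes the tuples it covers, and repeats until no positive instances of the current class remain, which is exactly the covering loop of the skeleton.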
In the following section, an example is chosen to which the rule-based classification algorithm is applied.
In the context of computer games, improving the game experience is a continual task as a game evolves. The classification of player types is one major task, which in turn enables further improvements, including in game design.
One popular player model based on a typology of temperament is the DGD player typology, which is illustrated in the following diagram. Given this model, game players can be labeled with appropriate types, player behavior in a game can be explained, the design of new games is supported, and so forth.
Based on the player behaviors or models, we can train a decision tree model on the dataset and then extract the rule set from the trained decision tree model. The dataset comes from the game log and some predefined domain knowledge.
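As a sketch of that pipeline, the following uses the rpart package (one possible choice of tree learner); the stand-in game-log data, the labeling rule, and the player-type names are hypothetical assumptions used only for illustration:

```r
# Sketch: train a decision tree on stand-in "game log" data, then list
# the root-to-leaf paths that each become one classification rule.
# Requires the rpart package; the data and labels here are synthetic.
if (requireNamespace("rpart", quietly = TRUE)) {
  set.seed(1)
  log.data <- data.frame(
    hours.played = runif(200, 0, 50),
    quests.done  = rpois(200, 10),
    pvp.matches  = rpois(200, 5))
  # Hypothetical labeling standing in for predefined domain knowledge.
  log.data$type <- factor(
    ifelse(log.data$pvp.matches > 6, "conqueror",
    ifelse(log.data$quests.done > 12, "manager", "wanderer")))

  fit <- rpart::rpart(type ~ ., data = log.data, method = "class")

  # Every leaf yields one rule; path.rpart returns the split
  # conditions along the path from the root to each leaf.
  leaves <- as.numeric(rownames(fit$frame))[fit$frame$var == "<leaf>"]
  paths  <- rpart::path.rpart(fit, nodes = leaves, print.it = FALSE)
}
```

Each element of `paths` is the antecedent of one rule, and the majority class at the corresponding leaf is its consequent.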