where $pro_k = |Par^{(A_D,A_D)}_{(v_k,v_k)}(U,U)| \,/\, |(U,U)|$ denotes the ratio of the number of instances belonging to class $(v_k,v_k)$ in instance set $(U,U)$ to the total number of instances.

We set $\Gamma_{(A_i,A_i)}(U,U) = 1 - Cer_{(A_i,A_i)}(U,U) = 1 - \sum_{k=1}^{n} pro_k \cdot Cer^{k}_{(A_i,A_i)}(U,U)$ as the attribute selection criterion in decision tree classification. The range of $\Gamma_{(A_i,A_i)}(U,U)$ is $[0,1]$.

If $\Gamma_{(A_i,A_i)}(U,U)=0$, the condition attribute $(A_i,A_i)$ has a degree of certainty of 1 for the entire set of instances, indicating that $(A_i,A_i)$ can definitely divide the instance set. If $\Gamma_{(A_i,A_i)}(U,U)=1$, the partition of the instance set by condition attribute $(A_i,A_i)$ is maximally indefinite. Therefore, the smaller the value of $\Gamma_{(A_i,A_i)}(U,U)$, the better the attribute divides the instance set. In the attribute selection algorithm, the condition attribute $(A_i,A_i)$ for which $\Gamma_{(A_i,A_i)}(U,U)$ attains its minimum value is selected as the branch attribute.
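To make the selection criterion concrete, the following is a minimal Python sketch of this measure. The reading of $Cer^{k}$ used below (the fraction of all instances lying in attribute blocks that are pure for class $k$) is one plausible reading, not the book's formal definition; under this reading the sketch reproduces the value $\Gamma_{(a,a)}(U,U)=0.8163$ computed in the PlayTennis example later in this section.

```python
def gamma(attr_blocks, class_blocks):
    """Attribute selection measure: Gamma = 1 - sum_k pro_k * Cer^k.

    attr_blocks:  the division Div_(Ai,Ai)(U,U) induced by a condition attribute
                  (a list of disjoint sets of instance ids)
    class_blocks: the division of (U,U) by the decision attribute
    Cer^k is taken here as the fraction of all instances lying in attribute
    blocks that are pure for class k (an assumed reading, see the text above).
    """
    n = sum(len(block) for block in attr_blocks)               # |(U,U)|
    certain = 0.0
    for cls in class_blocks:
        pro_k = len(cls) / n                                   # pro_k = |class k| / |(U,U)|
        pure = sum(len(b) for b in attr_blocks if b <= cls)    # instances certainly classified as k
        certain += pro_k * (pure / n)                          # pro_k * Cer^k
    return 1.0 - certain
```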

1. Basic idea and description of the algorithm

In the process of constructing the decision tree, the branch attribute $(A_i,A_i)$ should cover as many instances as possible, reducing the number of instances that remain to be classified. This effectively reduces the branching of the tree and achieves the goal of simplification. The attribute attaining $\min(\Gamma_{(A_i,A_i)}(U,U))$ satisfies this condition, and using $pro_k$ takes the contribution of each partition to the whole decision instance set into account more comprehensively. We select the best attribute from the set of available attributes to create a branch; attributes that have already been used for branching cannot be selected again.

The basic description of the algorithm is as follows:

(1) For each condition attribute, calculate the division $Div_{(A_i,A_i)}(U,U) = \{ Par^{(A_i,A_i)}_{(v_k,v_k)}(U,U) \}_{k=1}^{|(V_{A_i},V_{A_i})|}$;

(2) Compute the degree of certainty $Cer^{j}_{(A_i,A_i)}(U,U)$ with which condition attribute $(A_i,A_i)$ classifies instances into class $(v_j,v_j)$, and the ratio $pro_j$ of the number of instances with decision attribute value $(v_j,v_j)$ to the total number of instances;

(3) Compute $\Gamma_{(A_i,A_i)}(U,U)$ and select the attribute satisfying $\min(\Gamma_{(A_i,A_i)}(U,U))$ as the branch attribute, then divide the instance set as $\{ Par^{(A_i,A_i)}_{(v_1,v_1)}(U,U), Par^{(A_i,A_i)}_{(v_2,v_2)}(U,U), \dots, Par^{(A_i,A_i)}_{(v_k,v_k)}(U,U) \}$;

(4) Determine whether the instances contained in each $Par^{(A_i,A_i)}_{(v_k,v_k)}(U,U)$ belong to the same class. If so, the corresponding node is a leaf node; otherwise, return to (1) and repeat the first three steps on that subset. Attributes that have already been used for branching are not selected again.

The pseudocode is described in Algorithm 3.1.

Algorithm 3.1 Procedure buildTree(TreeNode,U)
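The body of Algorithm 3.1 is not reproduced here; the following is a minimal Python sketch of a buildTree procedure that follows steps (1)–(4), reusing the gamma function sketched earlier. The TreeNode layout and the helper names (blocks, split_by) are assumptions for illustration, not the book's pseudocode.

```python
from collections import defaultdict

class TreeNode:
    def __init__(self):
        self.attr, self.label, self.children = None, None, {}

def split_by(instances, attr):
    """Group instances (value-dict, label) by their value of `attr`."""
    groups = defaultdict(list)
    for values, label in instances:
        groups[values[attr]].append((values, label))
    return groups

def blocks(instances, attr=None):
    """Division of the current instance set: by an attribute, or by the decision label."""
    groups = defaultdict(set)
    for i, (values, label) in enumerate(instances):
        groups[label if attr is None else values[attr]].add(i)
    return list(groups.values())

def build_tree(node, instances, attributes):
    labels = {label for _, label in instances}
    # step (4): leaf when all instances share one class (or, as a safeguard, no attributes remain)
    if len(labels) <= 1 or not attributes:
        node.label = labels.pop() if labels else None
        return
    # steps (1)-(3): choose the attribute with minimal Gamma on the current instance set
    best = min(attributes, key=lambda a: gamma(blocks(instances, a), blocks(instances)))
    node.attr = best
    remaining = [a for a in attributes if a != best]        # used attributes are not reselected
    for value, subset in split_by(instances, best).items():
        child = TreeNode()
        node.children[value] = child                        # one branch per attribute value
        build_tree(child, subset, remaining)
```

Calling build_tree(TreeNode(), training_instances, condition_attributes) then grows the tree until each leaf's instances share a single class (or no attributes remain).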

2. Algorithm application examples

We use the data in [29] (see Tab. 3.3) to illustrate the flow of the proposed classification attribute selection algorithm. First, the data in the source data table are expressed using dynamic fuzzy mathematical theory. The DFDT is then generated by classifying the training set according to the steps of the algorithm.

The data in Tab. 3.4 are classified according to the algorithm described in Section 3.2.4.1. For convenience, (a,a) replaces Outlook = Sunny, (b,b) replaces Temperature = Hot, (c,c) replaces Humidity = High, (d,d) replaces Wind = Strong, (e,e) replaces PlayTennis = Yes, and (U,U) denotes the instance set in Tab. 3.4. We refer to instances by their numbers in the instance set to simplify the notation (likewise below). That is, (U,U)={D1,D2,…,D14}. The process is as follows.

First, the instance set (U,U) is divided according to each attribute:

$Div_{(a,a)}(U,U) = \{\{1,2,8,9,11\}, \{3,7,12,13\}, \{4,5,6,10,14\}\}$
$Div_{(b,b)}(U,U) = \{\{1,2,3,13\}, \{4,8,10,11,12,14\}, \{5,6,7,9\}\}$
$Div_{(c,c)}(U,U) = \{\{1,2,3,4,8,12,14\}, \{5,6,7,9,10,11,13\}\}$
$Div_{(d,d)}(U,U) = \{\{2,6,7,11,12,14\}, \{1,3,4,5,8,9,10,13\}\}$
$Div_{(e,e)}(U,U) = \{\{3,4,5,7,9,10,11,12,13\}, \{1,2,6,8,14\}\}$

Tab. 3.3: Training instance set of PlayTennis.

Tab. 3.4: The training instance set PlayTennis after dynamic fuzzification.

The degree of certainty of the classification of each attribute on (U,U) is then calculated:

$\Gamma_{(a,a)}(U,U) = 1 - \sum_{k=1}^{2} pro_k \cdot Cer^{k}_{(a,a)}(U,U) = 1 - \left(\tfrac{4}{14} \times \tfrac{9}{14} + 0\right) = 0.8163.$

Similarly, $\Gamma_{(b,b)}(U,U) = 1$, $\Gamma_{(c,c)}(U,U) = 1$, and $\Gamma_{(d,d)}(U,U) = 1$.
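Under the $Cer^{k}$ reading assumed in the earlier gamma sketch, these root-level values can be reproduced directly from the divisions listed above:

```python
# Divisions of (U,U) listed above; div_e is the decision partition (PlayTennis = Yes / No)
div_a = [{1, 2, 8, 9, 11}, {3, 7, 12, 13}, {4, 5, 6, 10, 14}]
div_b = [{1, 2, 3, 13}, {4, 8, 10, 11, 12, 14}, {5, 6, 7, 9}]
div_c = [{1, 2, 3, 4, 8, 12, 14}, {5, 6, 7, 9, 10, 11, 13}]
div_d = [{2, 6, 7, 11, 12, 14}, {1, 3, 4, 5, 8, 9, 10, 13}]
div_e = [{3, 4, 5, 7, 9, 10, 11, 12, 13}, {1, 2, 6, 8, 14}]

for name, div in [("a", div_a), ("b", div_b), ("c", div_c), ("d", div_d)]:
    print(name, round(gamma(div, div_e), 4))   # -> a 0.8163, b 1.0, c 1.0, d 1.0
```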

We select attribute (a,a) as the first branch point, that is, the root node.

As the instances in $Par^{(a,a)}_{0.4}(U,U) = \{3,7,12,13\}$ belong to the same class, this branch is ended. We now examine the other branches.

When (a,a) = 0.7:

$Div_{(b,b)}(Par^{(a,a)}_{0.7}(U,U)) = \{\{1,2\}, \{8,11\}, \{9\}\}$
$Div_{(c,c)}(Par^{(a,a)}_{0.7}(U,U)) = \{\{1,2,8\}, \{9,11\}\}$
$Div_{(d,d)}(Par^{(a,a)}_{0.7}(U,U)) = \{\{1,8,9\}, \{2,11\}\}$
$Div_{(e,e)}(Par^{(a,a)}_{0.7}(U,U)) = \{\{1,2,8\}, \{9,11\}\}$
$\Gamma_{(b,b)}(Par^{(a,a)}_{0.7}(U,U)) = 0.5$, $\Gamma_{(c,c)}(Par^{(a,a)}_{0.7}(U,U)) = 0$, $\Gamma_{(d,d)}(Par^{(a,a)}_{0.7}(U,U)) = 1$.

Therefore, when (a,a) = 0.7, the branch attribute is (c,c), and the instances in $Par^{(c,c)}_{0.4}(Par^{(a,a)}_{0.7}(U,U))$ and $Par^{(c,c)}_{0.6}(Par^{(a,a)}_{0.7}(U,U))$ each belong to the same class. Thus, this branch is ended.

Similarly, when (a,a) = 0.2, $\Gamma_{(b,b)}(Par^{(a,a)}_{0.2}(U,U)) = 1$, $\Gamma_{(c,c)}(Par^{(a,a)}_{0.2}(U,U)) = 1$, and $\Gamma_{(d,d)}(Par^{(a,a)}_{0.2}(U,U)) = 0$.

Hence, when (a,a) = 0.2, the branch attribute is (d,d), and the instances in $Par^{(d,d)}_{0.8}(Par^{(a,a)}_{0.2}(U,U))$ and $Par^{(d,d)}_{0.1}(Par^{(a,a)}_{0.2}(U,U))$ each belong to the same class. Again, the branch is ended.

After constructing the decision tree in this way, all leaf nodes correspond to instance subsets whose members belong to the same class. This completes the tree structure: the whole instance set has been classified, and the constructed decision tree is shown in Fig. 3.4.

3.2.5 Dynamic fuzzy binary decision tree

A test attribute with multiple attribute values may produce two or more branches at each internal node. A DFDT with only two branches per internal node is called a Dynamic Fuzzy Binary Decision Tree (DFBDT). A number of algorithms can create binary decision trees. For example, the CART algorithm uses entropy to select the best splitting attributes and criteria and produces exactly two child nodes at each split [30]. Sun and Zhang [31] described a binary decision tree generation algorithm based on attribute-value information gain optimization and a method of transforming multi-valued attributes into binary attributes.
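One common way to obtain binary tests from a multi-valued attribute, in the spirit of the transformation mentioned above, is to enumerate two-way groupings of its values. The function below is an illustrative sketch under that assumption, not the method of [31]:

```python
from itertools import combinations

def binary_splits(values):
    """Enumerate the two-way groupings of a multi-valued attribute's values;
    each grouping yields one candidate binary test (left branch vs. right branch)."""
    values = list(values)
    splits = []
    for r in range(1, len(values) // 2 + 1):
        for left in combinations(values, r):
            right = tuple(v for v in values if v not in left)
            # skip mirrored duplicates when both sides have the same size
            if r < len(values) - r or left < right:
                splits.append((set(left), set(right)))
    return splits

# Example: the three candidate binary tests for Outlook in {Sunny, Overcast, Rain}
for left, right in binary_splits(["Sunny", "Overcast", "Rain"]):
    print(left, "vs", right)
```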

Fig. 3.4: Decision tree generated from Tab. 3.4.

DFBDT is a special form of DFDT with a slightly different tree structure. By introducing a partial order relation between the instance sets of the nodes, the relationship between the subsets corresponding to the nodes of the decision tree can be analysed. At the same time, the instance set corresponding to each layer of the decision tree is divided according to a certain relationship. DFBDT is illustrated in Fig. 3.5.

In a DFDT, there is a partial ordering relationship between the instance set of an upper-level node and those of its immediate child nodes. In the DFBDT of Fig. 3.5, the instance sets contained in the node Attribute B = VB and its two sub-nodes are {{1, 4, 5, 6, 7, 8, 9, 10, 11}, {1, 4, 6, 8, 9, 10}, {5, 7, 11}}. These sets not only satisfy the partial order relationship; the union of {1, 4, 6, 8, 9, 10} and {5, 7, 11} also equals {1, 4, 5, 6, 7, 8, 9, 10, 11}. This obeys the definition of a semi-lattice. As in a DFDT, the branch attribute also divides the entire set of instances. To study the partition of the object domain, a partition lattice was proposed in [32]; here it is introduced into the decision tree to study the partitioning of the instance set.
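The semi-lattice property described above for the node Attribute B = VB can be checked directly (a trivial illustration in Python):

```python
# Instance sets from Fig. 3.5: the node Attribute B = VB and its two sub-nodes
parent = {1, 4, 5, 6, 7, 8, 9, 10, 11}
left, right = {1, 4, 6, 8, 9, 10}, {5, 7, 11}

assert left <= parent and right <= parent   # partial order: each child set lies below the parent
assert left.isdisjoint(right)               # the two blocks are disjoint
assert left | right == parent               # their union recovers the parent's instance set
```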

A DFDT can be denoted as

$(T,T) = \langle (U,U), (C,C), (D,D), (V,V), (F,F) \rangle,$

where (U,U) denotes the instance set, $(C,C) \cup (D,D) = (A,A)$ is the attribute set, the subsets (C,C) and (D,D) are the condition attributes and decision attributes, respectively, and $(V,V) = \bigcup (V_r,V_r)$ denotes the range of values of the attributes (A,A).

Fig. 3.5: DFBDT example.

$(F,F): (U,U) \times (A,A) \to (V,V)$ is a mapping that specifies the attribute value of each object in instance set (U,U), and $(F,F) \in [0,1] \times [\leftarrow, \rightarrow]$.
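For illustration only, the five-tuple can be represented as a simple data structure; the layout below (field names, and the encoding of a dynamic fuzzy value as a magnitude in [0,1] together with a direction) is an assumption, not notation from the book:

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple

# A dynamic fuzzy value: a magnitude in [0,1] together with a direction ('<-' or '->')
DFValue = Tuple[float, str]

@dataclass
class DFDecisionTable:
    U: Set[str]                        # instance set (U,U)
    C: Set[str]                        # condition attributes (C,C)
    D: Set[str]                        # decision attributes (D,D)
    V: Dict[str, Set[DFValue]]         # value range (V_r,V_r) of each attribute
    F: Dict[Tuple[str, str], DFValue]  # mapping (instance, attribute) -> attribute value
```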

Definition 3.2 Given a finite universe U, π = {X_i | 1 ≤ i ≤ m} is a division of U if π satisfies the following conditions [33]:

(1) $X_i \neq \emptyset$;

(2) $X_i \cap X_j = \emptyset$ for $i \neq j$;

(3) $\bigcup_{i=1}^{m} X_i = U$.

We can define a partial order relationship to compare divisions. Suppose π1, π2 are two divisions of U. If π1 ≤ π2, then every block of π1 is contained in some block of π2. That is, π1 ≤ π2 if and only if, for every $X_i \in \pi_1$, there exists $X_j \in \pi_2$ such that $X_i \subseteq X_j$.
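A direct translation of this ordering into code (a small sketch, with blocks represented as Python sets):

```python
def refines(pi1, pi2):
    """pi1 <= pi2: every block of pi1 is contained in some block of pi2."""
    return all(any(x <= y for y in pi2) for x in pi1)

# Example: {{1}, {2}, {3, 4}} <= {{1, 2}, {3, 4}}, but not the other way round
print(refines([{1}, {2}, {3, 4}], [{1, 2}, {3, 4}]))   # True
print(refines([{1, 2}, {3, 4}], [{1}, {2}, {3, 4}]))   # False
```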
