6.5.1Basic concept of dynamic fuzzy graph

1. Relational concept of graph

Definition 6.13 Let G = (VG, EG) be a graph, where VG represents the finite nonempty set of data elements in the graph that are nodes and EG represents the set of relationships between two nodes, i.e. a set of edges.

Definition 6.14 For nodes v, w in a graph, if <(v, w>∈ EG, (v, w) represents a bow between an initial point v and an end point w, the graph is called a direct graph. If <v, w>∈EG and <w, v>∈ EG, then EG is symmetric. We can use the unordered pair (v, w) to replace the two ordered pairs in representing that an edge exists between v and w; this graph is called a non-direct graph. The predicate P(v, w) represents the information or meaning of the bow <v, w>.

Fig. 6.6: DFT of Quinlan’s playtennis example.
Fig. 6.7: Example of graph.

For example, G1 in Fig. 6.7(a) is a direct graph, and the predicate P(v, w) represents a one-way path from v to w:

G1=(V1,E1),

where V1={v1,v2,v3,v4,}andE1={<v1,v2>,<v1,v3>,<v3,v4>,<v4,v1>,}.

G2 in Fig 6.7(b) is a non-direct graph

G2=(V2,E2),

where V2 = {v1, v2, v3, v4, v5} and E2 = {(v1, v2), (v1, v4), (v2, v3), (v2, v5), (v3, v4), (v3, v5)}.

Definition 6.15 Assume there are two graphs G = (VG, EG) and G=(VG,EG). If VGVGandEGEG,thenG is a sub-graph of G, that is, G contains G.

Definition 6.16 In a non-direct graph, if there is a path from node v to node v, then v and v are connected. If vi and vj are connected ∀vi, vj∈VG, then G is a connected graph; otherwise, G is a disconnected graph.

Definition 6.17 If the two graphs G1 = (V1, E1) and G2 = (V2, E2) have the same topological structure, they are said to be isomorphic; that is, there exists a corresponding relationship from V1 to V2 such that each edge in E1 corresponds to only one edge in E2, and vice-versa. Sub-graph isomorphic means that there exist sub-graphs of G1 and G2 that are isomorphic, which is to judge whether G1 contains G2.

2. Dynamic fuzzy graph representation

Definition 6.18 Assume G = (VG, EG) is a general graph and that σ : V → [0, 1], μ E → [0, 1] satisfy μ(e) ≤ g(x) < g(y), where e = xy [48]. Define G_=(G,σ,μ) as an F-graph. If G is a simple graph, then G is a simple graph; if G is a direct graph, then Gis an F-direct graph.

Theorem 6.3 The sufficient and necessary condition for G = (G, σ, μ) to be an F-graph is that, ∀λ ∈ [0, 1], (σλ μλ) is a sub-graph of G, where (σλ = {x : σ(x) ≥ μ}, σλ = {e : μe) ≥ λ}.

Proof: If G is an F-graph, λμ(e)σ(x)Λσ(y),soxμλ,yμλ, that is, (μλ, μλ) is a sub-graph of G.

Conversely, if ∀λ ∈ [0, 1], (μλ, μλ) is a sub-graph of G, then eE,(σμ(e),μμ(e)) is a sub-graph of G.

Because all e=xyμμ(e),thenμ(e)σ(x)Λσ(x),, so G is an F-graph. This completes the proof.

Based on the basic concept of F-graphs, with the help of DFL to make appropriate changes, the points and edges represented by the DF formula transform an F-figure into a graph containing a DFL formula.

For example, if the range field of some F-graph G is U = {v1, v2, v3, v4}, the F-matrix is

B=[00.30.20.80.3000.50.200.30.70.80.50.70.7].

The node set of the F-graph is VG={0.9/v1,0.9/v2,0.6/v3,0.7/v4,},and G=(V,G) is shown in Fig. 6.8.

The DFL can be represented as VG={(0.9,0.9)/v1,(0.9,0.9)/v2,(0.6,0.6)/v3,(0.7,0.7)/v4}.

The above sample can be transformed into a direct graph, as shown in Fig. 6.9. The DFL can be described as VG={(0.9,0.9)/v1,(0.9,0.9)/v2,(0.6,0.6)/v3,(0.7,0.7)/v4}.

EG={0.2/(v1,v3),0.3/(v1,v2),0.8/(v1,v4),0.7/(v3,v4),,0.5/(v2,v4),0.3/(v3,v3),0.3/(v1,v2)}.. In Fig. 6.9, there are seven circles, that is,

{V1V2V4V1,V1V4V3V1,V2V4V1V2,V3V4V1V3,V4V1V3V4,V3V3,V4V1V2V4}.

Taking V1V2V4V1 as an example, consider

(0.9,0.9)V1(0.9,0.9)V2(0.7,0.7)V4(0.9,0.9)V1(0.9,0.9)V2(0.7,0.7)V40.3(V1,V2)0.5(V2,V4)0.8(V4,V1)0.3(V1,V2)0.5(V2,V4)0.8(V4,V1).

Fig. 6.8: Non-direct graph.
Fig. 6.9: Direct graph.

These situations will cause the results to change whenever an item in the formula changes.

More generally, in a direct circle, F-points and F-edges have the following DFL formulas:

(λ1,λ1)V1(λ2,λ2)V2(λn,λn)Vn(λ1,λ1)V1(λ2,λ2)V2(λn,λn)Vn(λ1,λ1)V1(λ2,λ2)V2(λn,λn)Vn(λ1,λ1)V1(λ2,λ2)V2(λn,λn)Vn(μ1,μ1)(V1,V1)(μ2,μ2)(V2,V2)(μn,μn)(Vn,Vn)(μ1,μ1)(V1,V1)(μ2,μ2)(V2,V2)(μN,μN)(Vn,Vn)(μ1,μ1)(V1,V1)(μ2,μ2)(V2,V2)(μN,μN)(Vn,Vn)(μ1,μ1)(V1,V1)(μ2,μ2)(V2,V2)(μn,μN)(Vn,VN)

These DFL formulas cause the results to change only if any item changes, a process that can be considered to have a learning action.

6.5.2Dynamic fuzzy graph hierarchical relationship learning algorithm

A dynamic fuzzy directed acyclic graph (hereafter referred to as DF) can represent the relationships between categories according to levels, namely, the dynamic fuzzy dependency relation among the characters. This has the following meaning:

(1)Each vertex represents a class, and the corresponding categories are not the same;

(2)Each directed edge represents dynamic fuzzy dependencies between classes, with arrows pointing toward the subclass and away from the parent class;

(3)There is only one vertex with an in-degree of 0, which represents the total class of all data at the highest level; a vertex with an out-degree of 0 denotes that the current category cannot be classified. See Fig. 6.10.

Theorem 6.4 There must be a topological order of nodes in the DF graph.

The topological order is the order of the vertices in the DF graph; i.e. the starting point of each directed edge in the DF graph is located at the front of the vertex pointed to by the edge in the ranking. Kahn developed an algorithm of complexity O(|V| + |E|) to find the topological order of a non-directed graph G = (V, E) [57]. As shown in Fig. 6.10, the application of this algorithm gives the topological order a b c d e f g h.

Fig. 6.10: Dynamic fuzzy directed acyclic graph that represents hierarchical structure.

The dynamic fuzzy graph is used to deal with uncertain information and the dependency relationship, which is more expressive and has good learning ability. The dynamic fuzzy membership function can be automatically generated and adjusted so as to be applied to complex processes and systems.

The dynamic fuzzy graph hierarchical relationship learning algorithm (DFGHR) is described in Algorithm 6.2. Here, x is an attribute feature, Y represents the class set, and ci represents the class corresponding to x. The training process considers the vertex vi (except the general category whose in-degree is 0) in the DF graph. Generate a two-element classifier Hci : X × YR, where X represents an attribute feature, Y represents the class set, and R represents the real numbers. Hci (x, y) ∈ [0, 1] × [←, →] represents the degree to which x belongs to y, that is, the membership rate. |Hci (x, y)| represents the confirmation of the judgment.

In the topological order of the DF graph, nodes corresponding to the parent class must be in front of nodes corresponding to sub-classes. Step 4 in the following DFGHR algorithm ensures the high-level classification occurs before the low-level classification. In two-element classification, the judgment that the attribute belongs to a class uses the DF membership rate (θ,θ) (Step 6 in the following DFGHR algorithm). When a classifier return determines that some feature attribute does not belong to the category, according to the idea of hierarchical classification, the feature attribute does not belong to a parent class, and it also does not belong to the subclass. In this case, we remove the transfer path. That is, when a category node in-degree is reduced to 0, the feature attribute does not belong to all parent classes, and the node should be removed from the topological order in T. The class corresponding to a vertex in T that has not been removed is the class that the feature attribute is waiting to classify (step (7)).

The time required to find the topological order in a DF graph is O (|V| + |E|), and the time complexity of the dynamic fuzzy graph hierarchical relationship learning algorithm is max {O(|V| + |E|), O(|V|T(Hci))}, where T(Hci) represents the time cost of Hci.

The original topological order of the vertex in Fig. 6.10 is a b c d e f g h. Through dynamic adjustment, we evaluate Hb(x)<(θ,θ); if so, then R(d) = R(e) = 0, R(f) = 1, R(f) = 1. Therefore, the topological order becomes a c f g h.

Algorithm 6.2 DFGHR

(1)The number of initial feature classes and the number of final classes determine the vertexes; initialize the DF figure as input;

(2)Find a topological order T in the DF graph;

(3)Calculate the in-degree R(vi) of the vertex vi;

(4)According to T, deal with the vertexes vi in the DF graph in turn;

(5)Call the classifier Hci of vertex vi (the corresponding class ci) to classify x, then obtain Hci (x);

(6)Compare Hci and (θ,θ) to determine whether to remove vi, and adjust the value of R(vi) = 0, remove the corresponding vertex from T, and adjust the in-degree of the adjacent vertex. Repeat until the in-degrees of all vertexes behind that vertex in T are not 0.

(7)Output the class c corresponding to the saved vertex in T, the DF membership rate, obtain the level division set Y of each attribute feature, and determine the hierarchical structure in the DF graph.

6.5.3Sample analysis

An example analysis using the Weka dataset (see Tab. 6.10) was conducted to compare the J48 algorithm with DFGHR. After processing the original data in the test sample set and the combined training sample/test sample set, the constructed DF diagram was tested by 10-fold cross-validation. The vertex numbers of the corresponding figure obtained by the J48 and DFGHR algorithms are presented in Tabs. 6.11 and 6.12.

The classification accuracy of J48 and DFGHR was compared after 10-fold cross-validation. The dynamic fuzzy graph hierarchical relation learning algorithm uses the dynamic fuzzy pretreatment method to deal with data, which fully takes the uncertainty in the real problem into account. The algorithm constructs a hierarchical relationship, which considers the contribution of each layer’s attribute features to the classification of instances and further improves the classification accuracy of the training example set. According to the experimental results, the classification accuracy of DFGHR indicates improved predictive ability over the J48 algorithm. For the datasets with missing information, its superiority is more obvious.

6.6Sample application and analysis

As a kind of mathematical model, graphs are an active field of study in relational data mining and have a wide range of applications, including protein analysis, compound analysis, link mining, XML document analysis, and the detection of security threats [57]. This section mainly verifies the algorithms presented in this chapter.

Tab. 6.10: Date set information owned by weka.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
52.15.74.25