As mentioned above, several remarkable structural properties have thus far been characterized in biological networks via network analysis.
A well-known feature of networks is heterogeneous connectivity, which indicates that the node degree k (the number of edges that the node has) approximately follows a power–law distribution: P(k) ∝ k−γ. The exponent γ, called the “degree exponent,” is empirically known to be between 2 and 3. This power–law degree distribution implies that a few nodes (hubs) integrate numerous nodes while most of the remaining nodes do not; this feature is clearly different from that of the Poisson (homogeneous) distribution obtained from classical (ER) random networks. To reproduce heterogeneous connectivity, the BA model was proposed. This model has two mechanisms: the growth mechanism, in which an added node connects to existing nodes, and the preferential attachment (PA) mechanism, in which the existing node i is selected with the probability Πi = ki/∑ jkj, where ki is the degree (the number of edges) of node i. The BA model generates heterogeneous networks with the power–law degree distribution P(k) ∝ k−3 [12].
The other structural property of networks is hierarchical modularity. Most real-world networks are clustered; this is characterized by the clustering coefficient C. The clustering coefficient of node i (Ci) is defined as 2Mi/[ki(ki − 1)], where Mi is the number of edges among the neighboring nodes of node i. In real-world networks, there exists a power–law relationship between Ci and ki with the exponent −1, that is, ; the ER model and the BA model do not represent this relationship because the network is randomly constructed with a given degree sequence. This power–law relationship is reproduced by the hierarchical organization of modules (small dense networks) [13]; thus, it is referred to as hierarchical modularity. Furthermore, the triad formation [14] and the merging module [15] also reproduce this property. Although there are several models, the most important aspect is the proportional relationship between the degree of node i and the number of modules including node i (the number of edges among the neighbors of node i).
In addition, disassortativity [16] is also a well-known property. The correlation of the degree between a node and its neighboring node (hereafter referred to as “degree correlation”) is useful for explaining disassortativity. Disassortativity indicates negative degree correlations. The ER model and the BA model indicate no such correlation because their networks are randomly constructed. In such a random network with a given degree sequence, the probability of transition from nodes with degree k to nodes with degree k ', P(k ' |k), is k ' P(k ')N/(2E) [17]. Thus, the expected degree of neighboring nodes is ∑k'k ' P(k ' |k) =k2/k, where denotes the average over all nodes. This result indicates that the degree correlation is independent of degree k (i.e., there is no correlation).
This is particularly caused in the BA model due to the preferential attachment in which a node (node i) gets new edges with the probability Πi that is proportional to its own degree, that is, Πi = ki/∑ jkj. Due to this preferential attachment, the transition probability is almost similar to the above equation.
We cannot neglect preferential attachment because this mechanism provides heterogeneous connectivity. To solve this problem, we need to consider information other than the degree. This information is based on studies on competition dynamics [18] and weighted networks [19,20], which discuss the effects of fitness on a network structure. Here, we also refer to this information as fitness. By considering this information, the networks have negative degree correlations [21,22].
As mentioned above, these different structural properties are explained by different models; thus, the theory for network evolution is complicated. For a simple understanding of this theory, we need to construct a new convenient model that includes several models. In this section, we introduce a simple unified model for reproducing heterogeneous connectivity, hierarchical modularity, and disassortativity.
Our model includes growth by merging modules and the fitness-driven (FD) mechanism (see Ref. 23 for biological implications of this model).
Taken together, our model networks are generated by the following procedures.
When a > m and ξ ≥ 0, the model network evolves in time steps. In addition, our model is equivalent to the BA model with the specific condition: a = 2, m = 1, and ξ = 0.
In this section, we present the analytical and numerical solutions for the degree distribution of our model.
In order to describe the degree distribution, we employ the mean-field-based analysis [12,22]. The standard approach cannot be directly applied due to the inclusion of the fitness updating. We express the degree and fitness as
(3.3)
and investigate the time evolution of Fi. Since the fitness updating rule is represented as fi ← fi + ξ, Fi satisfies
indicating the proportional relationship between Fi and ki.
The time evolution of Fi is described as
where ∑jFj ≈ [a(a − 1) + mξ]t. The solution of the equation with as an initial condition for Equation 3.5 is
where β = [m(a − 1 + ξ)]/[a(a − 1) + mξ]. Because s/t denotes the probability that Fi is larger than a given F, Equation 3.6 is rewritten as
which corresponds to a cumulative probability. From Equation 3.7, the probability distribution for F is given as
where γ = (1/β) + 1. Finally, substituting Equation 3.4 into Equation 3.8, we obtain the degree distribution
(3.9)
where B(a, ξ, γ) = A(a, ξ)γ−1[ξ/(a − 1) + 1]−γ/(γ − 1). The degree distribution obeys the power law with the degree exponent
demonstrating that our model network contains the scale-free feature.
In order to confirm the analytical predictions, we performed numerical simulations of networks, generated using our model. Figure 3.3 shows degree distributions with a = 5, N = 50, 000, and different values of m [1, 4] and ξ [0, 15]. These degree distributions follow the power law, reflecting the scale-free feature.
Figure 3.3a shows the degree distributions with m = 2 and different values of ξ. The degree exponents decay with increasing ξ. Figure 3.3b shows the degree distributions with ξ = 7 and different values of m. The degree exponents decay with increasing m. The numerical results and theoretical predictions are in excellent agreement, demonstrating that γ is a function of m/a and ξ.
In this section, we show the analytical and numerical solutions of the degree-dependent clustering coefficient.
First, we provide the analytical solution for the degree-dependent clustering coefficient of our model. Since our model grows due to the merging of modules (see Fig. 3.2), the number of edges among neighbors of node i is approximately described [15] as
where Si corresponds to the number of selections of node i with the PA, as in Equation 3.1. In our model, because the degree of node i is expressed as ki = Si(a − 1), Equation 3.11 is rewritten as
indicating the proportional relationship between Mi and ki. Finally, substituting Equation (3.12) into the definition of the clustering coefficient (i.e., Ci = 2Mi/[ki(ki − 1)]), we obtain the degree-dependent clustering coefficient
(3.13)
The degree-dependent clustering coefficient follows the power law with the exponent −1 when a ≥ 3, reflecting the hierarchical modularity of our model network.
Next, we present the numerical results of the degree-dependent clustering coefficient of our model in order to verify the analytical solution. In Figure 3.4, we show the degree-dependent clustering coefficient with a = 5, N = 50, 000, and different values of m [1, 4] and ξ [0, 15].
Figure 3.4a shows the degree-dependent clustering coefficient with m = 2 and different values of ξ. C(k) follows the power law with the exponent −1 despite the varying ξ values. Figure 3.4b shows degree-dependent clustering with ξ = 7 and different m values. Moreover, C(k) follows the power law with the exponent −1 for a large value of k. For a small value of k, the cut-off becomes more prominent with increasing m as it moves further away from the approximation of Equation 3.11.
The degree-dependent clustering coefficient follows the power law with the exponent more or less equal to −1, reflecting the hierarchical modularity of networks.
To examine the parameter-dependency of the modularity (clustering property) for the entire network, we employ an average clustering coefficient C. The coefficient is defined as . A high value of C implies that the network has high modularity.
Figure 3.5 shows the numerical result of C in our model network with a = 5, N = 12, 800, and different values of m [1, 4] and ξ [0, 15]. C tends to increase by a greater extent for smaller m/a and larger ξ. Since the small and large m/a lead to different effects on C, Figure 3.5a shows a valley in the middle of m/a. In the case of small m/a, C can remain high because the modules combine via a few common nodes. In contrast, in the case of large m/a, C is low because the networks tend to be randomized, which is due to the fact that the modules combine via many common nodes. In this case, however, the FD mechanism helps to increase the fitness of the nodes in the modules, inducing the formation of a cluster with high-edge density. As a result, C increases with m/a. Due to the trade-off mechanism, a valley-shaped curve is obtained. In addition, C is less sensitive for large ξ.
The degree correlation is a structural property that characterizes assortativity of the networks and represents the average degree of the neighbors of nodes with degree k. This correlation is defined as
(3.14)
where δ(x) is Kronecker's delta function. This function returns 1 when x = 0 and returns 0 otherwise. Γi denotes the average nearest-neighbor degree and is expressed as
(3.15)
where V(i) corresponds to the set of neighbors of node i.
Here, we provide the numerical solutions for the degree correlation of our model. Figure 3.6 shows the degree–degree correlations with a = 5, N = 50, 000, and different values of m and ξ. The correlations follow the power law; with −1 < ν < 0 as a rough observation. Negative and larger values of ν are observed for larger m/a and ξ.
Figure 3.6a shows the degree correlations with m = 2 and different ξ values. The exponent ν decays with increasing ξ. Figure 3.6b shows the degree–degree correlations with ξ = 7 and different m values. The exponent ν decays with increasing m.
Due to the fitness updating and the PA mechanism, the degree correlations follow the power law, reflecting the disassortativity of the networks. As previously reported, disassortativity is not reproduced if we only consider the PA mechanism [15,21].
The assortative coefficient (AC) [16] can be considered to be a compendium parameter of the degree correlations and is defined as
(3.16)
where ki and kj are the degrees of two nodes at the ends of an edge and denotes the average over all edges. In other words, the AC is the correlation coefficient for the degree correlation and takes the values −1 ≤ r ≤ 1. The relationship between the AC and the network structures is described as follows:
Figure 3.7 shows the numerical solutions of the AC for our model with a = 5, N = 12, 800, and different values of m [1, 4] and ξ [0, 15]. For larger m/a and ξ values, the large negative AC is generally observed. For smaller ξ [0, 1], on the other hand, the valley-shaped curves (once, r takes positive instead of negative values) emerge because positive degree correlations are exhibited in the case of m/a ≤ 0.5 [15]. For larger ξ [3, 15], assortative coefficients r monotonically decrease (upward curve) with increasing m/a.
In order to validate our model, we compare the structural properties of biological networks with those of our model networks. We construct two different types of networks: the gene regulatory [24] and the metabolic [25] networks of E. coli. The gene regulatory network is represented as a set of graphs consisting of nodes and edges, which correspond to genes and the interactions among genes, respectively. For simplicity, we extract the largest component from the networks and replace the directed and/or weighted edges with undirected and/or unweighted edges. Moreover, we remove the multiple edges and self-loops. The metabolic network is transformed by the same procedures.
In Figure 3.8, we show the structural properties of the biological networks and our model networks. Our model is found to be in good agreement with the data of the real network, demonstrating that our model reproduces the three previously mentioned remarkable structural properties that are widely shared among biological networks.
18.191.233.43