9.2 Datasets and Methods

9.2.1 Microarray Dataset and Protein–Protein Interaction Data

The AD microarray dataset (GSE5281) in the GEO database [35] was used for the analysis, including 13 control samples and 10 AD samples collected from the entorhinal cortex region of the brain; the AD patients had a mean age of 79.8 ± 9.1 years. The probe-set intensities were first normalized with the robust multi-array average (RMA) method and log2-transformed. Expression values of multiple probes mapping to the same gene were then averaged. Experimental PPI data were collected from two major human protein interaction databases, BioGRID [36] and HPRD [37]. Duplicate interactions and self-interactions were removed before the analysis.
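The two preprocessing steps described above (averaging probes per gene and cleaning the PPI list) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the probe identifiers, and the gene names are hypothetical, and the input values are assumed to be already normalized and log2-transformed.

```python
from collections import defaultdict
from statistics import mean

def average_probes(probe_values, probe_to_gene):
    """Average the expression vectors of all probes that map to the same gene."""
    by_gene = defaultdict(list)
    for probe, values in probe_values.items():
        gene = probe_to_gene.get(probe)
        if gene is not None:  # skip probes without a gene annotation
            by_gene[gene].append(values)
    # zip(*rows) walks the samples column by column
    return {gene: [mean(col) for col in zip(*rows)] for gene, rows in by_gene.items()}

def clean_interactions(pairs):
    """Drop self-interactions and duplicates, treating interactions as undirected."""
    edges = set()
    for a, b in pairs:
        if a != b:
            edges.add(tuple(sorted((a, b))))
    return edges

# Toy data (hypothetical): three probes, two samples each
probe_values = {"p1": [8.0, 9.0], "p2": [10.0, 11.0], "p3": [5.0, 6.0]}
probe_to_gene = {"p1": "GENE_A", "p2": "GENE_A", "p3": "GENE_B"}
gene_expr = average_probes(probe_values, probe_to_gene)

raw_ppi = [("GENE_A", "GENE_B"), ("GENE_B", "GENE_A"), ("GENE_A", "GENE_A")]
ppi = clean_interactions(raw_ppi)
```

Sorting each pair before storing it makes (A, B) and (B, A) collapse into one undirected edge, which is one reasonable reading of "duplicate" when merging two databases.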

9.2.2 Calculation of the Synergy Scores of Gene Pairs

An information theory-based score was calculated to quantify the synergy between the genes [34]. Given two genes, G1 and G2, and a phenotype P, the synergy score between G1 and G2 with respect to the phenotype P is defined as

\[
\mathrm{Syn}(G_1, G_2; P) = I(G_1, G_2; P) - \left[ I(G_1; P) + I(G_2; P) \right]
\]

where I(G1;P) is the mutual information between G1 and P, I(G2;P) is the mutual information between G2 and P, and I(G1,G2;P) is the mutual information between the pair (G1,G2) and P. This equation reflects the definition of synergy: the additional contribution provided by the “whole” as compared to the sum of the contributions of the individual “parts.” Mutual information (I) was calculated from the continuous expression data using a clustering-based method [25].

The synergy scores range from −1 to 1. A positive synergy score indicates that the two genes jointly provide additional information about the phenotype, a negative synergy score indicates that the two genes provide redundant information about the phenotype, and a zero score indicates that the joint information equals the sum of the individual contributions, that is, the pair provides no additional information about the phenotype.
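The behavior of the score at its two extremes can be illustrated with a small sketch. It assumes the expression values have already been discretized (here into 0/1) and uses a plain plug-in estimate of mutual information in bits; this is a simplification swapped in for illustration, not the clustering-based estimator of [25]. All function names are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for discrete sequences."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))

def synergy(g1, g2, p):
    """I(G1,G2;P) - [I(G1;P) + I(G2;P)] for discretized genes and phenotype."""
    joint = list(zip(g1, g2))
    return mutual_information(joint, p) - (
        mutual_information(g1, p) + mutual_information(g2, p)
    )

# Purely synergistic toy case: the phenotype is the XOR of the two genes,
# so neither gene alone is informative but the pair determines P.
g1 = [0, 0, 1, 1]
g2 = [0, 1, 0, 1]
p_xor = [0, 1, 1, 0]
syn_xor = synergy(g1, g2, p_xor)

# Purely redundant toy case: both genes are copies of the phenotype.
p_copy = [0, 0, 1, 1]
syn_redundant = synergy(p_copy, p_copy, p_copy)
```

With a binary phenotype and entropies measured in bits, the XOR case reaches the upper bound of 1 and the duplicated-gene case reaches the lower bound of −1.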

9.2.3 Permutation Test to Evaluate the Significance of the Synergy

A permutation test was performed to assess the statistical significance of the information synergy scores of the gene pairs. The phenotype labels (AD versus normal) were randomly shuffled so that they were uncorrelated with the gene expression profiles. The information synergy scores of the gene pairs were then recalculated based on the shuffled phenotype. This process was repeated 100 times, and the distribution of random information synergy scores was estimated using a kernel density approach. The p value of the real information synergy score of each gene pair was then calculated from this distribution. Finally, the Benjamini–Hochberg false discovery rate procedure [38] was applied to adjust the p values for all the gene pairs and thereby control the expected proportion of false discoveries. The p value cutoff was set at 0.05.
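The shuffle, recompute, and adjust steps can be sketched as below. Two simplifications are assumed and should not be read as the method used in the chapter: the p value is taken as an empirical one-sided tail fraction rather than from a kernel density estimate, and `toy_score` is a hypothetical stand-in for the synergy score so that the block runs on its own.

```python
import random

def permutation_p_value(score_fn, g1, g2, phenotype, n_perm=100, seed=0):
    """Empirical one-sided p value: share of shuffles scoring >= the observed score."""
    rng = random.Random(seed)
    observed = score_fn(g1, g2, phenotype)
    shuffled = list(phenotype)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the link between labels and expression
        if score_fn(g1, g2, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # +1 keeps the estimate away from zero

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def toy_score(g1, g2, phenotype):
    """Hypothetical score: absolute group difference in the mean of g1 + g2."""
    cases = [a + b for a, b, p in zip(g1, g2, phenotype) if p == 1]
    controls = [a + b for a, b, p in zip(g1, g2, phenotype) if p == 0]
    return abs(sum(cases) / len(cases) - sum(controls) / len(controls))

g1 = [1.0, 1.2, 0.9, 3.0, 3.1, 2.9]
g2 = [0.5, 0.4, 0.6, 2.0, 2.2, 1.9]
phenotype = [0, 0, 0, 1, 1, 1]
p_value = permutation_p_value(toy_score, g1, g2, phenotype)

adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [q <= 0.05 for q in adjusted]
```

With only 100 permutations the smallest attainable empirical p value is 1/101, which is one practical reason to smooth the null distribution (for example with a kernel density estimate) before reading off tail probabilities.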

9.2.4 Characterization of the Network Topology

Structural network theory and the characterization of network topology have contributed to our understanding of the architectures of networks [39]. Structural network analyses have revealed that existing biological networks, including gene regulatory networks, metabolic networks, signaling networks, and PPI networks, are very different from randomly organized networks. These biological networks have a “scale-free” feature, characterized by a few hub nodes with many connections and many nodes with very few connections [40]. Further, structural network analysis has contributed to our understanding of the functional organization of biological systems. The hub nodes in scale-free networks have been shown to play essential roles in certain biological systems. For example, in the yeast PPI network, essential proteins (those critical for cell viability) are significantly over-represented among the hubs compared with the nonhub nodes [41]. In addition to exploring the fundamental principles of biological networks, structural analyses of human metabolic networks have contributed insights into disease comorbidity [42].

In our analysis, the synergy network was built from gene pairs that have statistically significant synergy scores and a physical interaction in the PPI network. The network was composed of nodes that represented the genes and edges that represented the synergy of the gene pairs. Topological analysis of the networks obtained from information synergy, differential expression analysis, and differential correlation was performed to reveal their topological characteristics, including the distribution of node degrees and the distribution of shortest path lengths. The degree distribution describes how the number of edges per node is distributed across the network. The shortest path length is the smallest number of edges connecting two nodes and was measured using a breadth-first search algorithm [43].
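The network construction and the two topological measures can be sketched as follows. This is a minimal illustration under stated assumptions: gene pairs are stored as sorted tuples so that the two edge sets can be intersected directly, and the gene names and function names are hypothetical.

```python
from collections import Counter, defaultdict, deque

def build_adjacency(edges):
    """Undirected adjacency sets built from an iterable of (a, b) pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj

def degree_distribution(adj):
    """Map each degree to the number of nodes that have it."""
    return Counter(len(neighbors) for neighbors in adj.values())

def shortest_path_lengths(adj, source):
    """Breadth-first search: edge counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:  # first visit gives the shortest distance
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

# Toy inputs (hypothetical): keep pairs that are both significant and in the PPI set
significant_pairs = {("A", "B"), ("B", "C"), ("C", "D"), ("A", "E")}
ppi_pairs = {("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")}
synergy_edges = significant_pairs & ppi_pairs

adj = build_adjacency(synergy_edges)
degrees = degree_distribution(adj)
distances_from_a = shortest_path_lengths(adj, "A")
```

Running the search once from every node and pooling the distances gives the shortest-path-length distribution of the whole network; unreachable pairs simply do not appear in the result.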
