9.1 Introduction

Alzheimer's disease (AD) is a progressive neurodegenerative disorder characterized by amyloid beta plaques and neurofibrillary tangles. It is not only the principal cause of dementia in the United States but also one of the fastest growing diseases in developed countries [1]. Currently, over 4 million Americans are diagnosed with AD, and the number is estimated to double during the next 25 years [1]. In an AD brain, the cortex shrivels, damaging the areas involved in cognition, planning, and memory [2]. Further, the hippocampus of the cortex shrinks, hindering the formation of new memories. The molecular mechanisms underlying the pathologies of AD are being uncovered [3,4]. Thus far, familial AD (<60 years old), which accounts for less than 5% of AD cases, has been attributed to mutations in amyloid precursor protein (APP), presenilin-1, and presenilin-2, whereas sporadic AD (>65 years old), which accounts for more than 95% of AD cases, is genetically linked to apolipoprotein E isoform 4. Despite this recent progress in characterizing the pathologies of AD, existing treatments for AD are far from satisfactory [5]. A more comprehensive understanding of the molecular mechanisms underlying AD is needed to better identify molecular targets and to develop more effective therapeutics.

In recent years, network-based methods have been widely applied to identify biomarkers or targets for various diseases [6], typically by integrating gene expression data with available physical interaction data (e.g., protein–protein interactions). Studies have shown that disease genes often cooperate with each other within the same biological modules [7], such as signaling pathways, regulatory modules, protein complexes, or protein interaction subnetworks, suggesting a strong association or interaction between genes/proteins in giving rise to the disease. Integrating either physical or functional interaction data into a network has become a powerful tool for identifying novel genes involved in complex diseases, for example, cerebral ataxias [8], breast cancer [9], and glioblastoma [10]. In the case of AD, protein interaction data specific to AD have accumulated in the literature [11], generated through high-throughput experiments [12]. Concomitantly, computational analyses have shown that integrating gene expression data with physical and functional interaction data can be useful in characterizing the pathways and their cross-talk [13,14] or in prioritizing candidate genes [15–17] involved in AD.

A majority of the existing network-based studies of diseases integrate a global protein–protein interaction (PPI) network with gene expression data [18], whereby the genes in the microarray datasets are mapped to the proteins in the network. Differential analysis (e.g., Student's t-test) and correlation analysis (e.g., Pearson correlation [19] or mutual information [20]) between the genes are then applied to identify disease-specific protein networks, each a subset of the global PPI network. For example, in a recent network-based AD study, only the genes that were differentially expressed and had high correlations and physical interactions remained in the final network [13]. However, the applications of correlation analysis for identifying a phenotype-specific PPI network have not, thus far, directly incorporated the phenotype into the analysis. Instead, a network of gene pairs (highly correlated and physically interacting) is typically built for each condition, and the networks are compared to identify the interactions that are specific to a condition or phenotype. Consequently, these methods are sensitive to the quality of the data. Since the noise level and the sample size can affect the correlation calculation for each condition, it is difficult to determine whether the differences in the networks across conditions are real changes that provide insight into the mechanisms or simply artifacts of sample size or noise.
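
The per-condition comparison described above can be sketched as follows. This is a minimal illustration of the conventional approach, not the procedure of any cited study: the gene names, expression values, and the 0.8 correlation threshold are all invented for the example, and the differential expression step is omitted for brevity.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def condition_network(expr, ppi_edges, threshold=0.8):
    """Keep the PPI edges whose two genes are strongly correlated in one condition.

    expr maps a gene name to its expression values across the samples of that
    condition; threshold is a hypothetical cutoff on |r|.
    """
    return {
        (a, b)
        for a, b in ppi_edges
        if abs(pearson(expr[a], expr[b])) >= threshold
    }


# Hypothetical global PPI network and toy expression data (4 samples per condition).
ppi_edges = [("G1", "G2"), ("G2", "G3")]
control = {
    "G1": [1.0, 2.0, 3.0, 4.0],
    "G2": [2.0, 4.1, 5.9, 8.0],
    "G3": [5.0, 1.0, 4.0, 2.0],
}
disease = {
    "G1": [1.0, 2.0, 3.0, 4.0],
    "G2": [3.0, 1.0, 4.0, 2.0],
    "G3": [2.9, 1.1, 4.2, 2.0],
}

control_net = condition_network(control, ppi_edges)
disease_net = condition_network(disease, ppi_edges)

# Edges present only in the disease network are reported as disease-specific.
disease_specific = disease_net - control_net
print(disease_specific)
```

Note that the phenotype enters only through the split of samples into two groups. With only a few samples per group, a small amount of noise can move a correlation across the threshold, which is the sensitivity described above.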

Alternatively, some computational approaches have been developed to select sets of gene pairs relevant to a phenotype based on classification models, such as support vector machines [21,22], decision trees [23], and probabilistic models [24]. Intuitively, if a phenotype prediction based on a pair of genes performs better than one based on either gene alone, then the pair of genes is suggested to have cooperative effects on the phenotype. However, such classification methods fail to distinguish the cooperative effects of the gene pairs from the independent contributions of the individual genes [25]. To address this drawback, we present an information theoretical method that distinguishes the cooperative effects of the genes from their individual effects.

Molecular machinery in complex diseases usually involves multiple factors, many of which function cooperatively, that is, synergistically. Indeed, the factors or pathways that function synergistically in the development or pathophysiology of AD need to be characterized [26,27]. Some studies have uncovered genes that work synergistically to increase the risk of AD progression, for example, APOE4 and BCHE-K [28], or HO-1 and tau genetic variants [29]. A method that can identify these synergistic interactions between genes would provide information complementary to existing approaches (e.g., correlation analysis) and help to enhance our understanding of the complex mechanisms underlying AD. Computational approaches for the systematic assessment of synergy were first proposed in neuroscience, where the goal was to understand the neural code by evaluating the strength of correlations between neurons upon activation by stimuli [30,31]. More recently, the concept of information synergy has been applied to the field of systems biology [32–34]. Investigators developed an information theoretical measure of synergy from discretized gene expression data and applied this measure to identify cooperative gene interactions associated with neural interconnectivity [32] and prostate cancer development [33]. Subsequently, the concept of synergy and the information theoretical measure of synergy have been applied directly to continuous gene expression data [25].

We adopt this concept of information synergy to evaluate the synergistic effects of two genes on a phenotype (AD in this case). For two genes in a multivariate system, their synergistic effect on a phenotype is defined as the gain in the “mutual information” that the two genes jointly provide about the phenotype over the sum of the information provided by each gene individually. A positive synergy denotes that two genes jointly regulate a phenotype, either cooperatively (e.g., coactivation) or antagonistically (e.g., competitive inhibition). Thus, one can predict the phenotype from either of the two genes at a certain confidence level, whereas knowing both genes brings additional information, which further enhances the confidence of the prediction. Negative synergy, on the other hand, denotes redundancy: knowing both genes brings redundant information to the prediction. Zero synergy denotes that at least one of the two genes does not affect the phenotype and therefore brings neither additional nor redundant information to the prediction of the phenotype.
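
Written out, the definition above amounts to Syn(G1, G2; P) = I(G1, G2; P) − [I(G1; P) + I(G2; P)], where I denotes mutual information, G1 and G2 the two genes, and P the phenotype (this notation is ours, introduced for illustration). The following minimal sketch evaluates that quantity on toy data. It assumes discretized expression values and a simple plug-in (frequency-count) estimate of mutual information; the estimator for continuous expression data is not reproduced here, and the toy values are invented.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired discrete samples."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    # sum over observed (x, y) of p(x,y) * log2( p(x,y) / (p(x) p(y)) )
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def synergy(g1, g2, phenotype):
    """Joint information of the gene pair minus the sum of the individual informations."""
    joint = list(zip(g1, g2))  # treat the pair (g1, g2) as a single variable
    return mutual_information(joint, phenotype) - (
        mutual_information(g1, phenotype) + mutual_information(g2, phenotype)
    )


# Toy case 1: the phenotype is the XOR of two binary genes. Neither gene alone
# is informative, but together they determine the phenotype (positive synergy).
g1 = [0, 0, 1, 1]
g2 = [0, 1, 0, 1]
xor_phenotype = [a ^ b for a, b in zip(g1, g2)]
print(synergy(g1, g2, xor_phenotype))  # 1.0

# Toy case 2: the second gene is a copy of the first, and the phenotype follows
# it exactly. The second gene adds nothing new (negative synergy, redundancy).
r1 = [0, 0, 1, 1]
print(synergy(r1, r1, r1))  # -1.0
```

The XOR case is an extreme in which each gene alone carries no information; in the more typical situation described above, each gene is partly predictive on its own and the pair adds further information on top of the two individual contributions.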

In this chapter, we introduce an integrative methodology, based on information synergy analysis, to build a protein network that is specific to AD. First, we collect a publicly available microarray dataset for AD and map its genes onto a global network assembled from experimental PPI databases. Next, we assess the synergistic effects between the genes that are mapped onto the PPI network. Unlike other computational methods used to identify gene interactions, the fundamental concept of synergy is to identify the cooperative gene interactions responsible for the phenotype. Finally, from the identified synergistic gene pairs, we build a synergy network, which is a subset of the global PPI network. Topological analyses reveal the structural characteristics of the network, while the hub genes provide insights into potential mechanism(s) involved in the induction of the phenotype. Further, a comparison with differential expression and differential correlation analyses indicates that the information synergy approach could provide information complementary to these traditional approaches.
