Functional enrichment analysis on genome-wide epistasis patterns reveals pathway interactions in Bipolar Disorder. S. Prabhu, N. Clinger, B. Burnett, I. Pe'er. Computer Science, Columbia University, New York, NY.
In the context of GWAS, pathway-based functional enrichment (FE) methods aggregate associations at markers (such as SNPs) over functionally annotated loci (such as genes), and have been extremely useful for finding biological modules driving disease. So far, mainstream FE approaches have focused on mining single-locus GWAS hits (i.e. marginal associations) and on developing statistics that implicate a single ontology, a single pathway, or a single set of genes in the phenotype. In this work, we present the first systematic and unbiased framework for genome-wide FE of interactions (i.e. epistatic associations) in case-control GWAS datasets. First, we develop a statistical framework in which SNP-SNP epistasis p-values are percolated upwards into gene-gene scores. In the subsequent step, we combine gene-gene epistasis scores across gene sets, in a search for ontology-ontology epistasis. At each stage, we either use powerful statistics that have been thoroughly characterized for type I and type II errors, or rely on an extensive (and so far impractical) permutation procedure to establish empirical significance. We also describe how to account for confounding that might arise from variable gene size, ontology size, population structure and LD between loci. Finally, we apply our method to the WTCCC Bipolar Disorder (BD) dataset (2K cases, 3K controls, 113K genic SNPs). Given the computational burden of even a single genome-wide interaction scan, our robust permutation analysis (5000 genome-wide scans for interaction in 1 week on a small compute cluster) was made feasible only by recent computational advances by our group (SIXPAC ultrafast interaction scan, Genome Research 2012). For BD, we report an ontology interaction graph containing 17 connected components (FDR<0.01). The largest component highlights epistatic links across, but not within, two important gene sets.
The first set contains genes involved in serine/threonine kinase activity and SMAD signaling proteins: targets of lithium, the most widely prescribed compound for BD. The second set (LDL and HDL remodeling, triglyceride lipase activities and cholesterol transport) points to lipid levels, whose role in the pathophysiology of this disorder is less well understood. Other connected components highlight epistasis involving cerebellar Purkinje cell development (implicated in schizophrenia and depression) and histamine and glutamate neurotransmitter catabolism, among others.
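As an illustration only, the three stages described above (collapsing SNP-SNP p-values into a gene-gene score, establishing empirical significance by permutation, and reading connected components off the graph of significant links) might be sketched in Python as follows. The abstract does not specify the gene-gene statistic, so the Šidák-corrected minimum p-value used here is an assumed stand-in, and every function name is hypothetical. The sketch does not model the corrections for gene size, ontology size, population structure or LD.

```python
import math
from collections import defaultdict


def gene_pair_score(snp_pair_pvalues):
    """Collapse the SNP-SNP p-values of one gene pair into a single score.

    Assumption (not from the abstract): Sidak-corrected minimum p-value,
    returned on a -log10 scale so that larger means stronger evidence.
    """
    m = len(snp_pair_pvalues)
    p_min = min(snp_pair_pvalues)
    if p_min >= 1.0:
        return 0.0
    # 1 - (1 - p_min)^m, computed stably for very small p_min.
    p_adj = -math.expm1(m * math.log1p(-p_min))
    return -math.log10(max(p_adj, 1e-300))


def empirical_pvalue(observed, permuted_scores):
    """Share of permuted scores at least as large as the observed one.

    The add-one correction keeps the estimate strictly above zero.
    """
    exceed = sum(1 for s in permuted_scores if s >= observed)
    return (exceed + 1) / (len(permuted_scores) + 1)


def connected_components(edges):
    """Group nodes joined by significant links, using an iterative DFS."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(adjacency[node] - seen)
        components.append(component)
    return components
```

In this sketch, `permuted_scores` would hold the same statistic recomputed on each phenotype-permuted scan, and `edges` would list the ontology pairs that pass the chosen FDR threshold.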