Gene pathway burden test application to cardiovascular disease using whole genome sequencing data.
M. A. A. Almeida1, J. P. Peralta2, J. W. Kent1, T. M. Teslovich3, G. Jun3, C. Fuchsberger3, A. R. Wood4, A. Manning5, T. M. Frayling4, P. Cingolani6, D. M. Lehman7, T. D. Dyer1, G. Abecasis3, L. Almasy1, R. Duggirala1, J. Blangero1
1) Texas Biomedical Research Institute, Genetics Department, 7620 NW Loop 410, San Antonio, TX, USA; 2) Centre for Genetic Epidemiology and Biostatistics, University of Western Australia, 35 Stirling Highway, Crawley, WA, Australia; 3) University of Michigan, Ann Arbor, MI, USA; 4) University of Exeter, The Queen's Dr, Exeter, United Kingdom; 5) Broad Institute, 7 Cambridge Center, Cambridge, MA 02142, USA; 6) McGill University, 845 Sherbrooke Street West, Montréal, Canada; 7) University of Texas Health Center at San Antonio, 7703 Floyd Curl Dr, San Antonio, TX, USA.

   The advent of whole genome sequencing provides many opportunities for understanding the sources of causal variation underlying complex diseases. As part of the T2D-GENES Consortium, we have directly sequenced 590 individuals (and accurately imputed another 448 members) from 20 large Mexican American pedigrees to understand the role of rare and private variants in type 2 diabetes (T2D). These individuals are part of the San Antonio Family Study (SAFS), in which many phenotypes have been measured. We have observed ~22 million single nucleotide variants (SNVs) in this sample. Such a large amount of data imposes statistical and analytical barriers that require alternative approaches allowing fast and sensible screening of potential causal genes and pathways. While gene-centric testing is now common, less effort has been devoted to date to formal tests of the contribution of sequence variation in gene pathways to complex disease risk. We have developed a single degree-of-freedom test using a random effect model based on an empirical pathway-specific genetic relationship matrix (PSGRM) as the focal covariance kernel. The PSGRM uses all variants (or a chosen, likely functional subset) identified in the member genes of a given biological pathway, and its contribution is tested with a likelihood ratio test (LRT). Gene pathway definitions were obtained from the latest KEGG database release, and a PSGRM was estimated for each pathway. These pairwise relationship matrices were tested using cardiovascular disease (CVD, defined as ECG-derived evidence of myocardial infarction, history of surgery related to atherosclerosis, and CVD-related mortality) as our focal trait. The glycerolipid metabolism pathway exhibited a significant association (p = 0.00087) and accounted for most of the observed CVD heritability in this sample. This pathway is composed of 50 genes, and a set of 43,819 SNVs was employed in the PSGRM calculation. Non-synonymous variants in this pathway that were predicted to be highly deleterious (PolyPhen-2 score > 0.8) were independently tested, and a promising association with CVD was detected in ALDH7A1, a gene that has previously been associated with lipid variation in an independent study. Our results suggest that our simple pathway-based test may be useful for reducing the search space for specific functional variants influencing complex phenotypes.
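
The two ingredients of the test described above, an empirical relationship matrix built from one pathway's variants and a one degree-of-freedom LRT, can be illustrated with a minimal sketch. Everything specific here is an assumption, since the abstract does not give these details: the function names `pathway_grm` and `lrt_pvalue` are hypothetical, the allele-frequency standardization is one common way to form an empirical GRM, and the p-value uses the 50:50 mixture of a point mass at zero and a chi-square with one degree of freedom that is often applied when a variance component is tested on its boundary. The two log-likelihoods would come from fitting the random effect model with and without the PSGRM component, which is not shown.

```python
import math
import numpy as np

def pathway_grm(genotypes):
    """Empirical relationship matrix from an (individuals x variants)
    array of 0/1/2 allele counts for the variants in one pathway.
    Assumed normalization: each variant is centered and scaled by its
    sample allele frequency (not confirmed by the abstract)."""
    g = np.asarray(genotypes, dtype=float)
    freq = g.mean(axis=0) / 2.0
    keep = (freq > 0.0) & (freq < 1.0)      # drop monomorphic variants
    g, freq = g[:, keep], freq[keep]
    z = (g - 2.0 * freq) / np.sqrt(2.0 * freq * (1.0 - freq))
    return z @ z.T / z.shape[1]             # average over variants

def lrt_pvalue(loglik_null, loglik_alt):
    """One-d.f. LRT p-value for a variance component on the boundary,
    assuming a 50:50 mixture of a point mass at 0 and chi-square(1)."""
    stat = max(0.0, 2.0 * (loglik_alt - loglik_null))
    if stat == 0.0:
        return 1.0
    # Chi-square(1) upper tail: P(X > x) = erfc(sqrt(x / 2))
    return 0.5 * math.erfc(math.sqrt(stat / 2.0))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    toy = rng.binomial(2, 0.3, size=(6, 20))   # simulated toy genotypes
    print(pathway_grm(toy).round(2))
    print(lrt_pvalue(-105.2, -100.1))          # made-up log-likelihoods
```

In practice each KEGG pathway would contribute its own genotype submatrix to `pathway_grm`, and the resulting matrix would enter the model as an additional covariance kernel alongside the usual kinship term.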
