Enrichment of colorectal cancer associations in functional regions: insight for combining ENCODE and Roadmap Epigenomics data in the analysis of whole genome sequencing-imputed GWAS.
S. Rosse1, P. Auer2,1, T. Harriason1, C. Carlson1, C. Qu1, G. R. Abecasis3, S. I. Berndt4, S. Bézieau5, H. Brenner6, G. Casey7, A. T. Chan8, J. Chang-Claude9, S. Chen3, S. Jiao1, C. M. Hutter10, L. Le Marchand11, S. M. Leal12, P. A. Newcomb1, M. L. Slattery13, J. Smith14, E. White1, B. W. Zanke15, U. Peters1, D. A. Nickerson14, A. Kundaje16,17, L. Hsu1
1) Public Health Genetics, Fred Hutchinson Cancer Research Center, Seattle, WA; 2) School of Public Health, University of Wisconsin, Milwaukee, WI; 3) Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI; 4) Division of Cancer Epidemiology and Genetics, NCI, Bethesda, MD; 5) Service de Génétique Médicale, CHU Nantes, Nantes, France; 6) Division of Clinical Epidemiology and Aging Research, German Cancer Consortium (DKTK), Heidelberg, Germany; 7) Department of Preventive Medicine, University of Southern California, Keck School of Medicine, Los Angeles, CA; 8) Division of Gastroenterology, Massachusetts General Hospital and Harvard Medical School, Boston, MA; 9) Division of Cancer Epidemiology, German Cancer Research Center, Heidelberg, Germany; 10) Division of Cancer Control and Population Sciences, NCI, Bethesda, MD; 11) Epidemiology, University of Hawaii Cancer Center, Honolulu, HI; 12) Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX; 13) Department of Internal Medicine, University of Utah Health Sciences Center, Salt Lake City, UT; 14) Department of Genome Sciences and Howard Hughes Medical Institute, University of Washington, Seattle, WA; 15) Division of Hematology, Faculty of Medicine, The University of Ottawa, Ottawa, ON; 16) Department of Genetics, Stanford University, Stanford, CA; 17) Department of Computer Science, Stanford University, Stanford, CA.

To investigate the role of low-frequency genetic variation in colorectal cancer (CRC) susceptibility, the Genetics and Epidemiology of Colorectal Cancer Consortium (GECCO) and the Colorectal Cancer Family Registry (CCFR) conducted genome-wide association studies (GWAS) of 12,661 CRC cases and 14,361 controls, with imputation using a whole-genome sequence (6x coverage) reference panel of 610 CRC cases and 309 controls. These data provide a unique opportunity to investigate less-frequent and rare variants with minor allele frequencies (MAF) of 0.1-5%, which contribute to the majority of the variation in the genome. However, because statistical power to detect rare-variant associations is limited, we aim to address this limitation by incorporating functional data. To evaluate whether these data would be useful in discovering novel rare-variant associations, we used ENCODE and Roadmap Epigenomics Project data to define hypothesized enhancer, promoter, and other regulatory regions in colon and rectal tissue and assessed whether these regions were enriched in GECCO-CCFR GWAS results for both aggregate rare-variant and common single-variant association tests. To define functional elements across the genome, we used chromatin structure across cell types and tissues to improve the resolution of CRC epigenetic signatures (uniform ChIP-seq signals). We then identified sets of rare variants and single common variants that overlapped with predicted regulatory regions and compared their association p-values with those in non-CRC-regulatory regions across the genome using the Kolmogorov-Smirnov test. We found significant enrichment in regions that overlapped with predicted CRC regulatory regions for both rare variants (p=0.0001) and common variants (p=2.69e-27). In addition, we found that regulatory prediction using high-resolution ChIP-seq data corresponded with in vitro identification of three previously identified CRC GWAS loci (MYC-rs6983267, CDH1-rs16260, and COLCA1-rs7130173). These results suggest that functional insight into cell-type-specific regulatory mechanisms can inform the discovery of genetic associations and may be useful for incorporation into future association testing.
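
The enrichment comparison described above can be illustrated with a minimal Python sketch. It is not the analysis pipeline used in this study: the region coordinates, variant positions, and p-values are synthetic toy data, the function names (overlaps_any, ks_two_sample) are hypothetical, and the p-value uses the plain asymptotic Kolmogorov approximation. The sketch labels each variant by whether it falls in a predicted regulatory interval, then compares the two p-value distributions with a two-sample Kolmogorov-Smirnov test.

```python
import bisect
import math
import random


def overlaps_any(pos, starts, ends):
    """True if pos lies in one of the sorted, non-overlapping [start, end) intervals."""
    i = bisect.bisect_right(starts, pos) - 1
    return i >= 0 and pos < ends[i]


def ks_two_sample(a, b):
    """Two-sample KS statistic D and an asymptotic two-sided p-value."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    d = 0.0
    for x in a + b:
        # Empirical CDFs of both samples evaluated at x.
        fa = bisect.bisect_right(a, x) / n
        fb = bisect.bisect_right(b, x) / m
        d = max(d, abs(fa - fb))
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:
        # The series below converges poorly near zero; the p-value is ~1 here.
        return d, 1.0
    series = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                 for k in range(1, 101))
    return d, min(max(2.0 * series, 0.0), 1.0)


# Toy "regulatory regions" on a 10,000 bp sequence (assumed, not real annotations).
regions = [(1000, 2000), (5000, 6500), (9000, 9500)]
starts = [s for s, _ in regions]
ends = [e for _, e in regions]

rng = random.Random(0)
in_region_p, out_region_p = [], []
for _ in range(2000):
    pos = rng.randrange(0, 10000)
    if overlaps_any(pos, starts, ends):
        # Simulated enrichment: p-values skewed toward zero inside regions.
        in_region_p.append(rng.random() ** 2)
    else:
        # Null-like uniform p-values outside regions.
        out_region_p.append(rng.random())

d_stat, p_value = ks_two_sample(in_region_p, out_region_p)
print(f"D = {d_stat:.3f}, asymptotic p = {p_value:.3g}")
```

In practice an established implementation such as scipy.stats.ks_2samp would likely be preferable to the hand-rolled function here, which is written out only to keep the sketch dependency-free. Variants in linkage disequilibrium are also not independent, which a real enrichment analysis would need to take into account.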
