Whole genome sequencing of 2,850 Central-Northern European type 2 diabetes cases and controls reveals insights into functional mechanisms underlying disease pathogenesis.

K. Gaulton (1), J. Flannick (2), C. Fuchsberger (3), H. M. Kang (3), N. Burtt (2), J. Ferrer (4), M. Stitzel (5), M. Kellis (6), M. McCarthy (1), D. Altshuler (2), M. Boehnke (3), the GoT2D consortium

1) Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK; 2) Medical and Population Genetics, The Broad Institute of Harvard and MIT, Cambridge, MA; 3) Center for Statistical Genetics, University of Michigan, Ann Arbor, MI; 4) Imperial College London, London, UK; 5) National Human Genome Research Institute, Bethesda, MD; 6) Department of Computer Science, MIT, Cambridge, MA.

Type 2 diabetes (T2D) is a complex disorder with incompletely known etiology that affects millions of individuals worldwide. To further our understanding of the genetic factors and biological processes underlying T2D pathogenesis, we carried out whole genome sequencing of 2,850 T2D cases and controls of Central and Northern European origin as part of the GoT2D project. Low-pass (~4x) sequencing identified ~25M single-nucleotide variants, including >99% of those with minor allele frequency (MAF) >0.1% in the sequenced individuals, allowing near-complete evaluation of the contribution of variants in this allele frequency range to T2D risk. Subsequent imputation of variant genotypes into ~35K GWAS samples revealed novel loci harboring common variants approaching genome-wide significance for association with T2D (P < 1x10^-7: HORMAD2, HSD17B12, CENPW) and putative lower-frequency (MAF < 0.05) secondary signals at four known loci: TCF7L2, CCND2, KCNQ1, and CDKAL1 (all P < 1x10^-6). Using this unbiased survey of variation, we then assessed the extent to which broad classes of functional elements contribute to T2D, using regulatory state and transcription factor binding maps from pancreatic islets, adipocytes, and nine ENCODE cell types. Variants overlapping sets of functional elements were tested for enriched association with T2D compared to sets of control variants (matched on genomic properties or in shuffled sites). Common associated variants were collectively enriched at enhancer elements (P = 0.005), and low-frequency associated variants at promoter elements (P = 0.004). We found heterogeneity across cell types: common variants were most prominently enriched at enhancers active in hepatocytes, adipocytes, and pancreatic islets, and at sites bound by specific factors active in these cell types, such as NKX2.2, MAFB, and TCF7L2 (all P < 0.05). Patterns were unchanged when variants within 500 kb of a known GWAS signal were removed, demonstrating that enriched element types can prioritize novel susceptibility loci that do not reach strict genome-wide significance. These results suggest that information about the non-coding genome can provide significant insight into the genetic and biological basis of T2D, and support the central importance of global regulatory mechanisms in specific cell types to disease pathogenesis. More broadly, this study confirms whole genome sequencing as a valuable tool for dissecting the genetic factors and functional mechanisms that contribute to complex disease.
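The enrichment comparison described above (associated variants overlapping functional elements, measured against sets of control variants) can be illustrated with a minimal sketch. The abstract does not specify the test statistic or how control sets were constructed, so everything here is an assumption for illustration: the overlap count as the statistic, the empirical one-sided P value, the function names, and the toy coordinates are all hypothetical and are not the GoT2D pipeline.

```python
import random
from bisect import bisect_right


def overlaps(pos, starts, ends):
    """True if pos lies in any half-open interval [start, end).

    Assumes intervals are sorted by start and do not overlap each other.
    """
    i = bisect_right(starts, pos) - 1
    return i >= 0 and pos < ends[i]


def count_overlaps(positions, starts, ends):
    """Number of variant positions that fall inside an element."""
    return sum(overlaps(p, starts, ends) for p in positions)


def enrichment_p(test_positions, control_sets, starts, ends):
    """Empirical one-sided P value for enrichment (assumed statistic).

    Compares the overlap count of the test variants with the overlap
    counts of caller-supplied control variant sets. The +1 terms keep
    the estimate from ever being exactly zero.
    """
    observed = count_overlaps(test_positions, starts, ends)
    null = [count_overlaps(c, starts, ends) for c in control_sets]
    exceed = sum(n >= observed for n in null)
    return observed, (exceed + 1) / (len(null) + 1)


# Toy data (hypothetical): three "enhancer" intervals on one chromosome.
starts = [100, 500, 900]
ends = [200, 600, 1000]
associated = [150, 550, 950, 300]

# Hypothetical control sets: uniformly random positions. A real analysis
# would match controls on genomic properties, which is not modeled here.
rng = random.Random(0)
controls = [[rng.randrange(0, 1000) for _ in associated] for _ in range(999)]

observed, p = enrichment_p(associated, controls, starts, ends)
print(f"observed overlaps = {observed}, empirical P = {p:.3f}")
```

Because the control sets are passed in by the caller, the same function can be used with either matched controls or shuffled sites; only the construction of `controls` would change.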

You may contact the first author (during and after the meeting) at