Combining regulatory domain and genetic variation information to identify cell types, regulatory elements, and causal genetic variants that influence human disease. E. Schmidt¹, J. Chen¹, C. Willer¹,², Metabochip GIANT-BMI and ICBP. 1) Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI; 2) Cardiovascular Medicine, University of Michigan, Ann Arbor, MI.
Examining trait-associated variants from genome-wide association studies (GWAS) relative to genomic regions of functional importance can give insight into the molecular mechanisms leading to disease phenotypes. We hypothesize that trait-associated variants affect transcriptional regulation in particular cell types, and we test whether these variants are enriched in regulatory domains in the relevant cell types. In addition, we aim to prioritize GWAS variants or their proxies as functional candidates based on their overlap with regulatory domains.
We evaluate enrichment of GWAS SNPs for blood pressure (BP), body mass index (BMI), coronary artery disease (CAD), lipids, and type 2 diabetes (T2D). For each trait, we group variants in linkage disequilibrium (LD) with trait-associated index SNPs and determine the overlap of these SNPs with DNase hypersensitivity (DNase HS) sites from 213 cell types obtained from the ENCODE and Roadmap Epigenomics Projects. BED files containing regions of chromatin accessibility identified by DNase-seq are used to identify overlap. We compare the observed overlap with that of permuted sets of SNPs from the 1000 Genomes data, which match the index SNPs for: i) number of SNPs in high LD (r² > 0.7), ii) exact minor allele frequency, and iii) exact distance to nearest gene. For cell types in which we see significant enrichment in DNase HS sites, we further investigate enrichment using functional elements such as histone methylation marks, FAIRE, and ChIP-seq transcription factor (TF) binding from ENCODE, as well as functional chromatin states defined by Ernst et al. (2011). Lastly, we annotate the overlap of individual SNPs with the significant regulatory marks examined above, as well as with expression quantitative trait loci (eQTL) in relevant tissues.
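The overlap and permutation steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the rule that a locus counts once if any of its SNPs falls in a DNase HS site, and the one-sided empirical P-value with a +1 correction are all assumptions made for illustration. Drawing the matched control SNP sets from 1000 Genomes is not shown; the sketch assumes their overlap counts are already available as a list.

```python
import bisect

def build_index(intervals):
    """Merge BED-style half-open intervals [start, end) per chromosome.

    `intervals` is an iterable of (chrom, start, end) tuples, as would be
    read from a DNase-seq peak BED file (hypothetical input format).
    """
    by_chrom = {}
    for chrom, start, end in intervals:
        by_chrom.setdefault(chrom, []).append((start, end))
    index = {}
    for chrom, ivs in by_chrom.items():
        ivs.sort()
        merged = []
        for s, e in ivs:
            if merged and s <= merged[-1][1]:
                merged[-1][1] = max(merged[-1][1], e)
            else:
                merged.append([s, e])
        # Parallel sorted lists allow a binary search per SNP.
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index

def snp_overlaps(index, chrom, pos):
    """Return True if the 0-based position `pos` lies in any interval."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    i = bisect.bisect_right(starts, pos) - 1
    return i >= 0 and pos < ends[i]

def count_loci_overlapping(index, loci):
    """Count loci with at least one overlapping SNP.

    Each locus is a list of (chrom, pos) tuples: the index SNP plus its
    LD proxies. Counting per locus rather than per SNP is an assumption.
    """
    return sum(
        any(snp_overlaps(index, chrom, pos) for chrom, pos in locus)
        for locus in loci
    )

def empirical_p(observed, null_counts):
    """One-sided empirical P-value with a +1 correction (assumed form)."""
    exceed = sum(1 for n in null_counts if n >= observed)
    return (exceed + 1) / (len(null_counts) + 1)
```

In use, `build_index` would be called once per cell type, `count_loci_overlapping` once for the trait loci and once for each matched control set, and `empirical_p` on the resulting counts. An empirical P-value of this form cannot be smaller than 1/(N+1) for N control sets, so the very small P-values reported in the abstract presumably come from a different test statistic, which the abstract does not specify.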
We find evidence of significant enrichment in DNase HS sites for each set of trait-associated variants tested: lipid variants in leukemia cells (P = 2×10⁻¹⁴); BMI variants in olfactory neurosphere-derived cells (P = 2×10⁻⁵); BP variants in osteoblasts (P = 2×10⁻⁷); CAD variants in hepatocellular carcinoma cells (P = 6×10⁻⁷); and T2D variants in colorectal carcinoma cells (P = 1×10⁻¹³). Our method can be applied to other traits to: i) characterize the cell types in which GWAS variants may exert their effect; ii) identify regulatory elements or TFs that may be affected; and iii) fine-map or prioritize functional variants at specific loci.
You may contact the first author (during and after the meeting) at