Using enhancer activity regulatory motifs to explore evolutionary trajectories and disease mechanisms. L. D. Ward1,2, W. Meuleman1,2, P. Kheradpour1,2, A. Kundaje1,2, M. Kellis1,2, Roadmap Epigenomics Mapping Consortium. 1) CSAIL, Massachusetts Institute of Technology, Cambridge, MA; 2) Broad Institute of MIT and Harvard, Cambridge, MA.

   The broad range of cell types being studied by the Roadmap Epigenomics Mapping Consortium allows us to apply techniques we have previously developed to (a) explore the relative strength of ancient and recent selection on the regulatory programs in different cell types; (b) discover regulatory motifs associated with the tissue specificity of enhancers; and (c) generate hypotheses about the regulatory mechanisms underlying disease. We used k-means clustering to classify distal enhancers by their cross-tissue activity. These clusters were more enriched for measures of inter- and intra-species sequence constraint than single-tissue annotations. The most mammalian-conserved clusters were a cluster of enhancers strongest in fetal brain, lung, and kidney and in adult brain (14.2% conserved) and a cluster of constitutively poised enhancers active in pancreas, spleen, and gastric tissue (13.9% conserved). We then examined derived allele frequencies of SNPs in the 1000 Genomes Phase 1 data to infer the strength of negative selection on the human lineage, improving on our previous methodology to account for ascertainment biases due to read depth. Human constraint was strongest at a cluster of ubiquitous enhancers located near housekeeping genes and at a cluster of enhancers active in mesenchymal-derived cells with skin-related gene enrichment. Interestingly, a high derived allele frequency was associated with enhancers that have a poised signature in fibroblasts and skeletal muscle and are active in fetal and adult brain. These enhancers lie near genes annotated as regulating interneuron differentiation and neural tube patterning, suggesting a developmental program targeted by recent positive selection. We then used enrichment analysis to discover putative regulatory motifs that distinguish enhancers belonging to different activity clusters. 
These regulatory motifs allow us to (a) pinpoint signals of selection within enhancers and (b) better dissect disease-associated haplotypes to develop hypotheses about causal motif-altering variants. We have incorporated these maps of "driver" motif instances into the latest version of our online genome annotation tool HaploReg, which allows sets of haplotype blocks from genetic studies to be visualized directly with ENCODE and Roadmap regulatory elements, results from other GWAS and eQTL/meQTL studies, and disruption and creation of regulatory motifs. HaploReg also performs systems-level enrichment analyses on GWAS against these regulatory features.
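
As an illustration of the kind of enrichment analysis mentioned above, the following minimal sketch tests whether a motif is over-represented among the enhancers of one activity cluster relative to all enhancers. The abstract does not specify the statistical test used, so the one-sided hypergeometric test here is an assumption, and the function name `motif_enrichment` and all counts are hypothetical.

```python
from math import comb


def motif_enrichment(n_total, n_with_motif, n_cluster, n_cluster_with_motif):
    """Return (fold_enrichment, p_value) for a motif in one enhancer cluster.

    Hypothetical helper, not the authors' method. Arguments:
      n_total              -- number of enhancers in the background set
      n_with_motif         -- background enhancers containing the motif
      n_cluster            -- enhancers in the cluster of interest
      n_cluster_with_motif -- cluster enhancers containing the motif
    """
    if n_with_motif <= 0 or n_cluster <= 0 or n_total <= 0:
        raise ValueError("counts must be positive")
    if n_with_motif > n_total or n_cluster > n_total:
        raise ValueError("subset counts cannot exceed n_total")

    # Fold enrichment: observed motif count over the count expected by chance.
    expected = n_cluster * n_with_motif / n_total
    fold = n_cluster_with_motif / expected

    # One-sided hypergeometric tail: P(X >= observed) when n_cluster enhancers
    # are drawn without replacement from the background.
    max_k = min(n_with_motif, n_cluster)
    tail = sum(
        comb(n_with_motif, k) * comb(n_total - n_with_motif, n_cluster - k)
        for k in range(n_cluster_with_motif, max_k + 1)
    )
    p_value = tail / comb(n_total, n_cluster)
    return fold, p_value


# Made-up counts: 1000 enhancers, 100 with the motif, a cluster of 50
# enhancers of which 15 contain the motif (5 expected by chance).
fold, p_value = motif_enrichment(1000, 100, 50, 15)
print(f"fold enrichment = {fold:.1f}, p = {p_value:.2e}")
```

In practice, a test like this would be repeated for every motif and cluster pair and followed by a multiple-testing correction; the choice of background set (all enhancers versus matched control sequences) is a further assumption that can change the result substantially.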
