Partitioning heritability by functional category using summary statistics.

H. Finucane1,2,6, B. Bulik-Sullivan3,9,11, A. Gusev1,2, G. Trynka3,7, P. Loh1,2, H. Xu2,8, C. Zang2,8, S. Ripke3,9, S. Purcell3,4,5,9, M. Daly3,9, E. Stahl3,4, S. Raychaudhuri3,7, S. Lindstrom1, N. Patterson3,10, B. Neale3,9, A. Price1,2,3, Schizophrenia Working Group of the Psychiatric Genetics Consortium

1) Dept of Epidemiology, Harvard School of Public Health, Boston, MA; 2) Dept of Biostatistics, Harvard School of Public Health, Boston, MA; 3) Broad Institute of MIT and Harvard, Cambridge, MA; 4) Dept of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY; 5) Dept of Genetics and Genomics, Icahn School of Medicine at Mount Sinai, New York, NY; 6) Dept of Mathematics, Massachusetts Institute of Technology, Cambridge, MA; 7) Division of Genetics and Rheumatology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; 8) Dana Farber Cancer Institute, Boston, MA; 9) Analytical and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA; 10) Department of Genetics, Harvard Medical School, Boston, MA; 11) Center for Neurogenomics and Cognitive Research, VU University Amsterdam, Amsterdam, The Netherlands.
Recent work has demonstrated that some functional categories of the genome contribute disproportionately to trait heritability. Partitioning heritability is traditionally done using a variance components approach; however, this approach is not feasible at large sample sizes, and there are many datasets for which only summary statistics are available. Here, we introduce a new method for partitioning heritability that requires only GWAS summary statistics and LD information from a reference panel. Our method takes advantage of the fact that, under standard assumptions about genetic architecture, the expected chi-square statistic at SNP i is linear in the LD Score of SNP i, defined as the sum over SNPs j of r^2(i,j). In previous work (Bulik-Sullivan et al. 2014 bioRxiv), we used this relationship to differentiate between chi-square statistic inflation due to sample structure, which affects the intercept of a regression of chi-square statistic on LD Score, and inflation due to polygenicity, which affects the slope. Here, we use a multivariate regression of chi-square statistic on LD Scores specific to functional categories to partition heritability. Our method is robust to multiple causal variants at a locus and obtains accurate estimates in simulations. On real WTCCC data across seven diseases, it obtains results similar to those of the variance components approach and, in a meta-analysis of these traits, infers significant enrichments for DNaseI hypersensitivity sites (DHS), histone marks, and other functional categories. FANTOM5 enhancers (Andersson et al. 2014 Nature) were the most enriched, with 0.4% of the genome explaining 11.4% (s.e. 2.9%) of heritability (30x enrichment; P = 7e-5). We applied the method to summary statistics from a schizophrenia (SCZ) dataset with 70,100 samples and from a study of type 2 diabetes (T2D) with 69,033 samples.
We found significant enrichment for many functional categories for both diseases; estimated enrichments tended to be much larger for T2D than for SCZ. Additionally, SNPs in fetal DHS regions were 4x enriched over SNPs in non-fetal DHS regions for SCZ (P = 0.001), but not for T2D, which had a non-significant trend in the opposite direction (P = 0.058). Of the ten most enriched cell types for SCZ, three were brain cell types, and CD34 mobilized primary cells and embryonic stem cells were among the significantly enriched cell types.
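The multivariate regression described above can be illustrated with a toy sketch. It is not the authors' implementation. It assumes two disjoint categories, so that the heritability of a category is its per-SNP coefficient times its number of SNPs, and it assumes the model E[chi2_j] = intercept + N * sum over categories C of tau_C * l(j, C), where l(j, C) is the LD Score of SNP j restricted to category C. The function name `partition_heritability`, all variable names, and the simulated numbers are hypothetical. The sketch uses unweighted least squares and omits the regression weights and standard errors that a real analysis would need.

```python
import numpy as np


def partition_heritability(chi2, ld_scores, n_samples, snps_per_category):
    """Regress chi-square statistics on category-specific LD Scores.

    Assumed model: E[chi2_j] = intercept + N * sum_C tau_C * l(j, C).
    Categories are assumed disjoint, so h2(C) = M_C * tau_C.
    """
    chi2 = np.asarray(chi2, dtype=float)
    ld_scores = np.asarray(ld_scores, dtype=float)
    # Design matrix: intercept column, then N * LD Score for each category.
    design = np.column_stack([np.ones(len(chi2)), n_samples * ld_scores])
    coef, *_ = np.linalg.lstsq(design, chi2, rcond=None)
    intercept, tau = coef[0], coef[1:]
    h2_by_category = tau * np.asarray(snps_per_category, dtype=float)
    share = h2_by_category / h2_by_category.sum()
    return intercept, tau, h2_by_category, share


# Toy data (all values invented for illustration).
rng = np.random.default_rng(0)
n_snps, n_samples = 5000, 2000
snps_per_category = np.array([500, 4500])   # small "functional" set vs. the rest
true_tau = np.array([4e-4, 4e-5])           # per-SNP heritability in each category

# Hypothetical category-specific LD Scores, one column per category.
ld_scores = rng.gamma(shape=2.0, scale=[5.0, 40.0], size=(n_snps, 2))

# Expected chi-square under the assumed model, with an intercept of 1.
expected_chi2 = 1.0 + n_samples * ld_scores @ true_tau
# Noisy statistics: a squared normal z-score whose variance is expected_chi2.
observed_chi2 = expected_chi2 * rng.chisquare(df=1, size=n_snps)

intercept, tau, h2, share = partition_heritability(
    observed_chi2, ld_scores, n_samples, snps_per_category
)
# Enrichment = share of heritability divided by share of SNPs.
enrichment = share / (snps_per_category / snps_per_category.sum())
print("intercept:", intercept)
print("h2 by category:", h2)
print("enrichment:", enrichment)
```

In this setup the intercept absorbs inflation that does not scale with LD Score, while the per-category slopes carry the heritability signal, which mirrors the intercept and slope distinction drawn in the abstract.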
You may contact the first author (during and after the meeting) at