Inferring and sequencing the founding bottleneck of Ashkenazim.
I. Pe'er (1), S. Carmi (1), S. Mukherjee (2), N. Parlamee (3), M. Bowen (4), K. Hui (4), V. Joseph (5), P. F. Palamara (1), L. Ozelius (6), I. Peter (6), A. Darvasi (7), K. Offit (5), H. Ostrer (8), J. Cho (4), L. Clark (3), G. Atzmon (8), T. Lencz (2), The Ashkenazi Genome Consortium
1) Dept. of Computer Science, Columbia Univ., New York, NY; 2) Dept. of Psychiatry Research, Feinstein Institute of Medical Research, Manhasset, NY; 3) Dept. of Pathology, Columbia Univ., New York, NY; 4) Dept. of Genetics, Yale University, New Haven, CT; 5) Cancer Biology and Genetics Program, Memorial Sloan Kettering Cancer Center, New York, NY; 6) Dept. of Genetics and Genomics Sciences, Mt. Sinai School of Medicine, New York, NY; 7) Dept. of Genetics, Hebrew University, Jerusalem, Israel; 8) Dept. of Genetics, Albert Einstein College of Medicine, Bronx, NY.
The Ashkenazi Jewish (AJ) population, currently numbering ~10 million individuals, has long been recognized as genetically isolated and therefore advantageous for genetic studies. The recent availability of GWAS data on thousands of AJ samples allows us to quantify the isolation of this group and to evaluate its utility for sequencing studies, before pursuing such whole-genome sequencing (WGS) efforts. To perform this evaluation, we have developed novel methodology for inferring population genetic history from the distribution of lengths of, and recent mutations in, segments that are identical by descent (IBD), as observed in sequencing and SNP array data. We show this methodology to be uniquely effective in reconstructing recent demography, compared to previous methods that focus more on prehistoric times. Applying it to data from self-identified AJ samples, we show that 85-90% of them belong to a genetic isolate related to other Middle Eastern populations. This group experienced an extreme bottleneck 30-35 generations ago, with subsequent expansion greatly exceeding the growth rate across all humans. The data are consistent with a bottleneck size of merely 400 founders. AJs are therefore a relatively large group that is tractable for current sequencing throughput, with favorable study-size economics compared to other populations: several hundred sequenced individuals are expected to provide IBD segments sufficient to impute all common and rare variants in millions of personal AJ genomes, save the hundreds of variants in each personal genome that are due to mutations in modern times. The Ashkenazi Genome Consortium (TAGC) has taken on this task, with Phase I of the project now in progress, including 137 complete genomes of multiple-disease controls. Pilot TAGC samples show favorable QC measures (Ti/Tv = 2.15 ± 0.003).
We observe the total number of variants to be consistent with other European populations sequenced using the same platform and pipeline (SNV heterozygosity of 7.1×10⁻⁴), but with a significant increase in the fraction of novel heterozygous variants, both for SNVs (24% increase, p<0.0017) and for other variant types, as expected for a rapidly expanding isolated population that is underrepresented in SNP databases. Variants detected in AJ samples but absent from a same-size group of European samples on the same platform tend to be shared across AJ samples more than European-detected variants that are absent in AJ are shared across European samples. This is consistent with a demographic effect of the bottleneck in AJs, rather than with sequencing artifacts.
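As a rough illustration of the expansion implied by the figures quoted above (about 400 founders, 30-35 generations, ~10 million individuals today), the following back-of-envelope sketch computes the average per-generation growth factor. It is not the IBD-based inference method described in the abstract. It assumes constant exponential growth and treats the founder count and the present-day census size as directly comparable, both of which are simplifications; the function name is hypothetical.

```python
import math


def per_generation_growth(n_founders, n_current, generations):
    """Constant per-generation growth factor that takes n_founders to n_current.

    Assumes simple exponential growth: n_current = n_founders * factor ** generations.
    """
    return (n_current / n_founders) ** (1.0 / generations)


FOUNDERS = 400          # bottleneck size quoted in the abstract
CURRENT = 10_000_000    # approximate present-day population quoted in the abstract

for g in (30, 35):      # bottleneck timing range quoted in the abstract
    factor = per_generation_growth(FOUNDERS, CURRENT, g)
    # Number of generations needed for the population to double at this rate.
    doubling = math.log(2) / math.log(factor)
    print(f"{g} generations: x{factor:.2f} per generation, "
          f"doubling every {doubling:.1f} generations")
```

Under these assumptions the population grows by a factor of roughly 1.34 to 1.40 per generation, which gives a sense of the scale of the expansion the abstract describes.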