Simultaneous estimation of population size changes and split times from population-level resequencing studies. M. Forest, J. Marchini, S. Myers Department of Statistics, University of Oxford, Oxford, United Kingdom.
In the quest to understand human evolution, key questions concern when different human populations diverged and how their sizes varied through time. In the recent past, genetic data have proven to offer insight complementary to archaeological discoveries regarding population histories. Large genomic projects now offer access to high-quality data from a vast number of populations (e.g., the 1000 Genomes Project (1KGP)). Our aim is to study such population structure using large samples of genomic sequencing data from different populations. We have developed an approach that builds trees at thousands of loci and uses them to infer demographic history, allowing for arbitrary population splits and size changes over a series of epochs in the past, a previously unsolved problem. By jointly analysing hundreds of individuals, we show by simulation and real-world application that we can accurately estimate population separation times and sizes from only a few thousand to hundreds of thousands of years in the past. Our approach extends the Stephens and Donnelly importance sampler to allow estimation of divergence times and population sizes. The method jointly utilises data from multiple regions that show very low levels of recombination (cold spots), covering hundreds of megabases in total. An iterative scheme allows us to: (1) obtain point estimates of the likelihood function using the coalescent process and the built genealogies, and (2) obtain maximum likelihood estimates of the effective population sizes using the previously built genealogies. The population sizes are modelled as piecewise constant and are allowed to vary freely between epochs. An optimization algorithm allows us to quickly find the maximum of the estimated likelihood function. We have applied the method to different populations from the 1KGP using more than 2,000 cold regions of the genome (average length of 30 kb).
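To illustrate the piecewise-constant size model, here is a minimal sketch of the maximum likelihood step for a single population with known genealogies. It is not the authors' implementation. It assumes a diploid population in which each pair of lineages coalesces at rate 1/(2N) per generation, it weights all genealogies equally (the importance-sampling weights are omitted), and it ignores population splits. Under these assumptions the estimate for each epoch has a closed form, so no numerical optimizer is needed. All function and variable names are hypothetical.

```python
from math import comb, log

def epoch_stats(coal_times, n_samples, epoch_starts):
    """Per-epoch sufficient statistics for one genealogy.

    coal_times   : increasing coalescence times in generations (n_samples - 1 values)
    epoch_starts : increasing epoch start times, beginning at 0; the last epoch is open-ended
    Returns (events, pair_time): coalescence counts and pair-weighted time per epoch.
    """
    n_epochs = len(epoch_starts)
    events = [0] * n_epochs
    pair_time = [0.0] * n_epochs
    k, t_prev = n_samples, 0.0
    for t in coal_times:
        for j in range(n_epochs):
            lo = epoch_starts[j]
            hi = epoch_starts[j + 1] if j + 1 < n_epochs else float("inf")
            # Time spent with k lineages inside epoch j, weighted by the number of pairs.
            overlap = min(t, hi) - max(t_prev, lo)
            if overlap > 0:
                pair_time[j] += comb(k, 2) * overlap
            if lo <= t < hi:
                events[j] += 1
        k -= 1
        t_prev = t
    return events, pair_time

def mle_sizes(genealogies, n_samples, epoch_starts):
    """Closed-form estimate of N per epoch, pooling independent loci (assumed equally weighted)."""
    n_epochs = len(epoch_starts)
    events = [0] * n_epochs
    pair_time = [0.0] * n_epochs
    for coal_times in genealogies:
        e, s = epoch_stats(coal_times, n_samples, epoch_starts)
        events = [a + b for a, b in zip(events, e)]
        pair_time = [a + b for a, b in zip(pair_time, s)]
    # Maximising -c*log(2N) - S/(2N) gives N = S / (2c); undefined when no event falls in the epoch.
    sizes = [s / (2 * c) if c > 0 else None for c, s in zip(events, pair_time)]
    return sizes, events, pair_time

def log_likelihood(sizes, events, pair_time):
    """Coalescent log-likelihood (up to an additive constant) for piecewise-constant sizes."""
    return sum(-c * log(2 * n) - s / (2 * n) for n, c, s in zip(sizes, events, pair_time))

# Toy input: two loci, three sampled lineages, epochs [0, 200) and [200, infinity).
toy_genealogies = [[100.0, 300.0], [50.0, 500.0]]
sizes, events, pair_time = mle_sizes(toy_genealogies, n_samples=3, epoch_starts=[0.0, 200.0])
```

In the method described above, the genealogies are themselves sampled and the likelihood is only estimated, so this closed form stands in for the optimization step under much simpler conditions.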
By analysing many pairs of populations using 1KGP sequencing data, we elucidate details of the relationships among multiple human groups, and changes in their effective population sizes, from a few thousand years ago. Our results unify and extend previous findings on the split times between European groups and among Europe, Africa and Asia; on shared and non-shared bottlenecks in out-of-Africa groups; on expansions following population separations; and on the sizes of ancestral populations further back in time.