Genetic estimation of biogeographical ancestry. C.L. Pfaff, E.J. Parra, M.D. Shriver. Pennsylvania State University, University Park, PA.
Ethnicity comprises both biological and cultural components. Biogeographical ancestry (BGA) is the biologically determined component of ethnicity, and it can be estimated using genetic markers whose allele frequencies differ markedly among the populations in question (population-associated alleles, PAAs). We have developed a method that uses a maximum likelihood (ML) approach to estimate the primary source population(s) of an unknown DNA sample. Once the potential source populations have been narrowed to the two or three with the highest log likelihood ratios (LLR), individual admixture proportions are estimated from the observed multilocus genotype to characterize the proportional ancestry of the sample.
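The two steps described above (ranking candidate source populations by likelihood, then estimating an individual admixture proportion) might be sketched as follows. This is a minimal illustration rather than the authors' implementation. It assumes biallelic markers, Hardy-Weinberg proportions, independent loci, base-10 logarithms for the LLR, and a simple grid search over the admixture proportion. The allele frequencies and the function names are hypothetical.

```python
import math

# Hypothetical reference-allele frequencies at three biallelic markers.
# These values are illustrative only and are not taken from the study.
POP_FREQS = {
    "African": [0.90, 0.10, 0.80],
    "European": [0.10, 0.85, 0.15],
}


def log10_likelihood(genotype, freqs):
    """Log10 probability of a multilocus genotype given allele frequencies.

    genotype[i] is the number of reference alleles (0, 1, or 2) at locus i.
    Assumes Hardy-Weinberg proportions and independent loci.
    """
    total = 0.0
    for count, p in zip(genotype, freqs):
        p = min(max(p, 1e-6), 1.0 - 1e-6)  # avoid log(0) for fixed alleles
        q = 1.0 - p
        if count == 2:
            prob = p * p
        elif count == 1:
            prob = 2.0 * p * q
        else:
            prob = q * q
        total += math.log10(prob)
    return total


def best_population(genotype, pop_freqs):
    """Return (most likely population, LLR against the runner-up)."""
    ranked = sorted(
        ((log10_likelihood(genotype, f), pop) for pop, f in pop_freqs.items()),
        reverse=True,
    )
    (top_ll, top_pop), (second_ll, _) = ranked[0], ranked[1]
    return top_pop, top_ll - second_ll


def admixture_proportion(genotype, freqs_a, freqs_b, steps=100):
    """Grid-search ML estimate of the ancestry fraction m from population A.

    Assumed model: the individual's expected allele frequency at each locus
    is the mixture m * freqs_a + (1 - m) * freqs_b.
    """
    best_m, best_ll = 0.0, -math.inf
    for i in range(steps + 1):
        m = i / steps
        mixed = [m * pa + (1.0 - m) * pb for pa, pb in zip(freqs_a, freqs_b)]
        ll = log10_likelihood(genotype, mixed)
        if ll > best_ll:
            best_m, best_ll = m, ll
    return best_m
```

With these hypothetical frequencies, `best_population([2, 0, 2], POP_FREQS)` assigns the genotype to the "African" entry with an LLR above 3, while a genotype that is heterozygous at every marker yields an intermediate value from `admixture_proportion`.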
We have explored the potential of this method by examining the multilocus genotypes of African, European, and African-American DNA samples using a panel of 10 PAAs that have high allele frequency differences between Africans and Europeans. In each of the 906 African and European samples, BGA was correctly estimated using the maximum likelihood approach. In 863 cases the LLR for the estimation was > 3 (avg. LLR = 4.7), indicating strong confidence in the estimation of ancestry. However, as expected, the ML estimate is less precise for African-American samples. In these cases the inaccurate and low-confidence estimates tend to be for individuals with relatively high admixture proportions, which makes population distinctions more difficult. A second source of inaccuracy is the relatively low number of informative markers currently available. To examine this second source, we simulated 2000 individuals with multilocus genotypes at 20 loci. Ancestry estimation was correct for every simulated individual, and only 2 individuals had an LLR < 3. While the utility of this method is currently limited by the restricted number of PAAs available for various populations, it is clear that as larger numbers of ancestry-informative markers become available, estimation of BGA may become a powerful tool for the elucidation of an individual's genetic and population history, as well as for the identification of unknown samples in forensic cases.
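A simulation of the kind mentioned above (2000 individuals genotyped at 20 loci) could be sketched as follows. The abstract does not give the simulation details, so everything specific here is an assumption: unadmixed individuals drawn from two hypothetical populations "A" and "B", the same reference-allele frequency at every locus within a population (0.8 and 0.2), Hardy-Weinberg proportions, independent loci, and base-10 logarithms. The function names are hypothetical.

```python
import math
import random


def log10_likelihood(genotype, p):
    """Log10 genotype probability when every locus has reference-allele frequency p."""
    q = 1.0 - p
    probs = {0: q * q, 1: 2.0 * p * q, 2: p * p}
    return sum(math.log10(probs[count]) for count in genotype)


def simulate(n_individuals=2000, n_loci=20, p_a=0.8, p_b=0.2, seed=0):
    """Simulate unadmixed individuals and classify each by likelihood.

    Returns (fraction assigned to the true population,
             number of individuals with |LLR| < 3).
    """
    rng = random.Random(seed)
    correct = 0
    low_confidence = 0
    for i in range(n_individuals):
        # Alternate the true source population (hypothetical design choice).
        true_pop = "A" if i % 2 == 0 else "B"
        p_true = p_a if true_pop == "A" else p_b
        # Each genotype is the sum of two independent allele draws per locus.
        genotype = [
            (rng.random() < p_true) + (rng.random() < p_true)
            for _ in range(n_loci)
        ]
        llr_a_vs_b = log10_likelihood(genotype, p_a) - log10_likelihood(genotype, p_b)
        assigned = "A" if llr_a_vs_b > 0 else "B"
        correct += assigned == true_pop
        low_confidence += abs(llr_a_vs_b) < 3
    return correct / n_individuals, low_confidence
```

Calling `simulate()` returns the classification accuracy and the count of low-confidence assignments for one random seed. Lowering `n_loci` or narrowing the gap between `p_a` and `p_b` shows how confidence falls as markers become fewer or less informative.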