Alignment to an Ancestry-Specific Reference Genome Discovers Additional Variants Among 1000 Genomes ASW Cohort. R. A. Neff, J. Vargas, G. H. Gibbons, A. R. Davis. Cardiovascular Disease Section, GMCID, National Human Genome Research Institute, Bethesda, MD.
Whole-genome sequencing studies of certain populations, such as those with African ancestry, are often underpowered because of greater divergence between the common reference genome and the true genetic sequence of the population. A common reference genome is not designed to account for this divergence in population-specific studies. Strong signals from common (MAF > 50%) single nucleotide polymorphisms (SNPs), insertion-deletions (indels), and structural variants (SVs) can make alignment and variant calling difficult by masking nearby variants with weaker genetic signals. We present results from alignment to a population-specific reference genome for individuals of African descent, constructed by applying variants present in a majority of individuals of African descent from all phases of the 1000 Genomes Project and the International HapMap Consortium. We identified 882,826 SNPs, indels, and SVs present at MAF > 50% in the population, representing 2.39 Mb of genetic variation changed from hg19. We demonstrate that use of a population-specific reference improves variant call quality, coverage level, and imputation accuracy. We compared alignment of 27 samples from the African-American SW (ASW) population in the 1000 Genomes Project Phase 1 between the population-specific reference and the hg19 reference. Alignment to the population-specific reference yielded an additional 443,036 SNPs in the union across all samples, including thousands of non-synonymous exonic variants that are clinically relevant to the study of disease.
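As an illustration of the general idea of substituting majority alleles into a reference, the following minimal Python sketch applies every variant whose alternate-allele frequency exceeds 50% to a reference sequence. It is not the authors' pipeline: the `Variant` type, the function name `build_population_reference`, the 0-based coordinates, the toy sequence, and the assumption that variants do not overlap are all hypothetical choices made for this example.

```python
from typing import Iterable, NamedTuple


class Variant(NamedTuple):
    """Hypothetical variant record used only for this sketch."""
    pos: int         # 0-based start position on the reference
    ref: str         # allele carried by the reference
    alt: str         # alternate allele
    alt_freq: float  # alternate-allele frequency in the population


def build_population_reference(
    reference: str, variants: Iterable[Variant], threshold: float = 0.5
) -> str:
    """Return the reference with every majority allele substituted.

    Assumes the variants do not overlap one another.
    """
    majority = [v for v in variants if v.alt_freq > threshold]
    seq = reference
    # Apply from right to left so that indels do not shift the
    # coordinates of variants that have yet to be applied.
    for v in sorted(majority, key=lambda v: v.pos, reverse=True):
        if seq[v.pos:v.pos + len(v.ref)] != v.ref:
            raise ValueError(f"reference allele mismatch at position {v.pos}")
        seq = seq[:v.pos] + v.alt + seq[v.pos + len(v.ref):]
    return seq


toy_reference = "ACGTACGTAC"
toy_variants = [
    Variant(pos=1, ref="C", alt="T", alt_freq=0.8),   # majority SNP: applied
    Variant(pos=4, ref="AC", alt="A", alt_freq=0.6),  # majority deletion: applied
    Variant(pos=8, ref="A", alt="G", alt_freq=0.3),   # minority SNP: skipped
]
population_reference = build_population_reference(toy_reference, toy_variants)
```

In this toy case the two majority variants are written into the sequence and the minority SNP is left as the reference allele, so reads carrying the common alleles would match the new reference exactly at those sites.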