Variation data services at NCBI: archives, tools, and curation for research and medicine.
S. Sherry, K. Addess, V. Ananiev, C. Chen, D. Church, M. Feolo, J. Garner, T. Heffron, D. Hoffman, M. Kholodov, A. Kitts, J. Lee, J. Lopez, D. Maglott, R. Maiti, L. Phan, G. Riley, W. Rubinstein, D. Rudnev, Y. Shao, E. Shekhtman, K. Sirotkin, D. Slotta, R. Tully, R. Villamarin-Salomon, Q. Wang, M. H. Ward, H. Zhang.
National Center for Biotechnology Information, National Institutes of Health, Bethesda, MD.
NCBI operates several archives for sequence variation data that are increasingly relevant for medical research. These archives contain general variation data in the public domain and high-density surveys of genetic diversity in medical study populations. Small-scale variations are accessioned and distributed through dbSNP at http://www.ncbi.nlm.nih.gov/snp/, and larger structural variations (>50 bp) are accessioned and distributed through dbVar at http://www.ncbi.nlm.nih.gov/dbvar/. Research participant genotypes, participant phenotypes, and analysis results are distributed through the dbGaP controlled-access system at http://www.ncbi.nlm.nih.gov/dbgap/. Assertions of clinical significance for variants and alleles are accessioned and distributed through ClinVar at http://www.ncbi.nlm.nih.gov/clinvar/.
Tools for exploring and visualizing variation data include the 1000 Genomes browser at http://www.ncbi.nlm.nih.gov/variation/tools/1000genomes/, the 1000 Genomes data slicer at http://trace.ncbi.nlm.nih.gov/Traces/1kg_slicer/, and the phenotype-genotype data integrator at http://www.ncbi.nlm.nih.gov/gap/PheGenI. The clinical remap tool at http://www.ncbi.nlm.nih.gov/genome/tools/remap#tab=rsg provides sequence coordinates for variations on a clinical RefSeqGene record. The variation reporter service at http://www.ncbi.nlm.nih.gov/variation/tools/reporter lists the known variants and their functional consequences for either a region of interest supplied in BED format or a set of variants supplied in HGVS or GVS format.
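To illustrate the BED input mentioned above, the following is a minimal sketch of turning a region of interest into a BED line. It assumes the region is known in 1-based, inclusive coordinates; BED itself uses a 0-based start and an exclusive end. The function name `region_to_bed` and the example coordinates are hypothetical and are not part of any NCBI tool.

```python
def region_to_bed(chrom: str, start_1based: int, end_1based: int, name: str = "") -> str:
    """Format a 1-based, inclusive region as a tab-separated BED line.

    Hypothetical helper for illustration only. BED stores a 0-based start
    and an exclusive end, so only the start coordinate shifts by one.
    """
    if start_1based < 1 or end_1based < start_1based:
        raise ValueError("expected 1 <= start <= end")
    fields = [chrom, str(start_1based - 1), str(end_1based)]
    if name:
        # The optional fourth BED column is a free-text feature name.
        fields.append(name)
    return "\t".join(fields)


if __name__ == "__main__":
    # Illustrative coordinates only, not taken from a real study.
    print(region_to_bed("chr1", 1001, 2000, "region_of_interest"))
```

A file of such lines, one region per line, is the general shape of BED input; the exact options accepted by the reporter service should be checked against its own documentation.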
The presentation will also report the results of NCBI's participation in the GET-RM consortium to establish standardized sequence data for next-generation sequencing in clinical laboratories and to identify lists of variants with clinical phenotypes and lists of common variants not known to be medically important. The latter are frequently used to filter normal variation from next-generation sequencing results.
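The filtering step described above can be sketched as a simple set lookup. This is only an illustration under stated assumptions: variants are represented as (chromosome, position, reference allele, alternate allele) tuples that have already been normalized to the same coordinate system, and the names `Variant` and `filter_common` and all example values are hypothetical rather than drawn from GET-RM or NCBI data.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical representation: (chromosome, 1-based position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def filter_common(observed: Iterable[Variant], common: Set[Variant]) -> List[Variant]:
    """Return the observed variants that are absent from the common-variant list.

    Assumes both inputs use the same assembly and allele normalization;
    otherwise identical variants could fail to match.
    """
    return [v for v in observed if v not in common]


if __name__ == "__main__":
    # Made-up values for illustration only.
    common_list = {("chr1", 1000, "A", "G"), ("chr2", 5000, "C", "T")}
    calls = [("chr1", 1000, "A", "G"), ("chr3", 42, "G", "GA")]
    print(filter_common(calls, common_list))
```

Exact tuple matching is the simplest possible rule; a production pipeline would typically also handle differing allele representations before comparing.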
You may contact the first author (during and after the meeting) at