A Unified Analysis Framework for Detecting Genetic Variations from Next-Generation Sequencing Data. C. Xiao, S. Sherry. NIH/NLM/NCBI, Bethesda, MD.
The per-base cost of whole-genome sequencing has dropped significantly thanks to recent advances in next-generation sequencing (NGS) technologies. A number of large-scale re-sequencing projects, e.g. 1000 Genomes, ICGC, TCGA, GO-ESP, and CGCI, have been initiated to extend our knowledge of single nucleotide polymorphisms (SNPs), short insertions/deletions (INDELs), and structural variations (SVs), and to relate these variants to human diseases. However, generating and analyzing the data in a timely fashion presents numerous challenges to researchers. To facilitate NGS data analysis in biomedical research, NCBI is developing an integrated analysis framework (VPIPE) that profiles genetic mutations from NGS data in a uniform manner. The pipeline manages parallel-computing resources, aligns short reads to the reference genome sequences, refines the mapping of placed reads, calls SNPs, INDELs, and SVs, and performs de novo assembly and functional annotation according to data availability and project-specific policies. Implementing the pipeline centrally streamlines the processing workflow for data generated by NGS technologies.
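The stage ordering described above can be illustrated with a minimal sketch. It is not VPIPE's implementation: the stage names paraphrase the abstract, while the `Project` class, its `has_reference` and `disabled` fields, and the `plan_stages` function are hypothetical names introduced only for illustration. The rule that reference-based stages are skipped when no reference genome is available is likewise an assumption about what "according to data availability" might mean.

```python
from dataclasses import dataclass, field

# Stage names paraphrase the abstract; the ordering is an assumption.
STAGES = [
    "align_reads",
    "refine_mapping",
    "call_snps",
    "call_indels",
    "call_svs",
    "de_novo_assembly",
    "functional_annotation",
]

# Hypothetical: stages assumed to run without a reference genome.
REFERENCE_FREE = {"de_novo_assembly", "functional_annotation"}


@dataclass
class Project:
    """Hypothetical description of one sequencing project."""
    name: str
    has_reference: bool = True  # data availability (assumed flag)
    disabled: set = field(default_factory=set)  # project-specific policy


def plan_stages(project):
    """Return the ordered list of stages to run for one project."""
    plan = []
    for stage in STAGES:
        if stage in project.disabled:
            continue  # switched off by project policy
        if not project.has_reference and stage not in REFERENCE_FREE:
            continue  # assumed: reference-based stages need a reference
        plan.append(stage)
    return plan
```

For example, `plan_stages(Project("demo", disabled={"call_svs"}))` would return every stage except SV calling, in pipeline order.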
You may contact the first author (during and after the meeting) at