Optimized Exome Sequencing for Discovery Research: Improved Metrics and Methods to Enhance Variant Discovery Across the Biomedical Footprint of the Genome. M. Pratt, S. Luo, G. Bartha, J. Harris, N. Leng, C. Haudenschild, R. Chen, J. West. Personalis, Menlo Park, CA, USA.

   Whole exome sequencing (WES) is a broadly used technique for variant detection and discovery in a wide range of research study types, yet metrics that measure the system-level sensitivity of these assays (as opposed to metrics such as average depth of coverage) are rarely used. We have developed alternative methods for assessing variant detection sensitivity, and have optimized an augmented exome protocol to efficiently detect variants across a large biomedical footprint by leveling coverage and targeting minimum rather than average read depth. We compared multiple capture and sequencing protocols to determine a preferred assay configuration for efficiently discovering variants in research studies across the biomedically relevant content of the genome. Using a standard sample (NA12878) run at high depth on two different exome platforms, a titration series of data sets of decreasing sequencing volume was created by randomly downsampling reads. Variant detection was performed with the Personalis pipeline for all assays. Variant calls were evaluated over a prioritized content set including clinical genes, genes with phenotype associations, UTRs, splice and non-coding clinical variants, GWAS variants, and highly conserved loci near previously identified biomedical content. Variant detection sensitivity was determined by comparison to an internal gold set of variants previously characterized within this sample by repeated high-depth whole genome sequencing on multiple platforms. In addition, we used the NIST Genome in a Bottle call set on a reduced footprint for confirmation. Using an empirically determined sensitivity-by-depth function, we assessed the effective sensitivity and discovery footprint of each configuration and identified an optimum. We found that the augmented exome was the most efficient assay for variant discovery at almost all levels of sequence data analyzed. We also derived the shape of the variant discovery curves for a standard vs. augmented exome, and used this to estimate the optimal use of sequencing data across a large sample set for cost-effective variant discovery.
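
The downsampling titration and the sensitivity-by-depth evaluation described above can be illustrated with a minimal Python sketch. It is not the Personalis pipeline: the function names, the step-table representation of the sensitivity-by-depth function, and every numeric value are hypothetical placeholders chosen only to show why leveling coverage (raising minimum depth) can outperform a higher-variance assay with the same average depth.

```python
import random


def downsample(reads, fraction, seed=0):
    """Randomly keep each read with probability `fraction` (hypothetical helper)."""
    rng = random.Random(seed)
    return [read for read in reads if rng.random() < fraction]


def sensitivity_at_depth(depth, table):
    """Step-function lookup: `table` is a list of (min_depth, sensitivity)
    pairs sorted by ascending min_depth. Values are illustrative only."""
    sensitivity = 0.0
    for min_depth, value in table:
        if depth >= min_depth:
            sensitivity = value
    return sensitivity


def effective_sensitivity(depths, table):
    """Average the per-position sensitivity over a footprint of read depths."""
    if not depths:
        raise ValueError("footprint must contain at least one position")
    return sum(sensitivity_at_depth(d, table) for d in depths) / len(depths)


# Made-up sensitivity-by-depth table (NOT measured values from the study).
EXAMPLE_TABLE = [(0, 0.0), (5, 0.5), (10, 0.9), (20, 0.99)]

# Two toy footprints with the same average depth (20x) but different leveling.
leveled_depths = [20, 20, 20, 20]
uneven_depths = [50, 25, 4, 1]

leveled = effective_sensitivity(leveled_depths, EXAMPLE_TABLE)
uneven = effective_sensitivity(uneven_depths, EXAMPLE_TABLE)
```

In this toy case both footprints have the same average depth, yet the leveled footprint scores higher because no position falls below the depth needed for reliable calling. Repeating the calculation on depths taken from each downsampled data set would trace out a sensitivity-versus-sequencing-volume curve for each assay configuration.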

You may contact the first author (during and after the meeting) at