Insights into protein truncating variation from high-quality indel calling in 1000 UK population exomes: implications for disease gene discovery and clinical utility.
E. Ruark (1), A. Renwick (1), E. Ramsay (1), S. Seal (1), K. Snape (1), S. Hanks (1), A. Rimmer (2), M. Munz (2), A. Elliott (1), G. Lunter (2), N. Rahman (1)
1) Institute of Cancer Research, Sutton, United Kingdom; 2) Wellcome Trust Centre for Human Genetics, Oxford, United Kingdom.

   The advent of next-generation sequencing (NGS) has made it possible to capture coding variation comprehensively in large numbers of individuals, quickly and affordably. This has advanced both research to discover gene mutations that predispose to disease and opportunities to expand clinical gene testing. In both contexts, however, the spectrum of coding variation in the general population must be considered when evaluating whether a variant is associated with disease. Many disease predisposition genes are characterised by multiple different mutations that result in premature protein truncation, termed protein truncating variants (PTVs) or loss-of-function (LoF) variants. The predominant mechanism generating PTVs is base insertion or deletion (indels). Unfortunately, accurate detection of indels in short-read NGS data has proved challenging, with sub-optimal sensitivity and specificity and low concordance between callers. We have developed and validated a pipeline for exome analysis (base substitutions and indels) with 95% sensitivity and 94% specificity for indels. Application of the pipeline to 1000 UK population controls, unselected with respect to disease, reveals multiple insights into PTV architecture. PTVs were identified in one third of genes (5627/17588); however, for the majority of these genes (51%), only a single PTV in a single individual was identified. Only 139 genes (0.8%) had multiple (five or more) different PTVs in the 1000 individuals. Furthermore, these data have allowed us to describe the genetic variation of the average UK individual, who carries 22,000 coding variants, of which 160 are rare (<0.1% of the population). On average, each person in the UK has 211 PTVs, of which 6 are rare (<0.1%) and 91 are homozygous. The data also provide a framework for disease gene identification and clinical characterisation studies. Most importantly, we show that rare protein truncating variants are an expected part of the normal spectrum of an individual's genomic variation. As such, more caution than is typically applied is appropriate when evaluating the likely causal link between a rare PTV and a phenotype in a given individual. Conversely, the presence of multiple different PTVs within the same gene is an unusual pattern in the general population but a common pattern in people with genetic diseases, and may serve as a useful mutational signature in disease gene discovery studies.
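
The per-gene tallies reported above (genes with any PTV, genes with one PTV in one individual, genes with five or more different PTVs) could be computed along the following lines. This is a minimal sketch, not the pipeline used in the study: the input layout (one `(sample_id, gene, variant)` tuple per PTV call), the function names, and the toy gene and variant labels are all assumptions made for illustration.

```python
from collections import defaultdict


def summarise_ptvs(calls, min_distinct=5):
    """Tally PTV calls per gene.

    `calls` is an iterable of (sample_id, gene, variant) tuples, one per
    PTV observed in one individual (a hypothetical input layout).
    """
    variants_by_gene = defaultdict(set)   # gene -> distinct PTVs
    carriers_by_gene = defaultdict(set)   # gene -> individuals carrying any PTV
    for sample_id, gene, variant in calls:
        variants_by_gene[gene].add(variant)
        carriers_by_gene[gene].add(sample_id)

    # Genes where exactly one PTV was seen, in exactly one individual.
    single = sorted(
        g for g in variants_by_gene
        if len(variants_by_gene[g]) == 1 and len(carriers_by_gene[g]) == 1
    )
    # Genes with at least `min_distinct` different PTVs across the cohort.
    multi = sorted(
        g for g in variants_by_gene
        if len(variants_by_gene[g]) >= min_distinct
    )
    return {
        "genes_with_ptv": len(variants_by_gene),
        "single_ptv_single_carrier": single,
        "multi_ptv": multi,
    }


# Toy data with made-up gene and variant labels.
toy_calls = [
    ("S1", "GENE_A", "c.10del"),
    ("S2", "GENE_B", "c.5dup"),
    ("S3", "GENE_B", "c.5dup"),
    ("S4", "GENE_C", "c.1del"),
    ("S5", "GENE_C", "c.2del"),
    ("S6", "GENE_C", "c.3del"),
    ("S7", "GENE_C", "c.4del"),
    ("S8", "GENE_C", "c.5del"),
]
summary = summarise_ptvs(toy_calls)
print(summary)
```

In the toy data, `GENE_A` is a single PTV in a single individual, `GENE_B` carries one recurrent PTV in two individuals, and `GENE_C` shows the multiple-different-PTV pattern. Counting distinct variants rather than carriers is what separates the last two cases.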

You may contact the first author (during and after the meeting) at