The Human Knockout Project: systematic discovery of loss-of-function variants in humans.

K. J. Karczewski1,2, V. Narasimhan3, M. Lek1,2, M. Rivas4, S. Balasubramanian5, M. Gerstein5, B. Keating6, T. Lappalainen7, A. Palotie1,2, M. Daly1,2, D. van Heel8, R. Trembath8, R. Durbin3, D. G. MacArthur1,2

1) Massachusetts General Hospital, Boston, MA; 2) Broad Institute, Cambridge, MA; 3) Wellcome Trust Sanger Institute, Hinxton, UK; 4) Wellcome Trust Centre for Human Genetics Research, University of Oxford, Oxford, UK; 5) Yale University, New Haven, CT; 6) University of Pennsylvania, Philadelphia, PA; 7) New York Genome Center, New York, NY; 8) Queen Mary University of London, London, UK.

   Every human carries at least a hundred loss-of-function (LoF) variants predicted to severely disrupt the function of protein-coding genes, including many in the homozygous state. These variants represent experiments of nature that can shed light on the function of currently uncharacterized human genes: indeed, much novel biology has already been learned from the involvement of rare LoF variants in severe Mendelian disease. These variants have also proved valuable in identifying potential therapeutic targets: LoF variants in PCSK9 have been causally linked to low LDL cholesterol levels, leading to the development of PCSK9 as a therapeutic target for cardiovascular disease. However, discovering LoF variants in the human population remains a significant challenge: these variants are enriched for sequencing and annotation errors and typically have very low frequency, which confounds both their discovery and their interpretation. As a result, large sample sizes are required to discover LoF variants in every possible gene. Alternatively, two distinct strategies are known to significantly enrich for homozygous rare LoF variants, effectively identifying knockout humans: the use of bottlenecked populations, and the use of populations with a high rate of consanguineous mating. To improve the annotation of these variants, we have developed an open-source tool, LOFTEE (Loss Of Function Transcript Effect Estimator). To characterize the landscape of homozygous LoF variants (knockouts) across humans, we have applied LOFTEE to several large datasets from collaborative efforts, including over 91,000 exomes aggregated from a variety of rare and complex disease consortia, deeply phenotyped samples from Finnish national biobanks, and over 1,000 individuals from the UK with related parents. We validate these methods using databases of known disease variants, and investigate the effect of LoF variants on splicing and gene expression by intersecting exomes with matched RNA-Seq data from over 500 individuals from the GTEx consortium. Finally, we describe the aggregation of these variants into a database of LoF variants, dbLoF, providing a resource for pharmaceutical development, transplant biology, and the understanding of rare Mendelian diseases.
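
As a rough illustration of why consanguineous mating enriches for homozygous rare variants, the sketch below uses the standard population-genetics expectation that an allele of frequency q is homozygous with probability q^2 + F*q*(1 - q), where F is the inbreeding coefficient (F = 0 under random mating, F = 1/16 for the offspring of first cousins). The function names and the example allele frequency are hypothetical and chosen only for illustration; they are not taken from LOFTEE or from the datasets described above.

```python
def homozygote_frequency(q: float, f: float = 0.0) -> float:
    """Expected homozygote frequency for an allele of frequency q
    given inbreeding coefficient f (f = 0 means random mating)."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("allele frequency q must be in [0, 1]")
    if not 0.0 <= f <= 1.0:
        raise ValueError("inbreeding coefficient f must be in [0, 1]")
    return q * q + f * q * (1.0 - q)


def homozygote_enrichment(q: float, f: float) -> float:
    """Fold increase in expected homozygotes relative to random mating."""
    return homozygote_frequency(q, f) / homozygote_frequency(q, 0.0)


if __name__ == "__main__":
    q = 0.001       # hypothetical frequency of a rare LoF allele
    f = 1.0 / 16.0  # inbreeding coefficient for offspring of first cousins
    print(f"random mating:   {homozygote_frequency(q):.3e}")
    print(f"first cousins:   {homozygote_frequency(q, f):.3e}")
    print(f"fold enrichment: {homozygote_enrichment(q, f):.1f}")
```

Under these assumptions the enrichment grows as the allele becomes rarer, since the F*q*(1 - q) term falls off linearly in q while the random-mating term falls off quadratically.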
