Analysis of stop-gain and frameshift variants in human innate immunity genes.

A. Rausell1,2,3,4, P. Mohammadi1,5, PJ. McLaren1,2,6, I. Bartha1,2,6, I. Xenarios1,4,7, J. Fellay1,6, A. Telenti2,3

1) Swiss Institute of Bioinformatics (SIB) and University Hospital of Lausanne, Lausanne, Vaud, Switzerland; 2) Department of Laboratories, University Hospital of Lausanne, Switzerland; 3) University of Lausanne, Lausanne, Switzerland; 4) Vital-IT group, SIB Swiss Institute of Bioinformatics Lausanne, Switzerland; 5) Computational Biology Group, ETH Zurich, Switzerland; 6) School of Life Sciences, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland; 7) Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland.

   Loss-of-function variants in innate immunity genes are associated with Mendelian disorders in the form of primary immunodeficiencies. Recent resequencing projects report that stop-gains and frameshifts are collectively prevalent in humans and could be responsible for some of the inter-individual variability in innate immune response. Current computational approaches for evaluating loss of function in genes carrying these variants rely on gene-level characteristics such as evolutionary conservation and functional redundancy across the genome. However, innate immunity genes represent a particular case because they are more likely to be under positive selection and to be duplicated. To create a ranking of severity applicable to innate immunity genes, we evaluated 17,764 stop-gain and 13,915 frameshift variants from the NHLBI Exome Sequencing Project and the 1000 Genomes Project. Sequence-based features such as loss of functional domains, isoform-specific truncation and nonsense-mediated decay were found to correlate with variant allele frequency and were validated with gene expression data. We integrated these features in a Bayesian classification scheme and benchmarked its use in predicting pathogenic variants against Online Mendelian Inheritance in Man (OMIM) disease stop-gains and frameshifts. The classification scheme was then applied to the assessment of 335 stop-gains and 236 frameshifts affecting 227 interferon-stimulated genes. The sequence-based score ranks variants in innate immunity genes according to their potential to cause disease, and complements existing gene-based pathogenicity scores. Specifically, the sequence-based score improves measurement of functional gene impairment, discriminates among different variants in a given gene and appears particularly useful for the analysis of less conserved genes.
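
   As an illustration only, the idea of combining sequence-based features into a Bayesian score can be sketched with a Bernoulli naive Bayes classifier. The abstract does not specify the form of the classification scheme, so the naive Bayes choice, the three binary feature names, the class labels and the toy training rows below are all assumptions made for this sketch, not the authors' method or data.

```python
import math

# Hypothetical binary features per variant (illustrative names, not from the study).
FEATURES = ("domain_lost", "nmd_predicted", "all_isoforms_truncated")


def train(variants, labels, alpha=1.0):
    """Fit a Bernoulli naive Bayes model with Laplace smoothing `alpha`."""
    model = {}
    for cls in sorted(set(labels)):
        rows = [v for v, y in zip(variants, labels) if y == cls]
        prior = len(rows) / len(variants)
        # Smoothed probability that each feature equals 1 within this class.
        probs = {
            f: (sum(r[f] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for f in FEATURES
        }
        model[cls] = (prior, probs)
    return model


def posterior(model, variant):
    """Return the posterior probability of each class for one variant."""
    log_scores = {}
    for cls, (prior, probs) in model.items():
        score = math.log(prior)
        for f in FEATURES:
            p = probs[f]
            score += math.log(p if variant[f] else 1.0 - p)
        log_scores[cls] = score
    # Normalise in log space for numerical stability.
    top = max(log_scores.values())
    total = sum(math.exp(s - top) for s in log_scores.values())
    return {cls: math.exp(s - top) / total for cls, s in log_scores.items()}


def make_variant(domain_lost, nmd_predicted, all_isoforms_truncated):
    return dict(zip(FEATURES, (domain_lost, nmd_predicted, all_isoforms_truncated)))


# Toy training set, invented purely to exercise the code.
toy_variants = [
    make_variant(1, 1, 1), make_variant(1, 1, 0),
    make_variant(1, 0, 1), make_variant(0, 1, 1),
    make_variant(0, 0, 0), make_variant(0, 0, 1),
    make_variant(0, 1, 0), make_variant(1, 0, 0),
]
toy_labels = ["pathogenic"] * 4 + ["benign"] * 4

model = train(toy_variants, toy_labels)
severe = posterior(model, make_variant(1, 1, 1))["pathogenic"]
mild = posterior(model, make_variant(0, 0, 0))["pathogenic"]
```

   In this sketch the posterior probability of the "pathogenic" class acts as a per-variant score, so two truncating variants in the same gene can receive different ranks depending on which features they trigger.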

You may contact the first author (during and after the meeting) at