Computational prediction and in vivo validation of suppressors of human disease mutations.
D. M. Jordan (1,2), E. E. Davis (3), N. Katsanis (3), S. R. Sunyaev (2)
1) Biophysics Program, Harvard University, Cambridge, MA; 2) Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA; 3) Center for Human Disease Modeling, Department of Cell Biology, Duke University Medical Center, Durham, NC.
Predicting the phenotypic effects of genetic variation is an important and widespread problem in modern genetics, with applications in gene and variant discovery studies, population genetics studies, and clinical genetic diagnostics. A number of computational tools exist for this task, such as PolyPhen and SIFT. These methods rely on a comparative genomics approach: they build a multiple sequence alignment and use it to assess the behavior of the variant over evolutionary time. This approach assumes that a variant observed in another species will only very rarely cause disease in humans. We use the training data for PolyPhen to evaluate how often this assumption fails, by counting how many variants known to cause disease in humans are found in the reference sequences of other species. We find that nearly 10% of variants annotated as pathogenic in humans appear in the reference sequence of at least one other vertebrate species, indicating that the assumption that human disease mutations are not found in the genomes of other species is significantly violated. This result holds even after filtering disease annotations to a very high level of confidence. The pattern of variation in genes where these variants occur is consistent with a model in which a single effectively neutral change in the same gene can compensate for the disease-causing variant, as we show through simulation. Based on this model, we combine comparative genomics data with computational prediction of structural stability to generate candidate pairs of disease-causing variants and compensatory changes, which can then be tested directly by experiment. We report on this dataset of candidate variant pairs and on the preliminary results of in vivo validation with a zebrafish morpholino rescue model.
The ability to identify the interacting partners of specific variants allows us both to explore the shortcomings of the comparative genomics approach to variant assessment and to uncover new biology relating to known human disease mutations.
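The check described above, whether a human disease variant appears in the reference sequence of another species, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the alignment representation (a dictionary of equal-length gapped sequences), the `Variant` type, the function names, and the toy data are all assumptions made for illustration only.

```python
from typing import Dict, List, NamedTuple


class Variant(NamedTuple):
    """Hypothetical record for a human missense variant."""
    position: int  # 1-based position in the ungapped human protein
    ref: str       # human reference residue
    alt: str       # variant residue annotated as pathogenic


def alignment_column(aligned_human: str, position: int) -> int:
    """Map a 1-based ungapped human position to a 0-based alignment column."""
    seen = 0
    for col, residue in enumerate(aligned_human):
        if residue != "-":
            seen += 1
            if seen == position:
                return col
    raise ValueError(f"position {position} is beyond the human sequence")


def species_carrying(alignment: Dict[str, str], variant: Variant,
                     human: str = "human") -> List[str]:
    """Return the non-human species whose aligned residue equals the variant residue."""
    col = alignment_column(alignment[human], variant.position)
    if alignment[human][col] != variant.ref:
        raise ValueError("reference residue does not match the alignment")
    return sorted(
        species for species, seq in alignment.items()
        if species != human and seq[col] == variant.alt
    )


def fraction_seen_elsewhere(alignment: Dict[str, str],
                            variants: List[Variant]) -> float:
    """Fraction of variants found in at least one other species."""
    hits = sum(1 for v in variants if species_carrying(alignment, v))
    return hits / len(variants)


# Toy data, invented for illustration; not taken from the study.
toy_alignment = {
    "human":     "MK-RLV",
    "mouse":     "MKARLV",
    "zebrafish": "MR-QLI",
}
toy_variants = [Variant(2, "K", "R"), Variant(3, "R", "W"), Variant(5, "V", "I")]

if __name__ == "__main__":
    for v in toy_variants:
        print(v, species_carrying(toy_alignment, v))
    print(fraction_seen_elsewhere(toy_alignment, toy_variants))
```

In a real analysis the alignment would come from a multiple sequence alignment of vertebrate orthologs, and a species carrying the variant residue would mark the gene as a place to search for candidate compensatory changes.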
You may contact the first author (during and after the meeting) at