Understanding molecular mechanisms of human disease mutations and coding variants through 3D protein networks.
H. Yu1,2, J. Das1,2, Y. Guo1,2,3, X. Wei2,4, X. Wang2,3, B. Thijssen5, A. Grimson3, S. M. Lipkin4, A. G. Clark1,3
1) Biological Statistics and Computational Biology, Cornell University, Ithaca, NY; 2) Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY; 3) Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY; 4) Department of Medicine, Weill Cornell College of Medicine, New York, NY; 5) Department of Bioinformatics, Maastricht University, 6200 MD Maastricht, The Netherlands.
To better understand the molecular mechanisms and genetic basis of human disease, we systematically examine the relationships among 3,949 genes, 62,663 mutations, and 3,453 associated disorders by generating a 3D structurally resolved human interactome. This network consists of 4,222 high-quality binary protein-protein interactions with their atomic-resolution interfaces. We find that in-frame mutations (missense mutations and in-frame insertions/deletions) are enriched on the interaction interfaces of proteins associated with the corresponding disorders, and that the disease specificity of different mutations in the same gene can be explained by their location within an interface. We also predict 292 candidate genes for 694 previously unknown disease-gene associations, in cases where a known disease protein interacts with the predicted candidate through an interface on which the corresponding disease-specific mutations are highly enriched. By considering the dominance or recessiveness of the disease mutations, we further find that although recessive mutations on the interaction interface of two interacting proteins tend to cause the same disease, this widely accepted guilt-by-association principle does not apply to dominant mutations. Furthermore, recessive truncating mutations (nonsense mutations and frameshift insertions/deletions) on the same interface are much more likely to cause the same disease, even if they are close to the N-terminus of the protein, whereas dominant truncating mutations tend to be enriched between interfaces. These results suggest that a significant fraction of truncating mutations can generate functional protein products, contrary to the common belief that truncating mutations most often cause complete loss of function.
Finally, we find that rare non-synonymous coding variants are significantly enriched at the interaction interface, compared to common ones, indicating that our approach could be particularly effective in assessing the functional relevance of thousands of coding variants on a genomic scale.
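The enrichment statements above can be made concrete with a small calculation. The following is a minimal sketch, not the authors' procedure: it assumes a null model in which mutations fall uniformly along the protein, so that the expected fraction landing in an interface equals the interface's share of the sequence length, and it uses a one-sided binomial test. The function name `interface_enrichment` and all numbers in the usage lines are hypothetical and chosen only for illustration.

```python
from math import comb


def interface_enrichment(n_interface_mut, n_total_mut, interface_len, protein_len):
    """Return (enrichment ratio, one-sided binomial p-value).

    Assumption (not from the abstract): under the null, each mutation lands
    in the interface with probability interface_len / protein_len.
    """
    expected_frac = interface_len / protein_len
    observed_frac = n_interface_mut / n_total_mut
    ratio = observed_frac / expected_frac

    # P(X >= n_interface_mut) for X ~ Binomial(n_total_mut, expected_frac)
    p_value = sum(
        comb(n_total_mut, k)
        * expected_frac ** k
        * (1 - expected_frac) ** (n_total_mut - k)
        for k in range(n_interface_mut, n_total_mut + 1)
    )
    return ratio, p_value


# Hypothetical example: 12 of 20 mutations fall in a 100-residue interface
# of a 400-residue protein.
ratio, p_value = interface_enrichment(12, 20, 100, 400)
print(f"enrichment ratio = {ratio:.2f}, p = {p_value:.2e}")
```

A ratio above 1 with a small p-value would indicate more interface mutations than the uniform null predicts. The same comparison could be run separately for rare and common variants, or for dominant and recessive mutations, to contrast the groups discussed above.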