Posted By: The American Journal of Human Genetics, AJHG
Each month, the editors of The American Journal of Human Genetics interview an author of a recently published paper. This month, we check in with Florin to discuss his recent paper, “Exploring the omnigenic architecture of selected complex traits.”

AJHG: What motivated you to start working on this project?
FR: I had previously developed the graph machine learning framework, Speos, which specializes in the prediction of core genes across several traits. The respective paper, however, primarily focused on benchmarking, validation, and prediction accuracy. Although it is important to validate new tools, I think the academic overemphasis on benchmarking misses the point of the exercise: gaining new insights into biology. Most current research is so focused on method development that it forgets that a tool is only useful when it is actually put into practice. Therefore, I wanted to follow up with a paper to demonstrate the insights that can be garnered through interpretable machine learning, once the focus is not on the method itself.
AJHG: What about the paper/project most excites you?
FR: What excites me the most is that we could actually derive human-interpretable and sensible high-level patterns from large, chaotic datasets. Neural networks are often seen as a black box, but they are not. For example, our results clearly show that high expression in disease-relevant tissues is the driving factor behind the prediction of core genes. This fosters much more trust in machine learning tools than simply beating a previous benchmark on some artificial metric, because it aligns perfectly with the theory of the omnigenic model and shows that the learned patterns are actually rooted in biology. I am further impressed by the genome-wide patterns of discriminative perturbations between core genes and peripheral genes, which have far exceeded our expectations.
AJHG: Thinking about the bigger picture, what implications do you see from this work for the larger human genetics community?
FR: Methodologically, I hope that machine learning tools will find application beyond mere prediction tasks. Our paper shows that there is far more to gain when we thoroughly examine their learned patterns and interpret them. This would also counteract the problematic trend of using immensely large and convoluted datasets for disease gene prediction: training on, for example, the top principal components of dozens of gene expression datasets makes such models inherently uninterpretable.
Beyond that, I think our results make it clear that the “omni” in the omnigenic model can be taken literally, and that we can indeed expect an almost genome-wide influence on many traits. This would put more emphasis on larger perturbation screenings, but it also further calls into question the common practice of selecting a group of genes, e.g., by GWAS p-value thresholds, and simply ignoring the rest. Here, solutions integrating more sources of data, such as ours, will provide immense benefits. Especially given our finding that this genome-wide architecture appears to be cell-type specific, I hope to see a wider range of cell types available for large-scale perturbation screenings in the future.
AJHG: What advice do you have for trainees/young scientists?
FR: My advice for trainees and/or young scientists would be to always trust their gut feeling. Even if a research topic seems very productive, measured by the number of publications and citations, it might become stale very quickly if the underlying principles aren’t sound. Academics are usually very good at convincing you using seemingly rational arguments, but your gut will be able to tell when something is off. If another research topic seems rather niche, but you feel drawn to it because you see something in it that others might be missing, then this is likely a better use of your time.
AJHG: And for fun, tell us something about your life outside of the lab.
FR: At home, I love to cook for my little daughter, my wife, and our dog, build useful stuff from wood, or use my self-made 3D printer. I also love the outdoors, whether I am going hunting, hiking, or simply sitting around a campfire with friends.
Florin Ratajczak, MS, is a PhD Student at the Institute of Network Biology and the Institute of Computational Biology at Helmholtz Munich.