Posted By: Sara Cullinan, PhD, Deputy Editor, AJHG
Each month, the editors of The American Journal of Human Genetics interview an author of a recently published paper. This month we check in with Sriram (@sr_sankararaman) to discuss his paper “Fast estimation of genetic correlation for biobank-scale data.”
- What prompted you to start working on this project?
SS: For a few years now, I have been thinking about biobank-scale methods that allow us to learn about aspects of the genetic architecture (e.g. heritability) of complex traits from millions of genomes. A major challenge in designing such methods is computational: how do we design methods that can scale?
A few years back, we developed a scalable variance components method that can estimate and partition heritability. The key insight is to reduce the size of the genotype data that we need to compute on by multiplying the genotype matrix with random vectors. To our surprise, we get accurate estimates with only a few random vectors, which in turn allows the method to be highly scalable. These results led us to think about other applications where the idea of random projections could be useful. Genetic correlations are often estimated by fitting variance components models to pairs of traits, so that seemed like a natural application to try.
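To make the idea of multiplying the genotype matrix with random vectors concrete, here is a minimal sketch of one common use of random projections: estimating the trace of the squared relatedness matrix without ever forming that matrix. This is an illustration under stated assumptions, not the SCORE implementation. The function names, the simulated genotypes, and the choice of Gaussian random vectors are all hypothetical choices made for this example.

```python
import numpy as np


def simulate_standardized_genotypes(n_individuals, n_snps, seed=0):
    """Simulate a toy genotype matrix (hypothetical data) and standardize each SNP."""
    rng = np.random.default_rng(seed)
    # Allele frequencies drawn away from 0 so that every column varies.
    freqs = rng.uniform(0.1, 0.5, size=n_snps)
    G = rng.binomial(2, freqs, size=(n_individuals, n_snps)).astype(float)
    return (G - G.mean(axis=0)) / G.std(axis=0)


def estimate_trace_k_squared(X, num_vectors=10, seed=0):
    """Randomized estimate of tr(K^2), where K = X X^T / M.

    For a random vector z with identity covariance, E[z^T K^2 z] = tr(K^2),
    and z^T K^2 z = ||K z||^2 because K is symmetric. K z is computed as
    X (X^T z) / M, so the N x N matrix K is never built.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    Z = rng.standard_normal((n, num_vectors))  # a few random vectors
    KZ = X @ (X.T @ Z) / m                     # two thin matrix products
    return float(np.sum(KZ * KZ) / num_vectors)


def exact_trace_k_squared(X):
    """Exact tr(K^2) by forming K explicitly; only feasible for small N."""
    n, m = X.shape
    K = X @ X.T / m
    return float(np.sum(K * K))  # equals tr(K^2) for symmetric K


if __name__ == "__main__":
    X = simulate_standardized_genotypes(n_individuals=200, n_snps=500, seed=1)
    print("randomized estimate:", estimate_trace_k_squared(X, num_vectors=10, seed=2))
    print("exact value:        ", exact_trace_k_squared(X))
```

The randomized version costs two passes over the genotype matrix per batch of random vectors, whereas the exact version needs the full N x N matrix, which is what becomes infeasible at biobank scale. Traces of this kind appear in moment-based variance components estimators, which is why a small number of random vectors can be enough in practice.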
- What about this paper/project most excites you?
SS: The surprising thing for me is the accuracy of random projections. This, in turn, allows our method, SCORE, to provide accurate estimates of genetic correlation while being highly scalable.
- Thinking about the bigger picture, what implications do you see from this work for the larger human genetics community?
SS: The ideas that we use in our work give us a powerful, general way to harness the potential of large datasets to learn about human genetics. As one of the reviewers commented on our submission: “the work should make the entire field of statistical genetics hit themselves on the head in a ‘why didn’t I think of that?’ manner.” As genetic datasets grow in size and scope, the statistical and computational challenges are also going to grow, so we need to design methods that can overcome these bottlenecks to enable new discoveries.
- What advice do you have for trainees/young scientists?
SS: Some of the most interesting questions are at the interface of disciplines (in my case: statistics, computer science and genomics). Working at the interface can be challenging in the short term (you need to obtain expertise in two or more fields), but it is rewarding in the long term.
- And for fun, tell us something about your life outside of the lab.
SS: I like to play the flute to relax.