Assessing multivariate gene-metabolome associations using Bayesian reduced rank regression.

P. Marttinen1, M. Pirinen2, A.-P. Sarin2,3, J. Gillberg1, J. Kettunen2, I. Surakka2, A. J. Kangas4, P. Soininen4,5, T. Lehtimäki6, M. Ala-Korpela4,5,14, O. T. Raitakari7,8, M.-R. Järvelin9,10,11,12,13, S. Ripatti2,3,15,16, S. Kaski1

1) Helsinki Institute for Information Technology (HIIT), Aalto University and University of Helsinki, Finland; 2) Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Finland; 3) Unit of Public Health Genomics, National Institute for Health and Welfare, Helsinki, Finland; 4) Computational Medicine, Institute of Health Sciences, University of Oulu and Oulu University Hospital, Oulu, Finland; 5) NMR Metabolomics Laboratory, School of Pharmacy, University of Eastern Finland, Kuopio, Finland; 6) Department of Clinical Chemistry, Fimlab Laboratories, University of Tampere School of Medicine, Tampere, Finland; 7) Department of Clinical Physiology and Nuclear Medicine, University of Turku and Turku University Hospital, Turku, Finland; 8) Research Centre of Applied and Preventive Cardiovascular Medicine, University of Turku and Department of Clinical Physiology and Nuclear Medicine, Turku University Hospital, Turku, Finland; 9) Department of Epidemiology and Biostatistics, MRC Health Protection Agency (HPA) Centre for Environment and Health, School of Public Health, Imperial College London, United Kingdom; 10) Institute of Health Sciences, University of Oulu, Finland; 11) Biocenter Oulu, University of Oulu, Finland; 12) Unit of Primary Care, Oulu University Hospital, Finland; 13) Department of Children and Young People and Families, National Institute for Health and Welfare, Oulu, Finland; 14) Computational Medicine, School of Social and Community Medicine, University of Bristol, Bristol, United Kingdom; 15) Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK; 16) Hjelt Institute, University of Helsinki, Helsinki, Finland.

   A typical genome-wide association study searches for associations between SNPs and a univariate phenotype. However, there is growing interest in investigating associations between genomics data and multivariate phenotypes, for example, gene expression or metabolomics data. The commonly used approach is to perform a univariate test for each genotype-phenotype pair and then apply a stringent significance cutoff to account for the large number of tests performed. However, this approach may have limited ability to uncover dependencies involving multiple variables. Another trend in current genetics is the investigation of the impact of rare variants on the phenotype, where standard methods often fail due to lack of power when the risk allele is present in only a limited number of individuals. Here we propose a novel approach based on Bayesian reduced rank regression to assess the impact of multiple SNPs on a high-dimensional phenotype. Because it combines information over multiple SNPs and phenotypes, our method is particularly suitable for detecting associations involving rare variants. We demonstrate the potential of our method by analyzing every gene in a sample of 4,702 individuals from the Northern Finland Birth Cohort 1966, for whom whole-genome SNP data along with lipoprotein profiles comprising 74 traits are available. Using our new method, we discovered three putative loci without previously reported associations with the traits studied, which replicated in a sample of 2,390 individuals from the Cardiovascular Risk in Young Finns study.
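
To illustrate the idea of a reduced rank coefficient matrix linking several SNPs to many traits, the following is a minimal sketch of classical (non-Bayesian) reduced rank regression with identity weighting. It is not the Bayesian method proposed in the abstract, whose priors and inference scheme are not described here. The function name `reduced_rank_regression`, the simulated genotypes, the allele frequency, and the chosen rank are all assumptions made for illustration only; only the number of traits (74) is taken from the text.

```python
import numpy as np

def reduced_rank_regression(X, Y, rank):
    """Classical reduced rank regression (illustrative, not the Bayesian model).

    X: (n_individuals, n_snps) centered genotype matrix.
    Y: (n_individuals, n_traits) phenotype matrix.
    Returns a coefficient matrix of shape (n_snps, n_traits) with rank <= `rank`.
    """
    # Ordinary least-squares coefficients for all traits at once.
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    # The leading right singular vectors of the fitted values span the
    # trait subspace that captures most of the SNP-explained variation.
    Y_hat = X @ B_ols
    _, _, Vt = np.linalg.svd(Y_hat, full_matrices=False)
    V_r = Vt[:rank].T  # shape (n_traits, rank)
    # Project the OLS coefficients onto that subspace.
    return B_ols @ V_r @ V_r.T

# Simulated data (assumption): 10 low-frequency SNPs in one gene, 74 traits.
rng = np.random.default_rng(0)
n, p, q, r = 500, 10, 74, 2
X = rng.binomial(2, 0.05, size=(n, p)).astype(float)  # minor-allele counts
X -= X.mean(axis=0)                                   # center each SNP
true_B = rng.normal(size=(p, r)) @ rng.normal(size=(r, q))
Y = X @ true_B + rng.normal(size=(n, q))

B_rrr = reduced_rank_regression(X, Y, rank=r)
print(B_rrr.shape, np.linalg.matrix_rank(B_rrr))
```

The low rank constraint means that all SNPs in the gene affect the traits through a small number of shared latent components, which is one way information can be pooled across SNPs and phenotypes.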
