Slicing the Genome: A New Approach to Association in Complex, Longitudinal Diseases.
A. Musolf1, D. Londono1, A. Q. Nato, Jr.2, P. Vuistiner3, C. A. Wise4,5,6, L. Yu1,7, S. J. Finch8, P. Bovet9, M. Bochud3, T. C. Matise1, D. Gordon1
1) Department of Genetics, Rutgers University, Piscataway, NJ, USA;
2) Division of Medical Genetics, University of Washington, Seattle, WA, USA;
3) Swiss Institute of Bioinformatics, Lausanne, Switzerland;
4) Seay Center for Musculoskeletal Research, Texas Scottish Rite Hospital for Children, Dallas, TX, USA;
5) Department of Orthopedic Surgery, Texas Scottish Rite Hospital for Children, TX, USA;
6) Department of Orthopaedic Surgery, University of Texas Southwestern Medical Center, Dallas, TX, USA;
7) Center of Alcohol Studies, Rutgers University, Piscataway, NJ, USA;
8) Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY, USA;
9) Unit for Prevention and Control of Cardiovascular Disease, Section of Non Communicable Diseases, Ministry of Health and Social Services, Seychelles.
We previously published a method that tests for association between a longitudinal phenotype and genetic variants. The method uses growth mixture models (GMM) to determine longitudinal trajectory curves. The Bayesian posterior probability (BPP) of belonging to a specific curve, an outcome variable from the GMM, is used as a quantitative phenotype in association analyses. Although that method is powerful for a single causal variant under multiple inheritance scenarios, its power decreases significantly when more than one causal variant is considered. Here, we present a new method designed to detect multiple causal SNPs associated with longitudinal phenotypes in both family and population studies. The method also allows for the incorporation of covariates. This new method retains several ideas from our first method; however, instead of performing an individual association test on each SNP, we slice the genome into non-overlapping blocks of 50 SNPs (each of which we term a "mega-locus") and obtain a significance value for each mega-locus. This is accomplished via the SumStat method, developed by Jurg Ott and colleagues. As SumStat works for population studies only, we use a modified procedure (TDT-HET) to test for family-based association. We consider various scenarios in our simulations, including four causal variants located within a single mega-locus and eight causal variants spread between two mega-loci on different chromosomes. We also introduce environmental covariates. Our simulated data are highly stratified so that we can assess robustness in the presence of population stratification. A p-value is computed for each mega-locus on each data set. The p-values are then combined per mega-locus via Fisher's method, and multiple testing is addressed by controlling the false discovery rate (FDR). In our simulations, the method 1) appears to maintain the proper type I error rate and 2) achieves empirical power greater than 75% in most scenarios.
These results suggest that our method can detect multiple causal SNPs located in multiple regions across the genome. We believe that this method will be useful to researchers who are studying complex diseases with longitudinal phenotypes. It allows for potentially high power for association of causal loci with disease progression phenotypes for both population and family studies, even in the presence of confounding elements such as population stratification and environmental variables.
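The slicing, combination, and multiple-testing steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, the SumStat and TDT-HET tests that would produce the per-mega-locus p-values are not shown, the handling of a final block shorter than 50 SNPs is an assumption, and the Benjamini-Hochberg procedure is assumed as the FDR method because the abstract does not name one.

```python
import math


def slice_into_mega_loci(snp_ids, block_size=50):
    """Split an ordered list of SNP IDs into non-overlapping blocks.

    Assumption: a trailing block with fewer than block_size SNPs is kept.
    """
    return [snp_ids[i:i + block_size] for i in range(0, len(snp_ids), block_size)]


def fisher_combine(p_values):
    """Combine k independent p-values (each in (0, 1]) with Fisher's method.

    Under the null, X = -2 * sum(ln p) follows a chi-square distribution
    with 2k degrees of freedom.
    """
    k = len(p_values)
    half_x = -sum(math.log(p) for p in p_values)  # X / 2
    # Closed-form chi-square survival function for even degrees of freedom:
    # P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return min(1.0, math.exp(-half_x) * total)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical input: 120 SNP IDs and made-up p-values, for illustration only.
    mega_loci = slice_into_mega_loci([f"rs{i}" for i in range(120)])
    per_locus_p = [[0.01, 0.03], [0.40, 0.70], [0.20, 0.05]]
    combined = [fisher_combine(ps) for ps in per_locus_p]
    for block, q in zip(mega_loci, benjamini_hochberg(combined)):
        print(len(block), round(q, 4))
```

The sketch uses only the standard library; in practice, a statistics package would likely be used for the chi-square tail probability and the FDR adjustment.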