A statistical approach to distinguish genetic pleiotropy from clinical heterogeneity: application to autoimmune diseases.

B. Han1-3, D. Diogo1-4, E. A. Stahl5, S. Eyre6,7, S. Rantapää-Dahlqvist8, J. Martin9, T. W. Huizinga10, P. K. Gregersen11, J. Worthington6,7, L. Klareskog12, P. I. W. de Bakker13,14, S. Raychaudhuri1-4,6

1) Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; 2) Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; 3) Partners HealthCare Center for Personalized Genetic Medicine, Boston, MA 02115, USA; 4) Division of Rheumatology, Immunology, and Allergy, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, 02115, USA; 5) The Department of Psychiatry, Mount Sinai School of Medicine, New York, New York, USA; 6) Arthritis Research UK Epidemiology Unit, Musculoskeletal Research Group, University of Manchester, Manchester Academic Health Sciences Centre, Manchester M13 9PT, UK; 7) NIHR Manchester Musculoskeletal Biomedical Research Unit, Central Manchester NHS Foundation Trust, Manchester Academic Health Sciences Centre, Manchester M13 9PT, UK; 8) Department of Public Health and Clinical Medicine / Rheumatology, Umeå University, S-901 85 Umeå, Sweden; 9) Instituto de Parasitologia y Biomedicina Lopez-Neyra, Consejo Superior de Investigaciones Cientificas (CSIC), 18100 Armilla, Granada, Spain; 10) Department of Rheumatology, Leiden University Medical Centre, 2300 RC Leiden, The Netherlands; 11) The Feinstein Institute for Medical Research, North Shore-Long Island Jewish Health System, Manhasset, NY 11030, USA; 12) Rheumatology Unit, Department of Medicine, Karolinska Institutet and Karolinska University Hospital Solna, SE-171 76 Stockholm, Sweden; 13) Department of Epidemiology, University Medical Center Utrecht, 3584 CG Utrecht, The Netherlands; 14) Department of Medical Genetics, University Medical Center Utrecht, 3584 CG Utrecht, The Netherlands.

Motivation: Recent studies have demonstrated that many medically relevant phenotypes have a shared genetic structure. For example, many autoimmune diseases share risk alleles and exhibit cross-heritability, but it is uncertain whether this reflects a common genetic basis (pleiotropy) or clinical heterogeneity. Clinical heterogeneity occurs when a patient cohort presumed to be phenotypically homogeneous actually consists of genetically distinct subgroups, either because (1) different phenotypes were misclassified as one or (2) the diagnosed trait in a subset of individuals was caused by another trait.

Method: We developed a novel statistical approach to distinguish genetic pleiotropy from clinical heterogeneity. Our method examines a patient cohort for evidence of a subgroup that is enriched for the risk alleles of a second trait relative to the rest of the cohort.

Results: Based on simulations, we demonstrate that our approach has 90% power to detect clinical heterogeneity with 50 risk alleles in 2,000 samples. We applied this approach to seronegative (CCP-) rheumatoid arthritis (RA), which is known to share genetic structure with seropositive (CCP+) RA. We examined 71 CCP+ RA risk alleles in 3,273 CCP- RA cases and identified statistically significant clinical heterogeneity (P=0.003). Because neither RA subtype can, by definition, cause the other, this heterogeneity is evidence of misclassification. Specifically, our method suggested that 24% of CCP- RA cases were likely to be misclassified CCP+ RA patients, consistent with our previous observation that the shared genetic structure between the two RA subtypes might be in part attributable to misclassification (Han et al. AJHG 2014).
In contrast, when we examined CCP+ RA risk alleles in WTCCC cases of type 1 diabetes, a disease also known to share genetic structure with RA, we observed no evidence of clinical heterogeneity (P=0.8), suggesting that the two conditions share truly pleiotropic genetic effects.

Conclusions: Our statistical approach effectively distinguishes pleiotropy from clinical heterogeneity. This is a key advantage over previous approaches to assessing shared genetic structure, such as polygenic modeling and Mendelian randomization, neither of which can make this distinction. Our method is widely applicable to misdiagnosis detection and causal inference between traits.
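
To make the distinction between a hidden subgroup and pleiotropy concrete, the following is a minimal simulation sketch and not the statistic used in this work, which the abstract does not specify. It assumes unlinked loci, made-up allele frequencies (0.30 versus 0.40), and a hypothetical summary measure, the mean pairwise correlation of risk allele dosages among cases. The idea illustrated is that when only a subgroup of cases is enriched for the second trait's risk alleles, those alleles tend to co-occur in the same individuals and become positively correlated, whereas a uniform shift across all cases leaves them uncorrelated. The function names and parameters are illustrative only, and a real analysis would also need controls and a formal significance test.

```python
import random
from statistics import fmean, pstdev


def simulate_cases(n_cases, n_loci, subgroup_frac, base_freq, subgroup_freq, rng):
    """Simulate risk allele dosages (0, 1 or 2) at unlinked loci for each case."""
    cases = []
    for _ in range(n_cases):
        # Members of the hidden subgroup carry risk alleles at a higher frequency.
        freq = subgroup_freq if rng.random() < subgroup_frac else base_freq
        cases.append([(rng.random() < freq) + (rng.random() < freq)
                      for _ in range(n_loci)])
    return cases


def mean_pairwise_correlation(cases):
    """Average Pearson correlation over all pairs of loci (illustrative measure)."""
    n = len(cases)
    stats = [(fmean(col), pstdev(col)) for col in zip(*cases)]
    # Monomorphic loci have no defined correlation, so they are skipped.
    keep = [j for j, (_, sd) in enumerate(stats) if sd > 0]
    m = len(keep)
    if m < 2:
        raise ValueError("need at least two polymorphic loci")
    # For standardized dosages z, the squared per-case sum expands to
    # sum_j z_j^2 + 2 * sum_{j<k} z_j z_k, which gives every pairwise
    # correlation in one pass over the data.
    total = 0.0
    for row in cases:
        s = sum((row[j] - stats[j][0]) / stats[j][1] for j in keep)
        total += s * s
    return (total / n - m) / (m * (m - 1))


rng = random.Random(0)
# Heterogeneity: 24% of cases form a subgroup enriched for the risk alleles.
heterogeneous = simulate_cases(2000, 50, 0.24, 0.30, 0.40, rng)
# Pleiotropy: every case has the same modestly raised frequency
# (0.76 * 0.30 + 0.24 * 0.40 = 0.324), so there is no subgroup.
pleiotropic = simulate_cases(2000, 50, 0.0, 0.324, 0.324, rng)

r_het = mean_pairwise_correlation(heterogeneous)
r_pleio = mean_pairwise_correlation(pleiotropic)
print(f"mean pairwise correlation, heterogeneous cohort: {r_het:.4f}")
print(f"mean pairwise correlation, pleiotropic cohort:   {r_pleio:.4f}")
```

Both simulated cohorts have the same average risk allele frequency, so a comparison of mean risk scores alone would not separate them; only the correlation structure among cases differs.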
