Discoveries from a Genome-Wide Analysis of CNVs in the PGC Study of Schizophrenia.
J. Sebat1, C. R. Marshall2, D. Howrigan3, D. Merico1, B. Thiruvahindrapuram2, W. Wu1, M. O'Donovan4, S. Scherer2, B. Neale3, Schizophrenia and CNV analysis groups of the Psychiatric Genomics Consortium (PGC)
1) Department of Psychiatry & the Institute for Genomic Medicine, University of California San Diego, La Jolla, CA; 2) The Centre for Applied Genomics and Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Canada; 3) Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA; 4) Cardiff University, Cardiff, UK.
Copy number variants (CNVs) throughout the genome contribute to schizophrenia, and evidence has been obtained for several CNV loci. However, the full extent of the CNV contribution to risk is unknown. Previous studies have lacked adequate power to detect genetic association for CNVs with low frequencies (MAF<0.001) or intermediate effect sizes (OR = 2-10). Identifying such risk factors requires sample sizes achievable only through large-scale collaboration. To this end, we developed a centralized CNV analysis pipeline, composed of multiple available calling tools, and applied it to the investigation of CNVs in a cohort of 46,288 subjects (23,650 cases and 22,638 controls) from the Psychiatric Genomics Consortium study of schizophrenia. After the raw data were processed, samples were filtered within datasets based on array QC metrics (probe variance, GC bias and aneuploidy). A consensus CNV call set was generated from the intersection of multiple callers, and CNVs were filtered within each dataset based on frequency (<1%), probe density, size, and overlap with segmental duplications or immunoglobulin and T-cell receptor (TCR) regions. A set of appropriate covariates for analysis was identified by examining the correlation of QC metrics with case status and CNV burden across datasets. Analysis of CNV burden was performed genome-wide and within functional gene sets. Genetic association was carried out as single-marker (breakpoint) tests and gene-based tests (collapsing rare variants). Association was tested by logistic regression with covariates; empirical P-values were estimated by permutation and converted to Z-scores to refine the accuracy of empirical significance. Appropriate thresholds for genome-wide significance were also estimated by permutation. Results reveal a robust contribution of CNVs to disease risk that is consistent across a wide range of microarray platforms and studies.
CNV burden was enriched among gene sets involved in neurological function, including the postsynaptic density (PSD) and genes associated with neurological phenotypes in animal models. Genome-wide significant evidence was obtained for CNVs at 2p16.3, 3q29, 16p11.2, 15q13.3, 22q11.2 and additional novel loci. Furthermore, the centralized CNV calling pipeline enables fine-scale delineation of select loci to the level of single genes. Our findings suggest that analysis of CNVs in large GWAS datasets can advance our knowledge of rare genetic variants that contribute to risk for schizophrenia.
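The permutation step described above (empirical P-values estimated by permuting case/control labels, then converted to Z-scores) can be illustrated with a minimal sketch. This is not the consortium's pipeline: the abstract tests association by logistic regression with covariates, whereas the sketch below substitutes a simpler statistic (the difference in CNV carrier rate between cases and controls) so that it runs with the Python standard library alone. The function names, the number of permutations, and the one-sided test are all assumptions made for illustration.

```python
import random
from statistics import NormalDist


def carrier_rate_diff(carrier, is_case):
    """Carrier rate in cases minus carrier rate in controls.

    Stand-in test statistic (assumption); the abstract uses logistic
    regression with covariates instead.
    """
    n_case = sum(is_case)
    n_ctrl = len(is_case) - n_case
    case_carriers = sum(c for c, s in zip(carrier, is_case) if s)
    ctrl_carriers = sum(c for c, s in zip(carrier, is_case) if not s)
    return case_carriers / n_case - ctrl_carriers / n_ctrl


def permutation_p_and_z(carrier, is_case, n_perm=2000, seed=0):
    """One-sided empirical P-value by label permutation, plus a Z-score."""
    rng = random.Random(seed)
    observed = carrier_rate_diff(carrier, is_case)
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # break any link between CNV status and phenotype
        if carrier_rate_diff(carrier, labels) >= observed:
            hits += 1
    # The +1 terms keep the estimate strictly above zero.
    p = (hits + 1) / (n_perm + 1)
    # Convert the upper-tail P-value to a standard normal Z-score.
    z = NormalDist().inv_cdf(1 - p) if p < 1 else float("-inf")
    return observed, p, z


if __name__ == "__main__":
    # Toy data (hypothetical): 20 cases and 20 controls, 10 carriers, all cases.
    is_case = [1] * 20 + [0] * 20
    carrier = [1] * 10 + [0] * 30
    print(permutation_p_and_z(carrier, is_case))
```

With only a few thousand permutations the smallest attainable P-value is about 1/(n_perm + 1), so genome-wide thresholds in practice would need far more permutations or a different approximation than this toy version uses.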