A chromosome imbalance map of the human genome. M. Zarrei1, J. R. MacDonald1, R. Ziman1, G. Pellecchia1, D. J. Stavropoulos2, D. Merico1, S. W. Scherer1,3 1) The Centre for Applied Genomics and Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada; 2) Department of Pediatric Laboratory Medicine, Cytogenetics Laboratory, The Hospital for Sick Children, Toronto, ON, Canada; 3) McLaughlin Centre, University of Toronto, Toronto, ON, Canada.

Copy number differences between genomes are a major type of genetic variability. The chromosome imbalance map identifies genome regions that are prone to copy number variation (CNV) in apparently healthy individuals. We defined stringent quality and resolution requirements to select a gold-standard subset of copy number variation studies from the Database of Genomic Variants. The variants from these studies were clustered, separately for deletions and duplications, by 50% reciprocal overlap to identify copy number variable regions (CNVRs), each defined by the outermost coordinates of its cluster. Two chromosome imbalance maps were constructed at different stringency levels. The inclusive map contains CNVRs supported by a minimum of two subjects, i.e., it excludes all singleton variants, whereas the stringent map contains only CNVRs that are supported by a minimum of two subjects and called in a minimum of two different studies. Approximately 9.5% of the human genome is variable according to the inclusive map, and 4.8% according to the stringent map. The pericentric and subtelomeric regions of chromosomes show a particularly high rate of variation, and variability is correlated with the presence of segmental duplications. We assessed the copy number variability of a comprehensive set of genomic features, with particular attention to exonic gene sequence. We found that the exons of all RefSeq genes, taken together, are more variable than the genome as a whole for both gains and losses. However, higher copy number stability is observed for genes that are essential, causally implicated in human disease (Mendelian disorders, cancer), or under negative selection for nonsynonymous variation. Exons of non-coding genes display greater variability, although highly conserved lincRNAs display a higher degree of stability. Enhancers and ultra-conserved elements are more stable than the genome as a whole, whereas proximal promoter regions are more variable.
Functional category analysis revealed that stable genes are enriched for macromolecular complexes (such as the proteasome) and for pathways regulating the cell cycle and organ development, whereas olfactory receptors, xenobiotic metabolism, and certain immune receptor families are enriched in variable regions. Our chromosome imbalance map can be used as an effective tool for identifying variants within copy number variable regions in patient data in clinical settings.
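
As an illustration only, the clustering and filtering steps described above could be sketched as follows. This is not the authors' pipeline: the abstract does not specify the clustering procedure, so the sketch assumes a greedy pass in which a variant joins a cluster only if it has at least 50% reciprocal overlap with every current member. The `Variant` record, the function names, the half-open coordinate convention, and the example data are all hypothetical, and subject identifiers are assumed to be unique across studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical CNV record; coordinates are half-open [start, end)."""
    chrom: str
    start: int
    end: int
    kind: str      # "gain" (duplication) or "loss" (deletion)
    subject: str
    study: str


def reciprocal_overlap(a, b):
    """Return the smaller of the two overlap fractions (0.0 if disjoint)."""
    if a.chrom != b.chrom:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return min(shared / (a.end - a.start), shared / (b.end - b.start))


def cluster(variants, threshold=0.5):
    """Greedy clustering (an assumption, see the text above).

    Gains and losses are never mixed, mirroring the separate treatment
    of deletions and duplications.
    """
    clusters = []
    ordered = sorted(variants, key=lambda v: (v.chrom, v.kind, v.start, v.end))
    for v in ordered:
        for members in clusters:
            same_kind = members[0].kind == v.kind
            if same_kind and all(
                reciprocal_overlap(v, m) >= threshold for m in members
            ):
                members.append(v)
                break
        else:
            clusters.append([v])
    return clusters


def to_cnvr(members):
    """Summarise a cluster as a CNVR spanning its outermost coordinates."""
    return {
        "chrom": members[0].chrom,
        "kind": members[0].kind,
        "start": min(m.start for m in members),
        "end": max(m.end for m in members),
        "subjects": {m.subject for m in members},
        "studies": {m.study for m in members},
    }


def build_maps(variants):
    """Return (inclusive, stringent) lists of CNVRs."""
    cnvrs = [to_cnvr(c) for c in cluster(variants)]
    # Inclusive: at least two subjects, so singletons are dropped.
    inclusive = [r for r in cnvrs if len(r["subjects"]) >= 2]
    # Stringent: additionally called in at least two different studies.
    stringent = [r for r in inclusive if len(r["studies"]) >= 2]
    return inclusive, stringent


# Made-up example data, for illustration only.
example = [
    Variant("chr1", 100, 200, "loss", "s1", "studyA"),
    Variant("chr1", 110, 210, "loss", "s2", "studyA"),
    Variant("chr1", 1000, 2000, "gain", "s3", "studyA"),
    Variant("chr1", 1100, 2100, "gain", "s4", "studyB"),
    Variant("chr2", 500, 600, "loss", "s5", "studyB"),
]
inclusive, stringent = build_maps(example)
```

In this toy input, the two chr1 losses come from two subjects in a single study, so their region would appear only in the inclusive map; the two chr1 gains come from two subjects in two studies, so their region would appear in both maps; the chr2 loss is a singleton and appears in neither.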

You may contact the first author (during and after the meeting) at