Addressing the complexity of cancer: integrative genomic and transcriptomic analysis of 775 human cancer cell lines reveals novel drivers and regulatory programs. A. C. Villani1, C. Ye1, A. Regev1,2,3, N. Hacohen1,4,5 1) Genome Biology & Cell Circuits, Broad Institute of MIT and Harvard, Cambridge, MA; 2) Howard Hughes Medical Institute, Chevy Chase, MD; 3) Biology Dept, Massachusetts Institute of Technology, Cambridge, MA; 4) Harvard Medical School, Boston, MA; 5) Center for Immunology and Inflammatory Disease, Massachusetts General Hospital, Charlestown, MA.

   Cancer is a genetically driven disease, and human cancer cell lines carrying such genetic drivers are a valuable model system for mechanistic studies of autonomous tumor-driven pathways in homogeneous, controlled experimental settings. A putative driver that is altered in several cancer cell lines from different tissues of origin may point towards novel mechanisms of tumorigenesis. We postulated that large-scale integrative genomic analysis of cancer cell line repositories could identify novel master regulators of autonomous programs, as well as key cell lines for follow-up functional studies. We performed an unbiased integrative analysis of chromosomal copy number aberrations (CNAs) and gene expression data (Affymetrix U133+2.0) from 775 human cancer cell lines derived from 23 different tissues of origin. Using modified versions of GISTIC2.0 and CONEXIC, a Bayesian module network method, we sought to identify novel cancer driver genes that commonly alter transcriptional regulatory programs across multiple cell lines. To account for the inherent heterogeneity in expression data, we corrected for known confounding variables (histology, sub-histology, ethnicity, gender, center of collection, experimental batch) and for unknown confounding variables using Surrogate Variable Analysis (SVA). We identified a total of 60 amplified and 222 deleted somatic regions, of which 52% contained a candidate driver gene whose expression was associated with the expression of a target module. The driver genes included both putative novel and known cancer drivers, among them deletions of CDKN2A-CDKN2B (q = 4.2E-176) and WWOX (q = 5.3E-100), and amplifications of MYC (q = 2.1E-81) and CCND1 (q = 1.4E-44). The target modules were enriched for processes known to be dysregulated in cancer, including cell cycle (q = 2.1E-91) and DNA damage/repair (q = 9.3E-50) programs.
Interestingly, we also identified novel immune (q = 9.8E-26) and metabolic (q = 1.5E-17) cancer-autonomous programs common to several tissues of origin. These latter programs may be key to driving polarization of the tumor microenvironment. Several cell lines were nominated for ongoing follow-up functional studies to validate these predictions. Our results provide a framework to identify putative novel drivers, nominate cell lines for follow-up functional studies, and highlight altered pathways that are common to several cancer models and of biological, and possibly therapeutic, importance.
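
The q values reported above are false-discovery-rate-adjusted significance values. The abstract does not state which adjustment procedure was used; as a minimal sketch, assuming a Benjamini-Hochberg adjustment, q values could be derived from raw p-values as follows. The function name and the input p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q-values, in the input order.

    Assumption: BH is used here for illustration only; the abstract
    does not specify the FDR procedure behind its reported q values.
    """
    m = len(p_values)
    # Indices of the tests, sorted by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that each
    # q-value is capped by the q-values of all higher-ranked tests.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Hypothetical p-values for four candidate regions (not study data).
example_p = [0.001, 0.04, 0.03, 0.5]
example_q = benjamini_hochberg(example_p)
```

Walking from the largest p-value downwards keeps the adjusted values monotone, so a region with a smaller p-value never receives a larger q value than one ranked below it.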

You may contact the first author (during and after the meeting) at