Transcriptome sequence analysis of human colorectal cancer samples reveals cancer functional attributes. H. Ongen1,2,3, T. F. Orntoft4, J. B. Bramsen4, B. Oster4, L. Romano1,2,3, A. Planchon1,2,3, C. L. Andersen4, E. T. Dermitzakis1,2,3 1) Department of Genetic Medicine and Development, University of Geneva, Geneva, Switzerland; 2) SIB, Swiss Institute of Bioinformatics, Lausanne, Switzerland; 3) iGE3, Institute of Genetics and Genomics in Geneva, Geneva, Switzerland; 4) Department of Molecular Medicine, Aarhus University Hospital, Aarhus, Denmark.

   In developed countries, colorectal cancer (CRC) is the second leading cause of cancer death, with a million new cases worldwide each year and a mortality rate of ~50%. Among cancers, the incidence of CRC ranks fourth in men and third in women, and it is expected to grow as westernized lifestyles spread. Here we report the RNA-seq analysis of matched tumor and normal mucosa from 103 CRC patients and of 20 other normal tissues from unrelated healthy donors. The other normal tissues are used to assess how CRC differs not only from its normal counterpart but also from other normal tissue types. RNA was purified from microdissected samples (tumor percentage 60%) and sequenced on the Illumina HiSeq (34-80 million reads). Furthermore, the germline genomes of 90 of the CRC patients have been genotyped. We observe 1626 differentially expressed genes (FDR = 5%, fold change 2) between normal colon and cancer. Tumors, normal colons, and other tissue samples form three distinct clusters based on gene expression, and the tumors cluster between the other two, suggesting that cancer increases the variance of the transcriptome but in a predictable manner. On average there are 688 significant allele-specific expression (ASE) signals (FDR = 1%) per sample. The proportion of heterozygous sites that show an ASE effect is significantly higher in tumors. About 34% of the ASE sites are tumor-specific, and ~10% of the ASE sites shared between normal and tumor exhibit a reversal of the effect, suggestive of loss of heterozygosity in these genes. In contrast to overall gene expression, when ASE is considered, genetic effects on gene expression are more similar between tumors and their matched controls than between different tumors. We identify multiple regions on nearly all chromosomes where the correlation of expression between proximal genes is significantly increased in the tumors compared to the normals. There are 1693 and 948 eQTLs (permutation P < 0.01) in normal and tumor samples, respectively. Approximately 60% of the eQTLs are shared between normal and tumor, and there are over 300 tumor-specific eQTLs. We are investigating whether the tumor-specific eQTL genes accumulate somatic mutations, which would make them likely candidates for driver genes. Preliminary analysis indicates that there are 124 gene fusions in the tumor population studied, although none are recurrent. Altogether, these results will greatly benefit our understanding of colorectal tumorigenesis.
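
   The abstract does not specify how ASE signals were called. As an illustration only, one common approach can be sketched as follows: at each heterozygous site, the reference and alternative allele read counts are tested against an expected 50:50 ratio with an exact two-sided binomial test, and the resulting P values are adjusted with the Benjamini-Hochberg procedure before applying the FDR = 1% threshold. The function names, the 0.5 null ratio, and the example read counts are assumptions made for this sketch and are not taken from the study.

```python
from math import exp, lgamma, log


def binomial_two_sided_p(ref_reads, alt_reads, p_ref=0.5):
    """Exact two-sided binomial P value for allelic imbalance at one site.

    Assumption for this sketch: under the null, reads from the reference
    allele occur with probability p_ref (0.5 means no imbalance).
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0

    def pmf(k):
        # Binomial probability computed in log space to avoid overflow
        # at high read depth.
        log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
        return exp(log_coeff + k * log(p_ref) + (n - k) * log(1 - p_ref))

    probs = [pmf(k) for k in range(n + 1)]
    observed = probs[ref_reads]
    # Sum the outcomes that are no more likely than the observed one;
    # the small relative tolerance absorbs floating-point ties.
    total = sum(p for p in probs if p <= observed * (1 + 1e-7))
    return min(1.0, total)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def call_ase(site_counts, fdr=0.01):
    """Flag sites whose adjusted P value passes the FDR threshold.

    site_counts is a list of (ref_reads, alt_reads) tuples, one per
    heterozygous site in a single sample.
    """
    p_values = [binomial_two_sided_p(ref, alt) for ref, alt in site_counts]
    q_values = benjamini_hochberg(p_values)
    return [q <= fdr for q in q_values]


if __name__ == "__main__":
    # Hypothetical read counts for four heterozygous sites.
    example_sites = [(50, 50), (90, 10), (30, 0), (12, 8)]
    print(call_ase(example_sites))
```

   In practice, a null ratio other than 0.5 is sometimes used to account for reference mapping bias, and sites with low coverage are usually filtered out before testing; neither step is shown here.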