High-throughput Determination of Long INterspersed Element-1 Integration Preferences in the Human Genome.
D. A. Flasch1, A. Macia5, T. Widmann5, J. L. García-Pérez5, T. E. Wilson1,2, J. V. Moran1,3,4
1) Department of Human Genetics; 2) Department of Pathology; 3) Department of Internal Medicine, 1241 E. Catherine Street, University of Michigan Medical School, Ann Arbor, Michigan 48109-5618, USA; 4) Howard Hughes Medical Institute; 5) GENYO (Centre for Genomics and Oncological Research), Granada, Spain.
Long INterspersed Element-1 (LINE-1 or L1) retrotransposon-derived sequences comprise ~17% of the human genome reference sequence (HGR). However, since the majority of L1 retrotransposition events occurred millions of years ago, Darwinian selective pressures have skewed their initial genomic distributions. Thus, new and unbiased assessments are needed to accurately survey L1 integration preferences.
Here, we have exploited engineered L1s to generate de novo L1 retrotransposition events in various human cell lines. We used PCR-based strategies to specifically amplify the 3′ ends of engineered human L1 retrotransposition events and their associated flanking genomic DNA sequences. The resultant amplicons were then sequenced using the Pacific Biosciences circular consensus DNA sequencing platform and passed through a bioinformatics pipeline to call integration sites at single-nucleotide resolution with high accuracy and sensitivity. To date, we have characterized ~23,000 L1 insertions in HeLa cells, ~30,000 insertions in ovarian carcinoma cells, ~500 insertions in human embryonic stem cells (hESCs), and ~900 insertions in hESC-derived neural progenitor cells. This large data set should provide the statistical power to determine whether L1 preferentially integrates into specific genomic regions and whether L1 integration preferences differ between cell types.
Our preliminary data revealed that, depending on the cell type, approximately 22-38% of the engineered L1 insertions resided within introns. Approximately 40-45% of these intronic insertions occurred within the largest intron of the gene. Collectively, we discovered over 700 L1 integration events into the 5′ untranslated region (UTR), coding exons, or 3′ UTR of genes; such insertions are relatively infrequent in the HGR. These observations suggest that euchromatic regions of the genome are accessible and susceptible to de novo L1 integration events. We are currently exploring whether other features (e.g., DNaseI hypersensitive sites, acetylation and methylation sites, replication origins, transcription start sites, and intergenic regions) render genomic DNA vulnerable to L1 integration. In sum, our strategy allows an accurate assessment of L1 integration site preferences before they are blurred by selective pressures that act over evolutionary time.
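One way to ask whether a feature class (for example, introns) is preferred is to compare the observed number of insertions in that class with the number expected if insertions landed uniformly across the genome. The Python sketch below is illustrative only and is not the pipeline used in this study. The function names, the 0-based half-open coordinate convention, and the background fraction passed to the test are all assumptions introduced for the example.

```python
import math


def count_in_intervals(insertions, intervals):
    """Count insertions that fall inside any annotated interval.

    insertions: iterable of (chrom, pos) tuples, 0-based positions.
    intervals:  iterable of (chrom, start, end) tuples, 0-based, half-open.
    Both formats are assumptions made for this sketch.
    """
    by_chrom = {}
    for chrom, start, end in intervals:
        by_chrom.setdefault(chrom, []).append((start, end))
    hits = 0
    for chrom, pos in insertions:
        # Linear scan per insertion; adequate for a toy example only.
        if any(start <= pos < end for start, end in by_chrom.get(chrom, [])):
            hits += 1
    return hits


def _log_binom_pmf(k, n, p):
    """Log of the Binomial(n, p) probability mass at k, for 0 < p < 1."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))


def binom_upper_tail(k, n, p):
    """One-sided P(X >= k) for X ~ Binomial(n, p), summed in log space.

    p is the fraction of insertions expected in the feature under a uniform
    model (for example, the fraction of the mappable genome it covers).
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if k <= 0:
        return 1.0
    logs = [_log_binom_pmf(i, n, p) for i in range(k, n + 1)]
    peak = max(logs)
    # Factor out the largest term to avoid underflow for large n.
    return min(1.0, math.exp(peak) * sum(math.exp(v - peak) for v in logs))
```

As a usage example, `binom_upper_tail(hits, total, background)` with `hits = count_in_intervals(insertions, intron_intervals)` gives a one-sided p-value for enrichment. Here `intron_intervals` and `background` are hypothetical inputs that would have to come from a gene annotation and a genome-coverage calculation. A real analysis would also need to account for mappability and for testing many features at once.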
You may contact the first author (during and after the meeting) at