Whole-Genome Sequencing Study of ~6,000 Samples for Age-related Macular Degeneration. A. Kwong1,2, X. Zhan1,2, L. G. Fritsche1,2, J. Bragg-Gresham1,2, K. E. Branham3, M. Othman3, A. Boleda6, L. Gieser6, R. Ratnapriya6, D. Stambolian4, E. Y. Chew5, A. Swaroop6, G. Abecasis1,2 1) Department of Biostatistics, University of Michigan, Ann Arbor, MI; 2) Center for Statistical Genetics, University of Michigan, Ann Arbor, MI; 3) Department of Ophthalmology and Visual Sciences, University of Michigan Kellogg Eye Center, Ann Arbor, MI; 4) Department of Ophthalmology and Human Genetics, University of Pennsylvania Medical School, Philadelphia, PA; 5) Division of Epidemiology and Clinical Applications, National Eye Institute/National Institutes of Health, Bethesda, MD; 6) Neurobiology-Neurodegeneration and Repair Laboratory, National Eye Institute/National Institutes of Health, Bethesda, MD.
Purpose: Age-related macular degeneration (AMD) is a leading cause of blindness among the elderly. Over the past several years, genetic studies of common variation have provided many clues about disease biology. Due to assay limitations, these studies have typically either ignored rare variants or examined them only in a small set of candidate regions. Here, we set out to systematically study the contribution of rare variants to disease. Methods: We assembled a collection of ~3,000 cases with advanced AMD (67% neovascularization, 33% geographic atrophy) and ~3,000 controls from the Kellogg Eye Center at the University of Michigan, the Age-Related Eye Disease Study from the NEI, and the University of Pennsylvania. We matched cases and controls according to age, gender, and ethnicity. The large number of samples to be processed presented a computational challenge. Our data will enable a systematic assessment of coding and non-coding variation in previously associated loci as well as a genomewide search for new risk alleles. Results: As of today, ~3,000 samples have been sequenced, representing >55 Terabytes (5.5 x 10^13 bytes) of sequence data. This corresponds to a total genomic coverage of ~18,000x and an average coverage of ~6x per sample. In an initial analysis of a subset of the data, we discovered and genotyped ~31 million variants. The set includes several previously studied rare AMD risk variants in complement genes (such as CFH:p.R1210C, CFI:p.G119R, C9:p.P167S, and C3:p.K155Q), as well as many new functionally interesting variants, such as 20 missense mutations and 1 nonsense mutation in the CFH gene that are very rare (median minor allele frequency = 0.05%) and missing from previous studies. Among the variants in the current dataset, we found 172,971 non-synonymous SNPs and 7,601 loss-of-function SNPs. Conclusions: We provide a first detailed look at the genetics of AMD through whole-genome sequencing of ~6,000 individuals. 
Our data will enable a systematic genomewide search for rare risk alleles and should allow us to evaluate the effect of nonsense variants in many previously associated genes.
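As a rough consistency check on the sequencing figures quoted in the Results, the arithmetic can be sketched in a few lines of Python. The sample count, data volume, and total coverage are the approximate values from the abstract; the genome size of ~3.1 x 10^9 bases is an assumption added here for illustration and is not stated in the abstract.

```python
# Approximate figures quoted in the abstract.
SAMPLES_SEQUENCED = 3_000   # "~3,000 samples have been sequenced"
TOTAL_BYTES = 5.5e13        # ">55 Terabytes" of sequence data
TOTAL_COVERAGE = 18_000     # "~18,000x" summed over all samples

# Assumption (not from the abstract): haploid human genome of ~3.1e9 bases.
GENOME_BASES = 3.1e9


def summarize(samples, total_bytes, total_coverage, genome_bases):
    """Derive per-sample figures from the study-wide totals."""
    coverage_per_sample = total_coverage / samples
    bytes_per_sample = total_bytes / samples
    # Sequenced bases per sample = depth x genome length.
    bases_per_sample = coverage_per_sample * genome_bases
    return {
        "coverage_per_sample": coverage_per_sample,
        "gigabytes_per_sample": bytes_per_sample / 1e9,
        "bytes_per_sequenced_base": bytes_per_sample / bases_per_sample,
    }


summary = summarize(SAMPLES_SEQUENCED, TOTAL_BYTES, TOTAL_COVERAGE, GENOME_BASES)

if __name__ == "__main__":
    for name, value in summary.items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions, the ~18,000x total divided over ~3,000 samples reproduces the reported ~6x average coverage, and the data volume works out to roughly 18 GB per sample, or about one byte of stored data per sequenced base.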
You may contact the first author (during and after the meeting) at