A genome-wide non-synonymous SNP panel. L.M. Galver1, P.C. Ng1, S. Hunt2, J. Whitacre1, R. Shen1, P. Deloukas2, S.S. Murray1. 1) Illumina, Inc., San Diego, CA; 2) Wellcome Trust Sanger Institute, Hinxton, United Kingdom.
Approximately 90% of all genetic variation in humans is attributed to single nucleotide polymorphisms (SNPs) (Collins et al., 1998). Many SNPs are believed to cause phenotypic differences and can be related to an individual's susceptibility to disease. SNPs in coding regions that cause amino acid changes, known as non-synonymous SNPs (nsSNPs), as well as SNPs in regulatory regions, are believed to have the highest impact on phenotype because they directly affect protein structure and function (Collins et al., 1997). We have developed a genome-wide panel of nsSNPs that will be a valuable resource, either as a whole-genome screen of nsSNPs alone or as a complement to other whole-genome or candidate-gene association studies.
SNPs were selected for this study by screening public databases for all annotated nsSNPs. Approximately 50,000 SNPs were screened for assay designability on the Illumina BeadArray platform, and approximately 38,000 were screened for functionality on 90 CEU (Caucasian) samples from the HapMap project. More than 13,000 functional SNP assays, representing more than 5,800 genes, were chosen for the final panel. The final assay panel was designed using a whole-genome genotyping assay on Illumina's multi-sample bead chips, and 270 HapMap samples were genotyped for validation.
Preliminary results from the analysis of 270 HapMap samples show an average minor allele frequency (MAF) of 0.18 for the CEU population, 0.16 for CHB/JPT and 0.16 for YRI. We observed that a significant percentage of the nsSNPs that were monomorphic in a single population were polymorphic in one of the other two populations, indicating population-level differences and possibly selection. Comparing the dN/dS ratio (Nielsen et al., 2005) to the number of nsSNPs per gene, we observed that highly conserved genes have fewer nsSNPs. We also confirmed that putatively damaging nsSNPs (predicted by SIFT; Ng and Henikoff, 2002) tended to be present at lower MAFs, as has been predicted in previous studies (Livingston et al., 2004; Leabman et al., 2003).
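As an illustration of the per-population quantities discussed above, the following minimal Python sketch computes a MAF from genotype calls and flags a SNP that is monomorphic in one population but polymorphic in another. It is not the analysis pipeline used in this study: the function names, the two-letter genotype encoding, and the example calls are all hypothetical, and the sketch assumes biallelic SNPs with missing calls given as None.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """Return the MAF of one biallelic SNP within one population.

    genotypes: iterable of two-letter calls such as "AA", "AG", "GG".
    Missing calls (None or "") are skipped. Returns None if no calls remain.
    """
    allele_counts = Counter()
    for call in genotypes:
        if call:
            allele_counts.update(call)  # counts each allele letter in the call
    total = sum(allele_counts.values())
    if total == 0:
        return None
    if len(allele_counts) > 2:
        raise ValueError("expected a biallelic SNP")
    if len(allele_counts) == 1:
        return 0.0  # monomorphic: only one allele observed
    return min(allele_counts.values()) / total


def is_population_specific(maf_by_population):
    """True if the SNP is monomorphic in at least one population
    and polymorphic in at least one other."""
    mafs = [m for m in maf_by_population.values() if m is not None]
    return any(m == 0.0 for m in mafs) and any(m > 0.0 for m in mafs)


# Hypothetical genotype calls for a single nsSNP (illustrative data only).
example_calls = {
    "CEU": ["AA", "AA", "AA", "AA"],
    "CHB/JPT": ["AA", "AG", "AA", "AA"],
    "YRI": ["AG", "GG", "AA", None],
}

maf_by_population = {
    population: minor_allele_frequency(calls)
    for population, calls in example_calls.items()
}
print(maf_by_population)
print(is_population_specific(maf_by_population))
```

In this sketch the minor allele is determined separately within each population, so a population-average MAF would be the mean of these per-SNP values over all SNPs in the panel.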