MCMC provides practical linkage analysis on general pedigrees with many STRs or dense SNPs. E. Wijsman1,2, J. Rothstein2, E. Thompson3. 1) Dept. Medical Genetics; 2) Dept. Biostatistics; 3) Dept. Statistics; Univ Washington, Seattle, WA.
Linkage analysis needs to adapt to full-chromosome multipoint analysis with either SNPs or STRs. While exact computational tools are available for small pedigrees, equivalent exact computation for large pedigrees remains intractable, and Markov chain Monte Carlo (MCMC) based methods currently provide the only computationally practical option. No systematic comparison of the performance of MCMC-based programs is available, nor have these programs been systematically evaluated for use with dense SNPs. We used simulated data to evaluate the performance of two MCMC-based linkage analysis programs: lm_markers (LMM) from the MORGAN package and SimWalk2 (SW). Pedigrees consisted of 14, 52, or 98 individuals in 3-6 generations, with up to 4 generations of missing data in the largest pedigree. One hundred replicates of marker and trait data were simulated on a 100-cM chromosome, with up to 10 STRs and up to 200 SNPs used simultaneously for computation of multipoint LOD scores. Exact computation was available for comparison in most situations; otherwise, results were compared with those from a perfectly informative marker or between programs. Both programs were fast and accurate with STRs. For example, computation on the 52-member pedigree (PED52) for 3 markers required 2-2.5 min. with LMM, SW, or Vitesse, and the median discrepancy, d, relative to exact LOD scores was 10%. Even for 10 markers, computation times for the MCMC programs increased to only 5.7 and 14 CPU min. for LMM and SW, respectively. In contrast, for large numbers of dense SNPs, only LMM was able to provide accurate results in computationally practical time. For PED52 and an analysis of 67 dense SNPs, LMM required only ~11 CPU min/pedigree with a median d=19%, compared to ~11 CPU hrs/pedigree for SW with a median d=62%. Similar results were obtained for the smallest pedigree, with LMM requiring 2 min to achieve a median d=10% relative to the truth, while SW required 66 min to achieve a median d=20% relative to the truth.
Thus the MORGAN package provides a computationally practical option for accurate linkage analyses in genome scans with both large numbers of SNPs and large pedigrees. Supported by NIH GM46255.
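The accuracy measure used above, the median discrepancy d relative to exact LOD scores, can be illustrated with a minimal sketch. The abstract does not give a formula for d, so the definition here is an assumption: d is taken as the median of |estimated - exact| / |exact| over paired LOD scores. The function name, the threshold parameter, and the input values are hypothetical and are not data from the study.

```python
from statistics import median


def median_relative_discrepancy(estimated, exact, min_abs_exact=1e-9):
    """Median of |estimated - exact| / |exact| over paired LOD scores.

    Assumed definition of d; the abstract does not state the formula.
    """
    if len(estimated) != len(exact):
        raise ValueError("paired lists must have equal length")
    ratios = [
        abs(est - ref) / abs(ref)
        for est, ref in zip(estimated, exact)
        if abs(ref) > min_abs_exact  # skip positions where the exact LOD is ~0
    ]
    if not ratios:
        raise ValueError("no positions with a nonzero exact LOD score")
    return median(ratios)


# Made-up illustrative values, not results from the simulations.
exact_lods = [2.0, 1.0, -0.5, 4.0]
mcmc_lods = [2.2, 0.9, -0.6, 4.0]

d = median_relative_discrepancy(mcmc_lods, exact_lods)
print(f"median d = {d:.0%}")
```

Positions with an exact LOD score near zero are skipped because a relative discrepancy is undefined there; an absolute difference could be used instead if the intended definition of d differs.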