Program Nr: 74 for the 2006 ASHG Annual Meeting

Mapping trait loci using inferred Ancestral Recombination Graphs. M.J. Minichiello, R. Durbin. Wellcome Trust Sanger Institute, Cambridge, United Kingdom.
    Large-scale association studies are being undertaken in the hope of uncovering the genetic determinants of complex disease. We will describe a computationally efficient method for inferring genealogies from population genotype data, and will show how these genealogies can be used to fine-map disease loci and dissect association signals.
    These genealogies take the form of the Ancestral Recombination Graph (ARG). The ARG defines a genealogical tree for each locus, and as one moves along the chromosome, the topologies of consecutive trees change to reflect historical recombination events. There are two stages to our analysis. First, we infer plausible ARGs using a heuristic algorithm, which can handle unphased and missing data, and is fast enough to be applied to large-scale studies involving thousands of individuals. Second, we test the genealogical tree at each locus for a clustering of the disease cases beneath a branch, which would indicate that a causative mutation occurred on that branch. Since the true ARG is unknown, we average this analysis over an ensemble of inferred ARGs.
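    The second stage can be illustrated with a minimal sketch. It assumes that each marginal tree is stored as a dictionary mapping a node to its list of children, that the leaves are sampled chromosomes labelled as case or control, and that clustering beneath a branch is scored with a Pearson chi-square statistic on a 2x2 table. The function names, the tree representation, and the choice of statistic are illustrative assumptions and are not taken from the method described here; a real analysis would also need to assess significance, for example by permutation.

```python
from math import fsum


def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # a degenerate table carries no evidence of clustering
    return n * (a * d - b * c) ** 2 / denom


def leaves_below(tree, node):
    """Return the set of leaves descending from `node`.

    `tree` maps each internal node to a list of its children (assumed layout);
    a node that is absent from the mapping is treated as a leaf.
    """
    children = tree.get(node, [])
    if not children:
        return {node}
    found = set()
    for child in children:
        found |= leaves_below(tree, child)
    return found


def best_branch_score(tree, root, is_case):
    """Largest chi-square over all branches of one marginal tree.

    Each branch is identified by the node at its lower end. The 2x2 table
    compares case/control counts below the branch with those elsewhere.
    """
    all_leaves = leaves_below(tree, root)
    n_cases = sum(is_case[leaf] for leaf in all_leaves)
    n_controls = len(all_leaves) - n_cases
    best = 0.0
    stack = list(tree.get(root, []))
    while stack:
        node = stack.pop()
        stack.extend(tree.get(node, []))
        below = leaves_below(tree, node)
        cases_below = sum(is_case[leaf] for leaf in below)
        controls_below = len(below) - cases_below
        score = chi2_2x2(cases_below, controls_below,
                         n_cases - cases_below, n_controls - controls_below)
        best = max(best, score)
    return best


def ensemble_score(trees, is_case):
    """Average the best-branch score over the marginal trees at one locus,
    one (tree, root) pair per inferred ARG in the ensemble."""
    scores = [best_branch_score(tree, root, is_case) for tree, root in trees]
    return fsum(scores) / len(scores)
```

    In this sketch, a locus at which the cases fall beneath a single branch in most of the inferred trees receives a high averaged score, whereas a locus at which the clustering appears in only a few trees is down-weighted.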
    We have characterised the performance of our method across a wide range of simulated disease models. Compared to single-marker and haplotype-based tests, our method gives increased power and more accurate positioning of untyped causative loci. It can also be used to estimate the frequencies of untyped causative alleles and the haplotypic background on which causative mutations occurred. We have applied our method to Ueda et al.'s association study of CTLA4 and Graves' disease, showing how it can be used to dissect the association signal; the results suggest allelic heterogeneity and epistasis.
    Similar approaches analysing an ensemble of ARGs inferred using our method may be applicable to many other problems of inference from population genotype data, such as detecting population substructure and selection.