HapFABIA: Identification of very short segments of identity by descent (IBD) via biclustering.
S. Hochreiter, G. Povysil
Institute of Bioinformatics, Johannes Kepler University Linz, Linz, Austria.
Identity by descent (IBD) can be detected reliably for long shared DNA segments, which are found in related individuals. However, many studies contain cohorts of unrelated individuals who share only short IBD segments. New sequencing technologies facilitate the identification of short IBD segments through rare variants, which convey more information on IBD than common variants. Current IBD detection methods, however, are not designed to use rare variants for detecting short IBD segments. Short IBD segments reveal genetic structures at high resolution. They can therefore help to improve imputation and phasing, to increase genotyping accuracy for low-coverage sequencing, and to increase the power of association studies. Because short IBD segments are also assumed to be old, they can shed light on the evolutionary history of humans. We propose HapFABIA, a computational method that applies a biclustering technique to identify very short IBD segments characterized by rare variants. HapFABIA significantly outperformed competing algorithms at detecting short IBD segments on artificial and simulated data with rare variants. In data for chromosome 1 from the 1000 Genomes Project, HapFABIA identified short IBD segments characterized by rare variants with a median length of 25 kbp. IBD segments that match the Denisovan or the Neandertal genome (archaic genomes) are shared by either a very low or a very high proportion of Africans. IBD segments that match archaic genomes are enriched at lengths of 0 to 12 kbp (about 130 kyr in the past) and 38 to 60 kbp (13 to 20 kyr in the past). IBD segments that match an archaic genome and are 0 to 12 kbp long are overrepresented in Africans, while those that are 38 to 60 kbp long are mainly found in Asians or Europeans.
Both the distributions of proportions and the IBD segment lengths hint at two events: (1) an admixture of humans and archaic genomes outside of Africa and (2) an admixture of humans and archaic genomes within Africa, or the survival of ancient DNA segments in the African population.
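The intuition that rare variants tag short IBD segments can be illustrated with a small sketch. It is not the HapFABIA algorithm, which relies on biclustering; it is a much simpler pairwise heuristic that flags two haplotypes when they share several rare minor alleles within a short window. The function name `rare_variant_sharing`, the thresholds (`max_freq`, `window_bp`, `min_shared`), and the toy data are all assumptions made for illustration and do not come from the method itself.

```python
from itertools import combinations


def rare_variant_sharing(genotypes, positions, max_freq=0.1,
                         window_bp=25_000, min_shared=3):
    """Toy pairwise heuristic (hypothetical, NOT HapFABIA's biclustering).

    genotypes: list of haplotypes, each a list of 0/1 minor-allele flags.
    positions: base-pair position of each variant (same order as columns).
    Returns (pair, start_bp, end_bp) for each haplotype pair that shares at
    least `min_shared` rare minor alleles within `window_bp` base pairs.
    """
    n = len(genotypes)

    # Step 1: keep rare variants, i.e. minor alleles carried by at least two
    # haplotypes but by no more than a fraction `max_freq` of all haplotypes.
    rare = []
    for j, pos in enumerate(positions):
        carriers = [i for i in range(n) if genotypes[i][j] == 1]
        if len(carriers) >= 2 and len(carriers) / n <= max_freq:
            rare.append((pos, carriers))

    # Step 2: for every pair of carriers, record where they share a rare allele.
    shared = {}
    for pos, carriers in rare:
        for pair in combinations(carriers, 2):
            shared.setdefault(pair, []).append(pos)

    # Step 3: slide a window over each pair's shared positions and report the
    # first window that contains enough shared rare alleles.
    segments = []
    for pair, pos_list in shared.items():
        pos_list.sort()
        start = 0
        for end in range(len(pos_list)):
            while pos_list[end] - pos_list[start] > window_bp:
                start += 1
            if end - start + 1 >= min_shared:
                segments.append((pair, pos_list[start], pos_list[end]))
                break
    return segments


# Toy data (assumed): 40 haplotypes, 6 variants.
positions = [1_000, 5_000, 7_000, 9_000, 12_000, 200_000]
carriers_by_variant = {
    0: [3, 17], 1: [3, 17], 3: [3, 17],   # three nearby rare variants
    2: [5, 8],                            # a single shared rare variant
    4: list(range(0, 40, 2)) + [3, 17],   # a common variant, filtered out
    5: [3, 17],                           # rare, but far from the others
}
genotypes = [[0] * len(positions) for _ in range(40)]
for j, carriers in carriers_by_variant.items():
    for i in carriers:
        genotypes[i][j] = 1

segments = rare_variant_sharing(genotypes, positions)
print(segments)
```

In this toy setting only haplotypes 3 and 17 are flagged, because they share three rare alleles within a few kbp; the common variant carries no signal and a single shared rare allele is not enough. A biclustering approach generalizes this idea from pairs to groups of individuals that share the same set of rare variants.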
You may contact the first author (during and after the meeting) at