Comprehensive Blood Group Prediction Using Whole Genome Sequencing Data from The MedSeq Project. W. J. Lane1,2,3, I. Leshchiner4, S. Boehler1, J. M. Uy1, M. Aguad1, R. Smeland-Wagman1, R. C. Green3,6, H. L. Rehm1,3,5, R. M. Kaufman1, L. E. Silberstein7 for The MedSeq Project 1) Department of Pathology, Brigham and Women's Hospital, Boston, MA; 2) Harvard Medical School Transfusion Medicine Fellow, Boston, MA; 3) Harvard Medical School, Boston, MA; 4) Genetics Division, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; 5) Laboratory for Molecular Medicine, Partners Healthcare Center for Personalized Genetic Medicine, Cambridge, MA; 6) Department of Medicine, Brigham and Women's Hospital, Boston, MA; 7) Division of Transfusion Medicine, Department of Laboratory Medicine, Children's Hospital Boston, Boston, MA.

   There are 339 phenotypically distinct red blood cell (RBC) blood group antigens. For 298 of these antigens, the molecular basis is known, comprising 48 genes and 1,100 alleles distributed across 34 blood group systems. Exposure to non-self RBC antigens during transfusion or pregnancy can lead to the development of alloantibodies, which on re-exposure can cause clinically significant and even fatal complications. Therefore, it is vital to know which antigens are present on RBCs. However, traditional serologic phenotyping methods are labor intensive, costly, and sometimes unreliable, and reagents are not always available. As a result, routine antigen typing is performed only for the ABO and D antigens. A large percentage of blood is given to patients with hematologic malignancies, who will soon undergo routine whole genome sequencing (WGS). For a minor added cost, these data could be used for RBC antigen prediction. However, there are no published reports of using WGS data to predict RBC antigens. This is likely for several reasons: (1) none of the existing WGS data sets have paired serologic RBC phenotypes, (2) there are no fully annotated and complete databases of genotypes to phenotypes, (3) all of the known alleles are defined using cDNAs numbered relative to the start codon without human genome coordinates, and (4) there is no software capable of RBC antigen prediction. We have created a fully interactive web site of all known blood group genotype to phenotype correlations, fully annotated with relevant information, mapped to and visually overlaid on the corresponding human reference genome gene sequences, with algorithms to predict antigen phenotypes from input sequences. These predictions are part of the General Genome Reports for the 100 patients getting WGS as part of The MedSeq Project.
We are also interpreting the antigen patterns to identify patients at risk of making difficult-to-match alloantibodies, potential rare donors, and those with blood group-associated resistance to malaria and norovirus. In addition, each patient is undergoing an extensive antigen phenotypic workup using traditional blood bank serology, which is being used to validate and improve our prediction strategies. As clinical WGS becomes pervasive, we hope that comprehensive blood group prediction will be done on everyone, allowing for easy identification of rare donors and the prevention of alloantibody formation using extended upfront matching of antigens from sequenced recipients and donors.
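
The coordinate problem noted in reason (3) above, that alleles are defined by cDNA positions counted from the start codon rather than by genome coordinates, can be illustrated with a minimal sketch. It is not the software described in this abstract. The function name `cdna_to_genomic`, the exon layout, and all coordinates are hypothetical toy values chosen for illustration; the sketch assumes coding exon segments are given as 1-based inclusive genomic intervals listed in transcript order.

```python
from typing import List, Tuple


def cdna_to_genomic(cdna_pos: int,
                    cds_exons: List[Tuple[int, int]],
                    strand: str) -> int:
    """Map a 1-based coding position (c.1 = first base of the start codon)
    to a 1-based genomic coordinate.

    cds_exons holds the coding part of each exon as (start, end), with
    start <= end in genomic coordinates, listed in transcript order
    (so for a minus-strand gene the highest coordinates come first).
    This is a hypothetical helper, not part of any published tool.
    """
    if cdna_pos < 1:
        raise ValueError("cDNA position must be >= 1")
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")

    remaining = cdna_pos
    for start, end in cds_exons:
        length = end - start + 1
        if remaining <= length:
            # Walk forward from the exon start on the plus strand,
            # backward from the exon end on the minus strand.
            if strand == "+":
                return start + remaining - 1
            return end - remaining + 1
        remaining -= length  # skip this exon and continue into the next

    raise ValueError("cDNA position lies beyond the coding sequence")


# Toy gene with two 10-base coding exons (made-up coordinates).
plus_exons = [(101, 110), (201, 210)]
minus_exons = [(201, 210), (101, 110)]  # transcript order on the minus strand

print(cdna_to_genomic(11, plus_exons, "+"))   # first base of the second exon
print(cdna_to_genomic(11, minus_exons, "-"))  # same, read on the minus strand
```

Once an allele-defining position has a genomic coordinate, the base called at that position in WGS data can be compared against a genotype-to-phenotype table. Real transcripts also require handling of untranslated regions, intronic positions, and reference build differences, none of which this sketch attempts.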
