Greater morbidity and mortality rates of Covid-19 in minority populations have highlighted continuing health disparities in the US and UK (1,2). Principles and values have been promoted to guide genomic research, recognising as an ethical imperative, that advances should benefit all populations equitably (3-5). Genomics research was initiated by high income countries using samples mainly from individuals of European ancestry. Approximately 16% of the world’s population is accounted for by individuals of European ancestry, but they disproportionately comprise approximately 80% of all genomic study participants (6,7). This imbalance could exacerbate existing health disparities among diverse populations.
Large-scale genome-wide association studies (GWAS), mainly using data from populations of European ancestry, enabled the identification of genetic variants for many common diseases (8,9). So far, over 55,000 unique genetic regions associated with approximately 5000 diseases/traits have been identified (10). These studies have also led to fine-mapping of potentially causal variants, with insights into mechanisms of diseases such as type 2 diabetes (11).
However, it is becoming increasingly apparent that many of these findings may not be useful for populations of non-European ancestry. As early as 2016, misclassifications of variants were found in diverse populations (12). In a study of patients undergoing genetic testing for hypertrophic cardiomyopathy (HCM), benign genetic variants, previously categorised as pathogenic for HCM in individuals of European ancestry, were found in healthy individuals, all of whom were of African or unspecified ancestry (12). The study also determined that the inclusion of small numbers of African Americans in control groups would have prevented these misdiagnoses.
Most common diseases are thought to involve many genes. Polygenic risk scores (PRSs) were developed based on GWAS data in order to identify individuals at increased risk for a range of diseases (13-15). These scores are starting to allow the identification of individuals who could benefit from preventative interventions (16). However, in a recent analysis of the usage and performance of studies that generated PRSs, 67% of the studies used samples exclusively from populations of European ancestry (17). The predictive performances of the PRSs were significantly lower in non-European populations, underlining the need to include diverse populations in GWAS in order to improve the performance of PRSs in those populations.
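The idea behind a PRS can be illustrated with a minimal sketch, assuming the common additive form in which each variant's effect-allele count (0, 1 or 2) is multiplied by an effect size estimated in a GWAS and the products are summed. The variant names, weights and genotype below are hypothetical and chosen only for illustration; they are not taken from the studies cited.

```python
def polygenic_risk_score(genotypes, effect_sizes):
    """Additive score: sum of effect-allele counts weighted by effect sizes.

    genotypes: dict mapping variant ID -> effect-allele count (0, 1 or 2).
    effect_sizes: dict mapping variant ID -> weight estimated in a GWAS.
    Variants missing from the genotype are treated as count 0 (an assumption).
    """
    return sum(weight * genotypes.get(variant, 0)
               for variant, weight in effect_sizes.items())


# Hypothetical weights, as if estimated in a single GWAS cohort.
effect_sizes = {"variant_a": 0.25, "variant_b": -0.5, "variant_c": 0.125}

# Hypothetical individual: two copies of a, none of b, one copy of c.
person = {"variant_a": 2, "variant_b": 0, "variant_c": 1}

score = polygenic_risk_score(person, effect_sizes)
print(score)  # 0.25*2 + (-0.5)*0 + 0.125*1 = 0.625
```

Because every weight in such a score is estimated in the GWAS cohort, the score depends on how well that cohort's variants and effect sizes carry over to the person being scored, which is consistent with the lower performance reported in non-European populations (17).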
In contrast, genetic studies of individuals of African ancestry, as early as 2005, have been shown to be beneficial for many populations (18,19). Novel variants of proprotein convertase subtilisin kexin 9 (PCSK9), found in individuals of African ancestry, were shown to reduce levels of low-density lipoproteins, which are known to increase risk of coronary heart disease (CHD) (18). This allowed the development of drugs to inhibit PCSK9 function (19). The benefits of these drugs are not limited to African populations, as they are effective for populations worldwide.
The diverse human populations all have a common African ancestry (9,19). The 1000 Genomes Project (20), in which whole genomes were sequenced from 2,504 individuals from 26 populations worldwide, confirmed that African populations have the greatest genetic diversity (6,9). Human genomic variation arose as humans diverged into population groups separated by ancestry, geography and other environmental factors (9). A recent deep-sequencing study of only 910 individuals of African ancestry found that the resulting pangenome contained approximately 10% more DNA (around 300 million nucleotides) than the current human reference genome (21). Another recent study sequencing 929 genomes from 54 populations worldwide discovered millions of novel variants (22). Thus, the paucity of non-European data used in genomic research greatly restricts the understanding of relationships between disease and genetics and the discovery of novel correlations.
Overall, recent studies suggest that genetic variants found in one population should not be assumed to be present in others (23). In order for GWAS to be more useful for non-European populations, inclusion of these populations is needed. Similarly, generation of improved PRSs will require the use of data from populations of similar ancestry. Ongoing efforts to sequence genomes of diverse populations are yielding a more comprehensive understanding of human genomic variation and the discovery of additional novel variants (20-22). The Covid-19 pandemic has highlighted that historically underrepresented populations are typically less trusting of health and research systems and therefore less willing to participate in medical studies (3,4). Therefore, as well as including diverse populations in genomic research, a more diverse workforce is needed to encourage involvement of minority populations (3,4). Further education of professionals and the general public about the phenotypic characteristics of disease in diverse populations will also ultimately lead to improved disease diagnosis, genotype-phenotype correlations and medical interventions.
1. Evans, Michele K. “Health Equity – Are We Finally on the Edge of a New Frontier?.” The New England journal of medicine vol. 383,11 (2020): 997-999
2. Johnson-Mann, Crystal et al. “COVID-19 pandemic highlights racial health inequities.” The lancet. Diabetes & endocrinology vol. 8,8 (2020): 663-664
3. Green, Eric D et al. “Strategic vision for improving human health at The Forefront of Genomics.” Nature vol. 586,7831 (2020): 683-692
4. Her Majesty’s Government. “Genome UK: the future of healthcare.” (2020) https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/920378/Genome_UK_-_the_future_of_healthcare.pdf
5. Her Majesty’s Government. “Guidance: The NHS Constitution for England.” (2021) https://www.gov.uk/government/publications/the-nhs-constitution-for-england/the-nhs-constitution-for-england
6. “Genetics for all.” Nat Genet, vol. 51, 579 (2019)
7. Popejoy, Alice B, and Stephanie M Fullerton. “Genomics is failing on diversity.” Nature, vol. 538,7624 (2016): 161-164
8. Rosenberg, Noah A et al. “Genome-wide association studies in diverse populations.” Nature reviews. Genetics, vol. 11,5 (2010): 356-66
9. Rotimi, Charles N, and Adebowale A Adeyemo. “From one human genome to a complex tapestry of ancestry.” Nature, vol. 590,7845 (2021): 220-221
10. Loos, Ruth J F. “15 years of genome-wide association studies and no signs of slowing down.” Nature communications, vol. 11,1 5900, (2020)
11. Mahajan, Anubha et al. “Fine-mapping type 2 diabetes loci to single-variant resolution using high-density imputation and islet-specific epigenome maps.” Nature genetics, vol. 50,11 (2018): 1505-1513
12. Manrai, Arjun K et al. “Genetic Misdiagnoses and the Potential for Health Disparities.” The New England journal of medicine, vol. 375,7 (2016): 655-65
13. Khera, Amit V et al. “Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations.” Nature genetics, vol. 50,9 (2018): 1219-1224
14. Mavaddat, Nasim et al. “Polygenic Risk Scores for Prediction of Breast Cancer and Breast Cancer Subtypes.” American journal of human genetics, vol. 104,1 (2019): 21-34.
15. Musliner, Katherine L et al. “Association of Polygenic Liabilities for Major Depression, Bipolar Disorder, and Schizophrenia With Risk for Depression in the Danish Population.” JAMA psychiatry, vol. 76,5 (2019): 516-525
16. Aragam, Krishna G et al. “Limitations of Contemporary Guidelines for Managing Patients at High Genetic Risk of Coronary Artery Disease.” Journal of the American College of Cardiology, vol. 75,22 (2020): 2769-2780
17. Duncan, L et al. “Analysis of polygenic risk score usage and performance in diverse human populations.” Nature communications, vol. 10,1 3328 (2019)
18. Cohen, Jonathan et al. “Low LDL cholesterol in individuals of African descent resulting from frequent nonsense mutations in PCSK9.” Nature genetics, vol. 37,2 (2005): 161-5
19. Bentley, Amy R et al. “Evaluating the promise of inclusion of African ancestry populations in genomics.” NPJ genomic medicine, vol. 5 5 (2020)
20. 1000 Genomes Project Consortium et al. “A global reference for human genetic variation.” Nature, vol. 526,7571 (2015): 68-74
21. Sherman, Rachel M et al. “Assembly of a pan-genome from deep sequencing of 910 humans of African descent.” Nature genetics, vol. 51,1 (2019): 30-35
22. Bergström, Anders et al. “Insights into human genetic variation and population history from 929 diverse genomes.” Science (New York, N.Y.) vol. 367,6484 (2020)
23. Kuchenbaecker, Karoline et al. “The transferability of lipid loci across African, Asian and European cohorts.” Nature communications vol. 10,1 4330 (2019)
Genomic Research on Diverse Populations Promotes Human Health
From each fingerprint that is uniquely individual, to the many diseases that so devastatingly grip human lives, DNA is the genetic blueprint that codes for these human conditions. Nearly two decades ago, when the Human Genome Project came to an end, it could be said that this blueprint for human life was completed (1). While this accomplishment should not be diminished, the task of characterizing human genetic makeup must continue, as the genetic variation in different ethnic groups has not yet been studied thoroughly. Both large-scale international and smaller-scale regional studies have predominantly focused on people of European ancestry. These stark racial disparities unveil a painful truth: the understanding of genetics-related healthcare is incomplete and flawed. Inclusion of more diverse populations in genomic research will promote more accurate genetic risk assessment, improve diagnoses, and deliver better-tailored treatments.
The lack of diversity in genomic research makes it difficult to bring genetic risk assessment into clinical practice. In recent years, it has been discovered that the performance of a polygenic risk score (PRS), a measure used to estimate the likelihood that a disease will affect an individual, differs among ethnicities (2). Specifically, research using the UK Biobank has found that PRSs are especially effective for identifying individuals who are more susceptible to cardiovascular diseases, type 2 diabetes, certain cancers, and Alzheimer’s disease (3). It is possible to dream of a future in which patients can be given preventative treatment and offered lifestyle changes before being affected by a disease for which they are at risk. However, genetic variants in those of European ancestry differ from those in people of non-European ancestry. It is impossible to apply PRSs accurately to diverse populations without conducting well-designed research to ensure equality in prevention and opportunities (4). Given the promise of PRSs, new methods such as multiethnic PRSs are currently being tested; these combine older data from European ancestry with newer data from more diverse populations to better predict diseases (5).
The expression and behavior of diseases also differ among ethnicities. With less data on non-European individuals, disease identification can be inaccurate for minorities. For example, individuals of African ancestry are more likely to have hypertrophic cardiomyopathy (6). Since the symptoms of this disease vary greatly, identifying a pathogenic variant is incredibly important for diagnosis. Yet the variant that can be used to identify hypertrophic cardiomyopathy in those of African ancestry is extremely rare among white populations, on whom the studies were first conducted (6). Initially, this resulted in the variant being classified as benign in individuals of African ancestry, among whom it is more common, leading to many instances of misdiagnosis (6). Therefore, it is crucial to perform genomic research among diverse populations so that all patients have an equitable chance of accurate diagnosis.
Additionally, every person’s unique genetic makeup plays a role in how that patient responds to treatment. This information can often be used to guide clinical decisions through an approach called precision medicine (7). The approach can only be effective if the patient can be compared against a large database, which is unavailable for non-European individuals, further widening racial disparities (7). For instance, it was realized that dosing of warfarin, a widely used oral anticoagulant, was inaccurate for individuals of African descent (8). The VKORC1 and CYP2C9 genes were found to contribute to dose variability, but the genotype-guided dosing algorithms were developed with African Americans severely under-represented (8). Only later was a SNP in the CYP2C gene cluster found to have a relevant effect on warfarin dosage in African Americans, which led to a new dosage recommendation for those with this genotype (8).
The importance of diversity in genetic research is becoming increasingly obvious. As genome-wide association studies (GWAS) and projects such as the Population Architecture Using Genomics and Epidemiology (PAGE) study continue to expand the genetic information available about those of non-European ancestry, prevention, diagnosis and treatment of diseases will become more accurate and efficient (9). Although data from European ancestry can be relied on while more diverse studies are being performed, it is crucial that this information is re-evaluated against data from minority groups. Excitingly, medicine is taking a new step towards a future in which all ethnicities can be equally accounted for, as important new genetic information is found each day through diversifying studies.
1. National Human Genome Research Institute. (2021). The Human Genome Project. Genome.gov. https://www.genome.gov/human-genome-project.
2. Duncan, L., Shen, H., Gelaye, B. et al. (2019). Analysis of polygenic risk score usage and performance in diverse human populations. Nature Communications, 10(1). https://doi.org/10.1038/s41467-019-11112-0
3. Lewis, C. M., Vassos, E. (2020). Polygenic risk scores: from research tools to clinical instruments. Genome Medicine, 12(1). https://doi.org/10.1186/s13073-020-00742-5
4. Gurdasani, D., Barroso, I., Zeggini, E. et al. (2019). Genomics of disease risk in globally diverse populations. Nature Reviews Genetics, 20(9), 520-535. https://doi.org/10.1038/s41576-019-0144-0
5. Márquez-Luna, C., Loh, P.-R., Price, A. L. (2017). Multiethnic polygenic risk scores improve risk prediction in diverse populations. Genetic Epidemiology, 41(8), 811-823. https://doi.org/10.1002/gepi.22083
6. Bentley, A. R., Callier, S., Rotimi, C. N. (2017). Diversity and inclusion in genomic research: why the uneven progress? Journal of Community Genetics, 8(4), 255-266. https://doi.org/10.1007/s12687-017-0316-6
7. Wojcik, G.L., Graff, M., Nishimura, K.K. et al. (2019). Genetic analyses of diverse populations improves discovery for complex traits. Nature, 570(7762), 514–518 https://doi.org/10.1038/s41586-019-1310-4
8. Dotzert, M. (2020). Why greater diversity is needed in genomic research? Clinical Lab Manager. https://www.clinicallabmanager.com/trends/genomics/why-greater-diversity-is-needed-in-genomic-research-23704
9. Sirugo, G., Williams, S. M., Tishkoff, S. A. (2019). The missing diversity in human genetic studies. Cell, 177(4), 1080. https://doi.org/10.1016/j.cell.2019.04.032
The Future of Genomic Research: An Ethnically Inclusive Approach
Tens of thousands of years ago, some early humans migrated out of Africa and spread into other areas of the world. Genetic variants began to develop between populations in different geographic locations, and humans diverged into various ethnic groups (8). Today, while all humans are almost genetically identical, an important 0.1% of DNA differs between individuals, accounting for our unique personal traits and ancestry (13). In recent years, medical research has taken an increasingly genetically focused approach to make groundbreaking advances. About 80% of the genomic data collected for such research are from individuals of European ancestry. Sourcing genomic data from more diverse populations, however, could yield greater advances in genetic testing, treatments, and research.
More Accurate Genetic Testing
The use of a diverse pool of genomic data could result in better genetic testing, with a broader range of tests to more accurately predict and confirm diseases in certain populations. Although ethnic groups do not have a fixed set of genes specific to their population, genetic variants and their level of association with disease risk sometimes vary between ethnic groups (4). In a series of case-control studies, researchers found that a genetic variant (the APOE ε4 allele) was strongly associated with Alzheimer’s disease (AD) in those of European ancestry (1). The association of this same variant with AD, however, was nearly absent among those of African ancestry (1). Instead, other AD-associated variants, such as those within the GPC6 and VRK3 genes, were found in individuals of African ancestry (3). Thus, genomic findings from predominantly one ethnic group cannot always be extrapolated to other groups, as genetic testing may inaccurately predict or confirm disease, or fail to do so altogether.
More Effective Treatments
Ethnically diverse groups in genomic studies could also further develop precision medicine. This is an approach that takes the variability in genes into account, especially when the response to drugs differs between populations (12). For instance, among individuals with hypertension, those of European ancestry tend to have a better response to β-blocker drugs than those of African ancestry (9). This is because those of African ancestry are about twice as likely to have polymorphisms in the gene coding for the β1-adrenergic receptor, resulting in their decreased response to β-blockers (10). With this knowledge, those of African ancestry are typically given other drugs to treat hypertension, such as diuretics (11). While individuals should not be treated based primarily on race (drug response is also affected by lifestyle, environment, and age), the use of genomic data from more diverse populations could provide pharmacogenetics with additional considerations for tailoring more effective treatments to different populations (6).
More Comprehensive and Rewarding Research
The inclusion of more diverse populations in genomic studies would not only serve individual groups, but also advance overall genetic research. For example, mutations in the PCSK9 gene that lower LDL cholesterol were discovered in research participants of African ancestry (5). These mutations are rare in those of European ancestry and would have been less likely to be discovered without the research on those of African ancestry (5). As a result of this research, a breakthrough class of drugs, PCSK9 inhibitors, was developed to treat high cholesterol and can be administered across populations, regardless of ethnicity (7). Hence, only by exploring genomic data from diverse populations can researchers uncover the full potential of the human genome and make more comprehensive discoveries that advance healthcare for all.
Conclusion: Challenges and Opportunities
A better approach to ensuring the representation of ethnic groups in genomic studies is to gather data population by population. Researchers could conduct the same study for each ancestry, such as Ethiopians, Italians, Colombians, Indonesians, Native Americans, and so on, then compare results between groups. Although this seems to be a simple modification, reluctance to adopt this approach has likely been due to the increased cost and time needed to repeat research for different groups, the majority-white populations readily accessible to the main countries funding genetic research, and possible inaccuracies in self-reported ancestry, especially among individuals of mixed ethnicity (2). Furthermore, genomic data from those of European ancestry are still generally useful for other populations.
Nonetheless, the potential benefits of ethnically diversified data-gathering far outweigh these considerations and could transcend the limitations of unintended ethnic bias in genomic research, eliminate health disparities, and further human advancement for all.
1. Anderson, N. B. (2004). Ethnic Differences in Dementia and Alzheimer’s Disease. Retrieved from https://www.ncbi.nlm.nih.gov/books/NBK25535/
2. Best Regions for Genetics Research: INN. (2016, November 04). Retrieved from https://investingnews.com/daily/life-science-investing/genetics-investing/genetics-investing-regions-stocks/
3. Kunkle, B. W., et al. (2021, January 01). Novel Alzheimer Disease Risk Loci and Pathways in African American Individuals Using the African Genome Resources Panel. Retrieved from https://jamanetwork.com/journals/jamaneurology/article-abstract/2771828
4. Chou, V. (2019, February 27). How Science and Genetics are Reshaping the Race Debate of the 21st Century. Retrieved from http://sitn.hms.harvard.edu/flash/2017/science-genetics-reshaping-race-debate-21st-century/
5. Cohen, J., Pertsemlidis, A., Kotowski, I.K., Graham, R., Garcia, C.K., Hobbs, H.H. (2005). Low LDL cholesterol in individuals of African descent resulting from frequent nonsense mutations in PCSK9. Retrieved from https://pubmed.ncbi.nlm.nih.gov/15654334/
6. Drug Response. (n.d.). Retrieved from https://www.sciencedirect.com/topics/medicine-and-dentistry/drug-response#:~:text=Drug response can be impacted,events108 or treatment failure.
7. How Do PCSK9 Inhibitors Lower Cholesterol? (2020, May 27). Retrieved from https://www.webmd.com/cholesterol-management/pcsk9-inhibitors-treatment#:~:text=PCSK9 inhibitors are a new,prevent heart attacks or strokes.
8. Hunter, P. (2014, October). The genetics of human migrations: Our ancestors migration out of Africa has left traces in our genomes that explain how they adapted to new environments. Retrieved from https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4253842/
9. Johnson, J. A. (2008, September 23). Ethnic Differences in Cardiovascular Drug Response. Retrieved from https://www.ahajournals.org/doi/full/10.1161/CIRCULATIONAHA.107.704023
10. Kurnik, D., Li, C., Sofowora, G. G., Friedman, E. A., Muszkat, M., Xie, H., Stein, C. M. (2008, October). Beta-1-adrenoceptor genetic variants and ethnicity independently affect response to beta-blockade. Retrieved from https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2757009/
11. Orrange, S. (2018, November 09). Choosing Your Blood Pressure Medication: What Type Is Best for You? – GoodRx. Retrieved from https://www.goodrx.com/blog/choosing-your-blood-pressure-medication-what-type-is-best/#:~:text=Calcium channel blockers, namely amlodipine,pressure, specifically in African Americans.
12. What is precision medicine?: MedlinePlus Genetics. (2020, September 22). Retrieved from https://medlineplus.gov/genetics/understanding/precisionmedicine/definition/
13. Whole Genome Association Studies. (2011, July 15). Retrieved from https://www.genome.gov/17516714/2006-release-about-whole-genome-association-studies
The rapid development of genetic sequencing technology has contributed to the growing field of genomic research (10). Millions of dollars in government funds have been allocated towards genome-wide association studies (GWAS) (10). Genome-wide association studies are genetic research-based initiatives that scan the genomes from different people to look for genetic markers that could predict a disease’s presence (8). While genomic technology has progressed expeditiously over the past two decades, diversity in these studies’ participant pool has not (6,7,8). Most genetic data used in GWAS are derived from people with European ancestry (6,7,8). The underrepresentation of minority groups in genomic research has led to a lack of inclusivity in this field of study that provides disproportionate health benefits to people of European descent (7,8). Although genomic findings from a genetically similar participant pool can help people from different ancestral backgrounds, creating a more inclusive environment in genomic research would prevent over-generalizations of gene criteria for diseases from being made and aid in future scientific research.
Currently, approximately 80% of genetic research data comes from participants of European descent (7). Although humans are 99.9% genetically identical to one another, the 0.1% variance in human genetic material provides scientists with valuable information on the causes of diseases in specific demographics and the appropriate course of action to treat diseases in these populations (8). The lack of genetically diverse data in these studies contributes to the high rates of inaccurate diagnostic results that many minority groups receive (6,7). For example, the identification rates of pathogenic genetic variants for nonsyndromic hearing loss (NSHL) are consistently lower in people of African ancestry than in people of European ancestry (5). Among some European/Asian populations, up to 50% of people with severe NSHL have pathogenic variants in their GJB2 gene (9). However, these pathogenic GJB2 variants are extremely rare in people of African descent (9). The paucity of data relating to NSHL in African populations limits geneticists’ ability to diagnose this condition in these populations, and results in valuable treatment time being lost (5). Similarly, a recent research study showed that a genetic test for cardiomyopathy, a hereditary disease of the heart muscle, yielded six false-positive results in 230 people of African descent (5). This was because the variants tested were more common in healthy patients with African ancestry than in healthy patients with European ancestry (5). These examples provide a glimpse into the ramifications of using primarily European-based genomic findings to create a standard criterion for the testing and diagnosing of hereditary diseases in diverse populations.
Looking at the current global health crisis, it is apparent that the neglect of genetically diverse participant pools is not just a pattern of the past. The demographics for the late-stage Pfizer-BioNTech vaccine clinical trials show that of 40,277 participants, 81.9% were white, 9.8% were black, 4.4% were Asian, 0.6% were American Indian/Alaska Native, and 0.2% were Native Hawaiian or Other Pacific Islander (4). These statistics show an apparent racial bias towards white participants and provide this demographic with an accuracy of clinical vaccine results that cannot be proportionately guaranteed in other populations because of their minute representation in the trials. The lack of diversity among these participants is especially alarming when one takes into account that recent studies suggest that coronavirus vaccines’ efficacy may be lower in racial minorities (1,5). This issue can also be seen on a global scale. Globally, healthcare systems have failed to invest in coronavirus vaccine trials in Africa (2). This failure to invest in trials “of the most genetically diverse continent on earth, therefore, threatens the generalizability of trial findings” (2). Although genomic findings based largely on European ancestral data can provide broad predictions of how humans react to a vaccine, genetic variants that are common in particular populations may elicit different immune responses from one global population to the next (2,3). The participant pool must account for these variants, so that over-generalizations about common reactions to certain vaccines can be avoided and the vaccine can maximize its health benefits.
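To make the trial proportions more concrete, the following minimal sketch converts the reported percentages into approximate head counts. The total and percentages are those quoted above (4); the resulting counts are rounded reconstructions from already-rounded percentages, so they are estimates for illustration rather than figures reported by the source.

```python
# Participant total and percentages as quoted in the text (4).
total_participants = 40277
reported_shares = {
    "White": 81.9,
    "Black": 9.8,
    "Asian": 4.4,
    "American Indian/Alaska Native": 0.6,
    "Native Hawaiian or Other Pacific Islander": 0.2,
}

# Approximate head counts; rounding means these are estimates only.
approx_counts = {
    group: round(total_participants * pct / 100)
    for group, pct in reported_shares.items()
}

# Share of participants not covered by the five listed groups.
unlisted_share = round(100 - sum(reported_shares.values()), 1)

for group, count in approx_counts.items():
    print(f"{group}: about {count}")
print(f"Not covered by the listed groups: {unlisted_share}%")
```

On these estimates, the two smallest listed groups together account for only a few hundred participants, which is the sense in which their representation is described as minute.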
For genomic studies to continue to grow, scientists must expand their participant selections to ensure the inclusivity of all. Not only would this allow for more data to be present on how to diagnose certain conditions in people of different ancestral backgrounds, but it would also ensure that future genetic research and diagnostic tests provide the same standard of care for everyone.
1. Balfour, H. (2020, December 3). COVID vaccines may be less effective for racial minorities, find scientists. European Pharmaceutical Review. https://www.europeanpharmaceuticalreview.com/news/135380/covid-19-vaccines-maybe-less-effective-for-racial-minorities-suggest-computer-scientists/.
2. Boms, O. M. K. W., Korte, M., & Fawzi, W. (2021, January 20). A Case For More (And More Ethical) COVID-19 Vaccine Trials In Africa: Health Affairs Blog. Health Affairs. https://www.healthaffairs.org/do/10.1377/hblog20210112.870609/full/.
3. Idoko, Olubukola T et al. “Impact, challenges, and future projections of vaccine trials in Africa.” The American journal of tropical medicine and hygiene vol. 88,3 (2013): 414-9. doi:10.4269/ajtmh.12-0576
4. Kates, J., & Artiga, S. (2021, January 26). Racial Diversity within COVID-19 Vaccine Clinical Trials: Key Questions and Answers. KFF. https://www.kff.org/racial-equity-and-health-policy/issue-brief/racial-diversity-within-covid-19-vaccine-clinical-trials-key-questions-and-answers/.
5. Kent, J. (2020, December 1). AI Shows COVID-19 Vaccines May Be Less Effective in Racial Minorities. HealthITAnalytics. https://healthitanalytics.com/news/ai-shows-covid-19-vaccines-may-be-less-effective-in-racial-minorities.
6. Lee, C. (2019, September 1). Opinion: Greater Diversity Is Needed in Human Genomic Data. The Scientist Magazine®. https://www.the-scientist.com/critic-at-large/diversifyour-human-genomic-data-66308.
7. Mapes, D. (2019, June 19). Lack of diversity in genetic research a problem. Fred Hutch.
8. Popejoy, A. B., & Fullerton, S. M. (2016). Genomics is failing on diversity. Nature, 538(7624), 161–164. https://doi.org/10.1038/538161a
9. Shearer, A. E., Hildebrand, M. S., & Smith, R. J. H. (2017, July 27). Hereditary Hearing Loss and Deafness Overview. GeneReviews® [Internet]. https://www.ncbi.nlm.nih.gov/books/NBK1434/.
10. U.S. Department of Health and Human Services. (2019, August 21). NIH funds genetic counseling resource ahead of million-person sequencing effort. National Institutes of Health. https://www.nih.gov/news-events/news-releases/nih-funds-genetic-counselingresource-ahead-million-person-sequencing-effort.
From the Human Genome Project to direct-to-consumer genetic testing, the field of genomics is rapidly evolving. However, amidst such progress, the diversity of participants in genomic research lags significantly behind. Despite making up 16% of the global population, around 78% of the individuals studied in genome-wide association studies (GWAS) are of European descent (6,9). Although humans share a large portion of DNA, the inclusion of more diverse populations in genomic studies will maximize discovery in future research efforts and improve clinical practices on a global scale.
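The scale of this imbalance can be expressed as a simple representation ratio: a group's share of study participants divided by its share of the global population. The sketch below uses the two figures quoted above (78% and 16%); the function name is hypothetical, and the figure for all other populations is derived by subtraction, which is an assumption of this illustration rather than a number from the cited sources.

```python
def representation_ratio(share_of_participants, share_of_population):
    """Ratio above 1 means over-representation; below 1, under-representation."""
    return share_of_participants / share_of_population


# European-descent individuals: 78% of GWAS participants, 16% of population.
european_ratio = representation_ratio(0.78, 0.16)

# Everyone else, derived by subtraction (an assumption of this sketch).
other_ratio = representation_ratio(1 - 0.78, 1 - 0.16)

print(f"European descent: {european_ratio:.2f}x")  # roughly 4.9x
print(f"All other populations: {other_ratio:.2f}x")  # roughly 0.26x
```

On these figures, individuals of European descent appear in GWAS at nearly five times their share of the global population, while all other populations combined appear at roughly a quarter of theirs.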
The representation of populations in current genomic studies is at odds with the purposes of research. A single population contains only a subset of overall genetic data; therefore, the use of homogeneous participants in a genomic study limits the robustness of the data obtained. For uncovering the vast amount of genetic data underlying all humans, a single population will not suffice. Due to migrations and environmental adaptations, populations may have different arrangements of genetic variants that contribute to disease. This phenomenon is evidenced in allelic heterogeneity. For example, while ΔF508 in the CFTR gene accounts for over 70% of cystic fibrosis cases in Europeans, it accounts for only 29.4% of cystic fibrosis cases in the African Diaspora (6,7). Thus, information obtained from European populations may not always be applicable to other populations.
By exclusively studying Europeans, genomic studies are bound to miss variants that have a low frequency or are absent in European populations. For instance, rare nonsense variants in PCSK9, a gene responsible for regulating cholesterol levels, were discovered by studying individuals of African descent; Europeans had too low a frequency of these variants for analysis (1,6). These nonsense variants led to the development of PCSK9 inhibitors, a class of drugs that lowers cholesterol levels and the risk of cardiovascular disease (1,6). PCSK9 inhibitors benefit individuals in any population, even in Europe, where the low frequency of these variants precluded their discovery. By disregarding genetic data from diverse populations, researchers risk missing out on information that could benefit all.
The human reference genome, GRCh38, is invaluable for genomic research. By comparing individual genomes to the reference, researchers can identify single nucleotide polymorphisms in an individual or population that may contribute to disease. However, over 300 million base pairs of DNA from African American participants ‒ approximately 10% of the human genome ‒ were found missing in GRCh38 (8). Without a complete database, critical variants are missed, inhibiting potential genomic findings. In terms of future research, a more representative reference genome will allow researchers to find disease-variants and their genomic locations more accurately (2).
The lack of diversity in genomic studies limits the usefulness of data and even induces errors when diagnosing diseases. Five genetic variants linked to hypertrophic cardiomyopathy were mistakenly classified as pathogenic when they were in fact benign (4). Based on this error, various African Americans were misdiagnosed. Had initial research cohorts included African American participants, these misclassifications would likely have been prevented (4). In addition to diagnoses, many risk assessments such as polygenic risk scores are based on GWAS conducted predominantly in Europeans. Because of linkage disequilibrium and heterogeneity, these assessments are less accurate across non-European populations (6). Inaccurate risk assessments and misdiagnosis may exacerbate illness and compromise adequate treatment. Accounting for variation in disease-variants across diverse populations will facilitate appropriate clinical intervention.
Another clinical application that utilizes information from genomic studies is drug therapy. Differences in pharmacogenetic factors, such as genetic polymorphisms in metabolic pathways, account for variation in drug efficacy, dosage requirements, and drug toxicity across populations (3,5). Such information is critical for guiding clinical decisions, including drug type and dosage, but is absent for most diverse populations. Dosage algorithms for various drugs, such as warfarin, are based on research derived from European populations. Not surprisingly, these algorithms do not translate into safe and effective treatment across non-European populations (1,3). To provide comprehensive and safe guidelines for drug usage for all, genetic variants influencing drug metabolism across diverse populations must be identified. Doing so will facilitate precision medicine, allowing healthcare providers to personalize drug therapy and optimize prognosis.
As long as genomic studies continue to neglect genetic variation across populations, genomic research will never reach its full potential. Ultimately, including diverse populations in genomic studies will improve health by yielding drugs, diagnostic tools, and guidelines applicable to more people. In the end, the human genome is still a puzzle. To solve it, and to better understand the role of genetics in health and disease, diverse populations must be included in genomic studies.
1. Bentley, A. R., Callier, S., & Rotimi, C. N. (2017). Diversity and inclusion in genomic research: why the uneven progress?. Journal of community genetics, 8(4), 255–266. https://doi.org/10.1007/s12687-017-0316-6
2. Ganguly, P. (2019, September 24). Advancing the reference sequence of the human genome. https://www.genome.gov/news/news-release/NIH-funds-centers-for-advancing-sequence-of-human-genome-reference
3. Hernandez, W., Gamazon, E. R., Aquino-Michaels, K., Patel, S., O’brien, T. J., Harralson, A. F., . . . Perera, M. A. (2013). Ethnicity-specific pharmacogenetics: The case of warfarin in African Americans. The Pharmacogenomics Journal, 14(3), 223-228. doi:10.1038/tpj.2013.34
4. Manrai, Arjun K.; Funke, Birgit H.; Rehm, Heidi L.; Olesen, Morten S.; Maron, Bradley A.; Szolovits, Peter; Margulies, David M.; Loscalzo, Joseph; Kohane, Isaac S. (2016). Genetic Misdiagnoses and the Potential for Health Disparities. New England Journal of Medicine, 375(7), 655–665. doi:10.1056/NEJMsa1507092
5. Ramamoorthy, A., Pacanowski, M. A., Bull, J., & Zhang, L. (2015). Racial/ethnic differences in drug disposition and response: review of recently approved drugs. Clinical pharmacology and therapeutics, 97(3), 263–273. https://doi.org/10.1002/cpt.61
6. Sirugo, Giorgio; Williams, Scott M.; Tishkoff, Sarah A. (2019). The Missing Diversity in Human Genetic Studies. Cell, 177(1), 26–31. doi:10.1016/j.cell.2019.02.048
7. Stewart, C., & Pepper, M. S. (2017). Cystic Fibrosis in the African Diaspora. Annals of the American Thoracic Society, 14(1), 1–7. https://doi.org/10.1513/AnnalsATS.201606-481FR
8. Widely used reference for the human genome is MISSING 300 million bits of DNA. (2018, November 19). https://www.hopkinsmedicine.org/news/newsroom/news-releases/widely-used-reference-for-the-human-genome-is-missing-300-million-bits-of-dna
9. Wu, K. J. (2019, March 21). Lack of diversity in genetic research could be costing us our health. https://www.pbs.org/wgbh/nova/article/lack-diversity-genetic-research-could-be-costing-us-our-health/
You pick up your genetic test results and find out that you’ve been diagnosed with a genetic disease. But is the diagnosis correct? You may be a victim of an issue that has long plagued the genetics community and continues to do so today: disparities in genetic testing. While some ethnic groups have readily volunteered to participate in studies, others are heavily underrepresented, leaving a gap in information. But what difference can this imbalance make? The lack of participation among minority ethnicities may have unintended consequences down the line; incorporating varied ethnicities can therefore fill in the gaps of the greater genetic puzzle.
Individuals of European background are the most consistent participants in genomic studies, but findings from these studies do not account for variation among other ethnic groups and therefore may not be applicable to them. This shortage of data can be attributed in part to lower participation among minorities, reflecting hesitancy about research procedures and privacy. In 2018, approximately 80% of participants in genomic studies were of European descent, with non-European and non-Asian groups constituting only 17%. This trend has continued in recent years, as seen in the hesitancy of African American populations to participate in genetic studies. Surveys indicated that part of the reason was distrust and concern that genetic data would be used for unknown purposes, which is unfortunate considering that this information could benefit others.
Because ancestry can influence how patients respond to medication, gaps in the data for some ethnic groups can lead to improper drug administration. For example, the CYP2C9 gene, which encodes an enzyme involved in drug metabolism, is an important factor when developing personalized drug therapy for patients. Its alleles are linked to different phenotypes, which determine how an individual responds to a drug. In a 2014 GWAS, other ethnic groups (e.g. Asians, African Americans, Latinos) were found to differ in their dose requirements for warfarin, which indicated that further studies must be performed to determine the optimal dosage needed to achieve the desired effect. The difference was reflected in allele frequencies: the CYP2C9*2 and *3 alleles were found to be common amongst individuals with European background, while the CYP2C9*8 and *11 alleles were found to be prevalent in African Americans. Because these alleles differ in frequency between populations, the dose appropriate for an African American patient may differ from the dose predicted from European data. If these genetic differences are not accounted for, an inappropriate dose will be given, limiting the drug’s effectiveness.
Another consequence of ethnic exclusion is reduced accuracy in predicting disease onset and treatment response [7,10]. Polygenic risk scores (PRS) are used to estimate the risk of disease, but carrier frequencies vary across ethnic groups. Higher frequencies of certain diseases (e.g. diabetes, schizophrenia, cystic fibrosis) in certain ethnic groups have prompted studies to determine proper genetic risk [2,3,8]. However, PRS have limitations, because they are calculated from an individual’s risk alleles together with data from GWAS. Since GWAS are primarily based on individuals of European ancestry, the resulting scores are less accurate for many individuals of other ancestries and cannot give them a reliable estimate of disease risk [2,10]. In a 2018 study by Curtis that evaluated the PRS for schizophrenia, scores differed by a factor of 10 between individuals of African and European ancestry.
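The calculation described above, combining an individual’s risk alleles with GWAS data, is commonly expressed as a weighted sum. The following is only a minimal sketch under that assumption: the function name, variant identifiers, and effect weights are hypothetical placeholders and are not taken from any study cited here.

```python
# Minimal sketch of a polygenic risk score (PRS) as a weighted sum.
# All variant IDs and effect weights below are hypothetical placeholders,
# not values from any published GWAS.

def polygenic_risk_score(dosages, weights):
    """Sum, over variants, of (GWAS effect weight x risk-allele count).

    dosages: dict mapping variant ID -> risk alleles carried (0, 1 or 2)
    weights: dict mapping variant ID -> per-allele effect size from a GWAS
    Variants missing from `dosages` are counted as 0 risk alleles.
    """
    return sum(w * dosages.get(variant, 0) for variant, w in weights.items())


# Hypothetical effect weights, as they might be estimated in one GWAS cohort.
gwas_weights = {"variant_A": 0.2, "variant_B": 0.5, "variant_C": -0.1}

# Hypothetical genotype of one individual.
person = {"variant_A": 2, "variant_B": 1, "variant_C": 0}

score = polygenic_risk_score(person, gwas_weights)
print(round(score, 3))  # prints 0.9
```

Because every weight comes from the GWAS cohort, a score built on weights estimated in European-ancestry participants carries that cohort’s allele frequencies and effect estimates into the risk estimate for everyone else.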
In an era when modern medicine is advancing rapidly, it is crucial to keep building on our expanding knowledge so that progress does not stall and we can continue to explore what makes us who we are. To see the bigger picture, all ethnic groups must be included. Although genetic variants vary even within each ethnicity, it is crucial that individuals step up for the greater benefit of the group, as their genetic information is a valuable contribution. Increasing diversity must be a collaborative effort between researchers and participants. Trust in genetic studies has been low and wavering, but to repair that trust and bridge the gap, individuals must be convinced that they can make an impact.
[1] Sirugo G., et al. “The Missing Diversity in Human Genetic Studies” Cell 177(1) (2019): 26-31.
[2] Curtis D. “Polygenic Risk Score for Schizophrenia is More Strongly Associated with Ancestry Than With Schizophrenia” Psychiatric Genetics 28(5) (2018): 85-89.
[3] Padoa, C., et al. “Cystic Fibrosis Carrier Frequencies in Populations of African Origin” Journal of Medical Genetics 36(1) (1999): 41–44.
[4] Abi-Rached L., et al. “Immune Diversity Sheds Light on Missing Variation in Worldwide Genetic Diversity Panels” PLoS ONE 13(10) (2018).
[5] Khera A.V., Chaffin M., et al. “Genome-wide Polygenic Scores for Common Diseases Identify Individuals With Risk Equivalent to Monogenic Mutations” Nature Genetics 50(9) (2018): 1219-1224.
[6] McPherson E. “Genetic Diagnosis and Testing in Clinical Practice” Clinical Medicine and Research 4(2) (2006): 123-9.
[7] Fine, M. et al. “The Role of Race and Genetics in Health Disparities Research” American Journal of Public Health 95(12) (2005): 2125–2128.
[8] Li, Y. et al. “Genetic Factors Associated With Cancer Racial Disparity – an Integrative Study Across Twenty‐one Cancer Types” Molecular Oncology 14(11) (2020): 2775-2786.
[9] Zhang F, et al. “Inconsistency in Race and Ethnic Classification in Pharmacogenetics Studies and its Potential Clinical Implications” Pharmacogenomics and Personalized Medicine 12 (2019): 107-123.
[10] Martin, A. et al. “Clinical use of Current Polygenic Risk Scores may Exacerbate Health Disparities” Nature Genetics 51 (2019): 584–59.
Ongoing developments in genetics and genomics have broadened humanity’s scientific knowledge, particularly in the health sciences. It is worth noting, however, that there is a significant flaw in current genetics and genomics that has considerable negative consequences for the applications relying on these disciplines. The flaw is that published genome-wide association studies (GWAS), which aim to shed light on the correlations between the genetic and phenotypic profiles of individuals and populations, employ participants of predominantly European descent (Popejoy & Fullerton, 2016; Sirugo et al., 2019).
It is therefore scientifically imperative to push for the inclusion of cohorts of diverse ancestries in GWAS in order to reduce the so-called “European bias” in genetics and genomics. Diversifying representation in GWAS would decrease bias and increase the accuracy and precision of insights into genetic architecture, genetically driven clinical diagnoses, and pharmacogenetics, to the benefit of healthcare for all.
GWAS in populations of diverse descent matter because they may provide new information about the genetic architectures of different ancestries and how they differ from one another. This is illustrated by a 2015 genome-wide scan of a population of Greenlandic Inuit ancestry (Fumagalli et al., 2015). The study found that a single nucleotide polymorphism (SNP) in a gene encoding a fatty-acid enzyme, associated with growth hormone activity and height, is shared with populations of European ancestry but had been overlooked by previous GWAS in Europeans because of its low allele frequency in those populations (Fumagalli et al., 2015; Popejoy & Fullerton, 2016). This implies that there are numerous similarities and differences in the genetic architectures of environmentally and ancestrally distinct individuals, which could be better understood by including diverse ancestral cohorts in GWAS.
Genetically-Driven Clinical Diagnoses
Genetic studies that rely solely on data from populations of a single descent fail to account for differences in human genetic architecture, which can hinder the advancement of healthcare. One consequence of underrepresentation in GWAS is clinical misdiagnosis. This is exemplified by misdiagnoses of hypertrophic cardiomyopathy in patients of African ancestry, which arose because existing data on the disease and its variants came from studies whose participants were predominantly of European ancestry (Bentley et al., 2017). Variants that are in fact benign had been classified as pathogenic, so patients of African ancestry who carried them were wrongly told that they had a disease-causing variant. Such errors translate into misguided clinical measures that may threaten the health and welfare of patients of African descent and of other underrepresented ancestries (Bentley et al., 2017; Manrai et al., 2016).
A major area of the health sciences that relies on GWAS is pharmacogenetics, which studies appropriate drug treatments for diseases with respect to the genetic profiles of patients. A well-documented example of how the lack of diversity in GWAS affects pharmacogenetics is the relationship between warfarin metabolism and patient ancestry. Warfarin dosing relies heavily on algorithms based on SNP data drawn predominantly from populations of European ancestry, so these algorithms serve patients of other descent less well (Johnson et al., 2017). This reliance on European-dominated genetic data poses considerable harm to patients of underrepresented ancestries who need warfarin treatment.
The lack of representation of populations of diverse ancestries in genome-wide association studies poses a significant concern as it translates to a lack of accurate and cohesive knowledge on the genetic architectures, genetic disease trends, and pharmacogenetic mechanisms of diverse populations of humans. The overrepresentation of a single ancestry particularly the European ancestry in GWAS translates to less inclusivity with regards to universal health data and applications. It is then a scientific imperative for the scientific community to strive to further promote equal representation of individuals of diverse ancestries in GWAS in order to secure data and knowledge that would be of great significance in advancing human health forward.
Bentley, A. R., Callier, S., & Rotimi, C. N. (2017). Diversity and inclusion in genomic research: why the uneven progress? Journal of Community Genetics, 8(4), 255–266. https://doi.org/10.1007/s12687-017-0316-6
Fumagalli, M., Moltke, I., Grarup, N., Racimo, F., Bjerregaard, P., Jørgensen, M. E., Korneliussen, T. S., Gerbault, P., Skotte, L., Linneberg, A., Christensen, C., Brandslund, I., Jørgensen, T., Huerta-Sánchez, E., Schmidt, E. B., Pedersen, O., Hansen, T., Albrechtsen, A., & Nielsen, R. (2015). Greenlandic Inuit show genetic signatures of diet and climate adaptation. In Science (Vol. 349, Issue 6254). American Association for the Advancement of Science. https://doi.org/10.1126/science.aab2319
Johnson, J. A., Caudle, K. E., Gong, L., Whirl-Carrillo, M., Stein, C. M., Scott, S. A., Lee, M. T., Gage, B. F., Kimmel, S. E., Perera, M. A., Anderson, J. L., Pirmohamed, M., Klein, T. E., Limdi, N. A., Cavallari, L. H., & Wadelius, M. (2017). Clinical Pharmacogenetics Implementation Consortium (CPIC) Guideline for Pharmacogenetics-Guided Warfarin Dosing: 2017 Update. Clinical Pharmacology and Therapeutics, 102(3), 397–404. https://doi.org/10.1002/cpt.668
Manrai, A. K., Funke, B. H., Rehm, H. L., Olesen, M. S., Maron, B. A., Szolovits, P., Margulies, D. M., Loscalzo, J., & Kohane, I. S. (2016). Genetic Misdiagnoses and the Potential for Health Disparities. New England Journal of Medicine, 375(7), 655–665. https://doi.org/10.1056/nejmsa1507092
Popejoy, A. B., & Fullerton, S. M. (2016). Genomics is failing on diversity. In Nature (Vol. 538, Issue 7624, pp. 161–164). Nature Publishing Group. https://doi.org/10.1038/538161a
Sirugo, G., Williams, S. M., & Tishkoff, S. A. (2019). The Missing Diversity in Human Genetic Studies. In Cell (Vol. 177, Issue 1, pp. 26–31). Cell Press. https://doi.org/10.1016/j.cell.2019.02.048
Prospects of Genetically-Diverse Research
Since the Human Genome Project’s conclusion, genomic research has surged in innovation. Genome-wide association studies, GWAS, examining the genetic variants associated with disease have led to new diagnosis and treatment methods. However, GWAS have historically failed to capture the full scope of genetic diversity. As of 2018, 78% of individuals included in GWAS were European, 10% were Asian, 2% were African, and 1% were Hispanic (Sirugo et al., 2019). These discrepancies mean that many GWAS findings are dangerously incomplete and cannot be applied to all ethnic groups. Genomic research must include diverse ethnoracial populations to adequately address the large variation in human genomes, tackle current disparities in disease diagnosis and treatment, and produce novel genetic discoveries.
Although the majority of genetic information is shared by all humans (99.9% of the genome), the relatively small amount of variation has vast effects in disease causation (National Human Genome Research Institute, 2018). Allele frequency differences and the presence of population-specific variants have arisen over time due to migration and adaptive selection, genetic drift with population bottlenecks, and other forces (Gurdasani et al., 2019). Genetic diversity has profound effects on GWAS findings and analysis of genetic risk factors. When studying complex diseases involving multiple genes or modifiers, Eurocentric GWAS may miss variants that are uncommon or absent in European populations. With these diseases, genetic diversity can lead to different genetic risk factors, disease severity levels, and treatment efficacies, as seen with cystic fibrosis. A disease common in Europeans but rare in African Americans, cystic fibrosis, CF, is impacted by allelic heterogeneity: many mutations in the cystic fibrosis transmembrane conductance regulator gene can lead to CF (Stewart & Pepper, 2016). Although more than 2,000 rare mutations in the gene exist, there is a lack of knowledge on the specific polymorphisms. Basing CF identification on European risk factors has resulted in underdiagnosis of African-descendants (Sirugo et al., 2019). Including a higher proportion of African-descendants in CF GWAS would improve the understanding of African-specific genetic risk factors and reduce the discrepancy in diagnosis accuracy. Continuing to neglect under-studied populations in GWAS will exacerbate inequalities surrounding precision and accuracy in disease diagnosis and treatment.
Inclusivity of under-studied populations in GWAS may also improve drug efficacy and safety, as is the case with warfarin. Dose requirements of the anticoagulant vary widely and the therapeutic range is narrow. Variation in dosage effect between individuals is determined by single nucleotide polymorphisms on several genes. Algorithms that analyze genotypes to administer appropriate dosages have been developed using primarily European-based GWAS. However, the influence of these genotypes on drug metabolism differs between ethnoracial populations due to allele frequency variation. For instance, the CYP2C9*2 variant is common and influential in Europeans but nearly nonexistent in Asians (Johnson et al., 2017). Therefore, the European-derived algorithms cannot determine effective and safe dosages for all ethnic groups, reducing treatment accessibility. By examining the genetic variants present in diverse populations, GWAS could create new algorithms that accurately determine drug dosages for all groups.
The inclusion of more genetic diversity in GWAS has the potential to produce innovative genomic research in the future. Examining the genomes of previously ignored populations can lead to discoveries of novel genetic variants or mechanisms underlying genetic expression. Interpretations of genetic tests, understanding of human evolution, and implementations of precision medicine may all be improved. Unfortunately, recruiting under-represented populations for research is time and resource-intensive. Additionally, individuals may be deterred from participating for socioeconomic or cultural reasons: loss of income due to time away from work, lack of transportation access, and language barriers (Hindorff et al., 2018). What’s more, some groups may be distrustful of biomedical research due to historical exploitation and abuse. These barriers may discourage researchers from recruiting under-represented populations, so funding agencies must promote inclusion by providing additional funding to researchers seeking to diversify GWAS (Hindorff et al., 2018). To encourage under-represented populations to participate in GWAS, community engagement and accessibility need to be emphasized. Collaboration must be developed by providing candid information and education on the importance of genomic research. Including researchers from the community can also establish trust and overcome language barriers. Incentives such as free transportation, medical treatment, and payments can address socioeconomic impediments and further encourage individuals to participate. Finally, to prevent malpractice and inappropriate treatment of study subjects, research proposals should be reviewed by local ethics committees (Sirugo et al., 2019). Together, these efforts can diversify the population representation in genomic studies to amend current disparities and pioneer new research.
Gurdasani, D., Barroso, I., Zeggini, E., & Sandhu, M. S. (2019). Genomics of disease risk in globally diverse populations. Nature reviews. Genetics, 20(9), 520–535. https://doi-org.liboff.ohsu.edu/10.1038/s41576-019-0144-0
Hindorff, L. A., Bonham, V. L., Brody, L. C., Ginoza, M., Hutter, C. M., Manolio, T. A., & Green, E. D. (2018). Prioritizing diversity in human genomics research. Nature reviews. Genetics, 19(3), 175–185. https://doi-org.liboff.ohsu.edu/10.1038/nrg.2017.89
Johnson, J. A., Caudle, K. E., Gong, L., Whirl-Carrillo, M., Stein, C. M., Scott, S. A., Lee, M. T., Gage, B. F., Kimmel, S. E., Perera, M. A., Anderson, J. L., Pirmohamed, M., Klein, T. E., Limdi, N. A., Cavallari, L. H., & Wadelius, M. (2017). Clinical Pharmacogenetics Implementation Consortium (CPIC) Guideline for Pharmacogenetics-Guided Warfarin Dosing: 2017 Update. Clinical pharmacology and therapeutics, 102(3), 397–404. https://doi-org.liboff.ohsu.edu/10.1002/cpt.668
National Human Genome Research Institute. (2018, September 7). Genetics vs. genomics fact sheet. Retrieved from https://www.genome.gov/about-genomics/fact-sheets/Genetics-vs-Genomics
Sirugo, G., Williams, S. M., & Tishkoff, S. A. (2019). The Missing Diversity in Human Genetic Studies. Cell, 177(1), 26–31. https://doi-org.liboff.ohsu.edu/10.1016/j.cell.2019.02.048
Stewart, C., & Pepper, M. S. (2016). Cystic fibrosis on the African continent. Genetics in medicine : official journal of the American College of Medical Genetics, 18(7), 653–662. https://doi-org.liboff.ohsu.edu/10.1038/gim.2015.157
Genome-Wide Association Studies (GWAS) involve scanning markers across the genomes of people to locate genetic variations associated with particular traits or diseases (1). GWAS are a method of first choice because they link genotype to phenotype by associating chromosomal locations (loci) with a trait (2). However, 78% of the individuals included in GWAS are of European ancestry (3). The accuracy of genomic findings depends on which ancestries are represented. The under-representation of different ancestries may lead to incomplete or, worse, incorrect findings (4).
For instance, Heidi Rehm, Chief Genomics Officer at the Centre for Genomic Medicine, Massachusetts, studied a blood sample from a foetus that carried a mutation in the PTPN11 gene. Based on genomic studies conducted on people of European ancestry, the mutation had been marked “pathogenic” for Noonan syndrome, a rare disorder affecting the heart and growth. This classification was later proven incorrect, as many people of one ethnic group carried the variant, which turned out to be benign. By the time Rehm discovered this, it was too late: the foetus had been aborted (5).
Trying to use genomic studies done on European ancestry individuals to represent other ancestries is an attempt to measure people of myriad ancestries with the same genomic yardstick.
Such an approach does not account for gene-environment interaction. Genes are not a fixed blueprint for heredity (6). Their expression varies with environmental stimuli shaped by cultural and geographical differences. For example, in individuals with X-linked G6PD deficiency, favism, a haemolytic anaemia, is triggered only by the consumption of fava beans (7). Fava beans are found in Asian, Middle Eastern, South American, and African cuisines.
Differences in genetic background (G x G interactions) and gene-environment (G x E) interactions also influence phenotype; interaction between different genes is known as epistasis. A multi-ethnic study identified variants in 4 genetic loci that interact with physical activity to influence blood-lipid levels (a G x E interaction). 2 of the 4 loci (SNTA1 and CNTNAP2) were identified because people of African and Hispanic ancestry showed a relatively high frequency of the variants. Had the study been performed only in European-ancestry individuals, this conclusion would not have been reached (8).
Applying genomic findings from people of European ancestry to other ancestries may lead to under-assessment of risk. Cystic fibrosis is a Mendelian disease that is widespread in European-ancestry individuals (1 in 2000–3000 births) but rarer in African-American individuals (1 in 17,000 births), in whom it is underdiagnosed. In Europeans, the most common causative allele is ΔF508 (70%), whereas ΔF508 is the causative allele in only 29% of African-Americans (9). Thus, even in identified cases, causes may be misinterpreted.
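A short calculation illustrates how much this difference could matter for a test that looks only for ΔF508. It is a rough sketch resting on two simplifying assumptions that are not in the source: that the percentages above describe the share of disease-causing alleles that are ΔF508, and that a patient’s two disease-causing alleles are independent of each other. The function name is hypothetical.

```python
# Rough illustration only. Assumes (not from the source) that `allele_share`
# is the fraction of disease-causing CFTR alleles that are ΔF508 and that a
# patient's two disease-causing alleles are independent of each other.

def delta_f508_detection(allele_share):
    """Return (P(both alleles are ΔF508), P(at least one allele is ΔF508))."""
    both = allele_share ** 2
    at_least_one = 1 - (1 - allele_share) ** 2
    return both, at_least_one


for label, share in [("European", 0.70), ("African-American", 0.29)]:
    both, at_least_one = delta_f508_detection(share)
    print(f"{label}: both alleles {both:.2f}, at least one allele {at_least_one:.2f}")
```

Under these assumptions, a ΔF508-only test would find both causative alleles in roughly half of European patients but in fewer than one in ten African-American patients, which is consistent with the underdiagnosis described above.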
Genomic findings from European-ancestry individuals cannot be used to assess drug efficacy in other ancestries. An example is oseltamivir, a drug widely stockpiled for a possible avian influenza pandemic. Neuropsychiatric disorders and severe skin reactions were associated with its use in some populations, primarily in Japanese-ancestry individuals. These reactions have been linked to a single nucleotide polymorphism that occurred in 9.29% of the Asian population but in none of the European population (10).
Thus, population representation can correct errors in genomic findings, improve healthcare, and make studies more accurate. Daniel MacArthur of the Massachusetts General Hospital conducted a study of 60,000 people from ethnically diverse groups. Of 53 gene variants previously classified as “pathogenic”, only 9 truly were (11).
Population representation also sheds light on evolutionary history, making findings more meaningful. For instance, it offers insights into how variants that cause genetic conditions were favoured because they confer resistance to tropical diseases. Variants in APOL1 associated with kidney-disease risk are prevalent in African-American populations because they offer resistance to African trypanosomiasis (12).
Population representation also holds potential benefits for public healthcare by improving drug safety and efficacy. G6PD-deficient individuals are at risk of haemolysis from several drugs, including primaquine, a drug used for malaria treatment (13). Studying diverse populations identifies such risks.
Population representation ensures proper analysis of genetic disease risk. Polygenic risk scores, which are computed from a set of genetic variants, predict the risk of complex genetic diseases (14). Including diverse ancestries makes them more accurate for under-represented individuals.
Population representation accounts for population-specific mutations, gene-environment interactions, and individual genetic backgrounds. It can improve precision medicine by tailoring treatment to individual needs, and it will improve public health through accurate diagnoses and better drug efficacy.
Even during the SARS-CoV-2 pandemic, a study conducted by the COVID-19 Host Genetics Initiative comparing 3,199 patients with a control group showed that a gene cluster on chromosome 3 is a risk factor for severe symptoms. It is carried by 16% of European-ancestry individuals, compared with 50% of South Asians (15). Population representation brought this to light.
Research must bring myriad ancestries out of the shadows and into the limelight of genomic studies.
(1) “Genome-Wide Association Studies (GWAS).” Genome.Gov, www.genome.gov/genetics-glossary/Genome-Wide-Association-Studies#:~:text=A%20genome-wide%20association%20study,the%20presence%20of%20a%20disease.
(2) “What are genome wide association studies (GWAS)? | GWAS Catalog.” The European Bioinformatics Institute < EMBL-EBI, www.ebi.ac.uk/training/online/courses/gwas-catalogue-exploring-snp-trait-associations/what-is-gwas-catalog/what-are-genome-wide-association-studies-gwas/
(3) “Lack of diversity in genetic research a problem.” Fred Hutch, 19 June 2019, www.fredhutch.org/en/news/center-news/2019/06/lack-diversity-genetic-research-problem.html#:~:text=According%20to%20well-documented%20research,percent%20of%20the%20global%20population.
(4) Sirugo, Giorgio, et al. “The Missing Diversity in Human Genetic Studies.” CELL, www.cell.com/fulltext/S0092-8674(19)30231-4
(5) Yong, Ed. “Why Human Genetics Research Is Full Of Costly Mistakes- The Atlantic.” The Atlantic, www.theatlantic.com/science/archive/2015/12/why-human-genetics-research-is-full-of-costly-mistakes/420693/
(6) Lobo, Ingrid. “Phenotypic Range of Gene Expression: Environmental Influence | Learn Science at Scitable.” Nature, www.nature.com/scitable/topicpage/phenotypic-range-of-gene-expression-environmental-influence-581/
(7) Sirugo, Giorgio, et al. “The Missing Diversity in Human Genetic Studies.” CELL, www.cell.com/fulltext/S0092-8674(19)30231-4
(8) Sirugo, Giorgio, et al. “The Missing Diversity in Human Genetic Studies.” CELL, www.cell.com/fulltext/S0092-8674(19)30231-4
(9) Sirugo, Giorgio, et al. “The Missing Diversity in Human Genetic Studies.” CELL, www.cell.com/fulltext/S0092-8674(19)30231-4
(10) Li, CY., Yu, Q., Ye, ZQ. et al. A nonsynonymous SNP in human cytosolic sialidase in a small Asian population results in reduced enzyme activity: potential link with severe adverse reactions to oseltamivir. Cell Res 17, 357–362 (2007). https://doi.org/10.1038/cr.2007.27
(11) Yong, Ed. “Why Human Genetics Research Is Full Of Costly Mistakes- The Atlantic.” The Atlantic, www.theatlantic.com/science/archive/2015/12/why-human-genetics-research-is-full-of-costly-mistakes/420693/
(12) “Diversity and inclusion in genomic research: why the uneven progress?” PubMed Central (PMC), www.ncbi.nlm.nih.gov/pmc/articles/PMC5614884/#CR91
(13) “Modelling primaquine-induced haemolysis in G6PD deficiency.” PubMed Central (PMC), www.ncbi.nlm.nih.gov/pmc/articles/PMC5330681/
(14) “Polygenic risk scores.” Genome.Gov, www.genome.gov/Health/Genomics-and-Medicine/Polygenic-risk-scores
(15) Zeberg, H., Pääbo, S. The major genetic risk factor for severe COVID-19 is inherited from Neanderthals. Nature 587, 610–612 (2020). https://doi.org/10.1038/s41586-020-2818-3
Under the out-of-Africa model of human ancestry, most genetic variants are common and shared across populations. However, the founder effect, genetic drift, selective pressures, and responses to different environments have produced genetic diversity across human populations. Rare variants tend to be restricted to closely related populations, with 86% confined to a single continental group (1). As a result, the predominantly European reference genome is missing a substantial amount of DNA: a more diverse reference genome would contain an additional 19–40 megabases not present in the current reference (11). Some variants are common only in certain populations and rare in the dominant European sample, so they can be evaluated effectively only in diverse populations (2,7). A lack of diversity overlooks the changes that occurred in different populations. Consequently, inferences drawn from the current reference genome, such as variant associations with disease, can be misleading or incomplete and may set back genomic findings. Inclusion will improve human health, especially for individuals whose background does not match the reference.
Some variants, and their corresponding effects, could be missed if they are rare or absent in the current reference. Researchers have identified 65 new associations by analyzing 50,000 non-Europeans (9). For instance, the association of PCSK9 mutations with low coronary heart disease risk was identified in individuals of African descent, because these mutations are rare in Europeans (5). The identification of a new variant found almost exclusively in an Indian cohort showed that population-specific variants are important in determining plasma B12 concentrations (10). If these studies, among many others, had been restricted to the biased database, important genomic findings would have been missed.
One way to assess whether a variant causes disease is to examine how rare it is across populations; if a variant is common, it is less likely to be disease-causing. The skewed reference may lead to the mistaken inference that a variant is disease-causing, especially if that variant is rare in the European cohort but common in others (12). For instance, many African Americans were misclassified as carrying variants that cause hypertrophic cardiomyopathy because the classification relied on European-ancestry samples. The variants were later found to be prevalent in African-ancestry individuals, making them unlikely to be disease-causing (7). There is also uncertainty. Because few individuals from minority populations have been sequenced, it is not possible to determine which variants are truly rare in those populations and which are rare only among the individuals sequenced so far. With such small sample sizes, inferences cannot be drawn accurately. Greater inclusion of minority populations is needed to fully understand the effects of different genomes.
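The frequency argument above can be illustrated with a minimal sketch. The allele counts, the group names, and the 1% cutoff are all hypothetical values chosen for illustration, not figures from the cited studies, and real variant classification uses far more evidence than frequency alone. The sketch only shows how the same variant can look rare when one ancestry group is consulted and too common to be disease-causing once a second group is included.

```python
# Hypothetical allele counts for one variant, keyed by ancestry group:
# (alternate-allele count, total alleles sequenced). Illustrative numbers only.
COUNTS = {
    "European": (2, 20000),
    "African": (300, 2000),
}

# Assumed cutoff: a variant more frequent than this in any group is treated
# as unlikely to cause a rare disease.
BENIGN_AF_THRESHOLD = 0.01


def allele_frequency(alt_count, total_alleles):
    """Return the alternate-allele frequency, or None if nothing was sequenced."""
    if total_alleles == 0:
        return None
    return alt_count / total_alleles


def looks_too_common(counts, groups, threshold=BENIGN_AF_THRESHOLD):
    """True if the variant exceeds the threshold in any of the listed groups."""
    freqs = [allele_frequency(*counts[g]) for g in groups if g in counts]
    return any(f is not None and f > threshold for f in freqs)


# With European data only, the variant appears rare and might be flagged.
european_only = looks_too_common(COUNTS, ["European"])
# Adding African-ancestry data reveals that it is common in that group.
all_groups = looks_too_common(COUNTS, ["European", "African"])
print(european_only, all_groups)
```

A group with no sequenced alleles returns no frequency at all, which mirrors the uncertainty described above: absence of data is not evidence that a variant is rare.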
Inclusion will lead to improved human health through progress in medicine. A better understanding of the association of variants with disease would allow more accurate clinical diagnosis. For example, the most prevalent cause of cystic fibrosis (CF) in Europeans is a mutation that is rare in individuals of African descent, in whom a different mutation more often accounts for CF. Because the European mutation occurs at low frequency in patients of African descent, CF is underdiagnosed in this group (12). With the better variant-disease understanding that diversity brings, research into treatments would benefit everyone. Neglect of diversity limits pharmacogenomic research, putting underrepresented patients at higher risk of unanticipated drug responses (5). For instance, population-specific variants could disrupt known drug binding sites (6), and G6PD enzyme deficiency can put patients at risk of hemolysis from drugs used to treat malaria (12). Furthermore, polygenic risk scores (PRS) are obtained by estimating the effects of genetic variants in a discovery sample, then applying those estimates to the genetic profiles of other individuals to predict their risk of disease (12). A PRS can only capture genetic associations observed in the population in which the study was conducted (3). With the biased sample, PRS in non-European patients are often misleading, with an accuracy 4.5 times lower in individuals of African descent (3). A consensus genome created using diverse populations improved gene mapping error rates from 9% to 4% and produced six times fewer mistakes in gene expression measurement than the reference genome, which could open the way to medical advances (8). Because of the lack of diversity, non-European individuals are receiving a lower level of medical care.
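To make the polygenic risk score idea concrete, the sketch below computes a score as a weighted sum: each variant's effect size from a discovery sample multiplied by the number of risk alleles (0, 1, or 2) a person carries. The variant names, effect sizes, and genotypes are hypothetical, and published scores involve many more variants plus quality-control steps that are omitted here.

```python
def polygenic_risk_score(effect_sizes, dosages):
    """Weighted sum of risk-allele dosages.

    effect_sizes: variant ID -> effect size estimated in the discovery sample.
    dosages: variant ID -> number of risk alleles the person carries (0, 1, 2).
    Variants absent from effect_sizes are ignored, because the discovery
    sample provided no estimate for them.
    """
    score = 0.0
    for variant, beta in effect_sizes.items():
        score += beta * dosages.get(variant, 0)
    return score


# Hypothetical effect sizes learned from a single-ancestry discovery sample.
EFFECT_SIZES = {"rsA": 0.5, "rsB": 0.25, "rsC": -0.25}

# A person whose risk variants were all seen in the discovery sample.
person_1 = {"rsA": 2, "rsB": 1, "rsC": 0}
# A person whose only risk variant ("rsD") was never observed in that sample.
person_2 = {"rsD": 2}

score_1 = polygenic_risk_score(EFFECT_SIZES, person_1)
score_2 = polygenic_risk_score(EFFECT_SIZES, person_2)
print(score_1, score_2)
```

The second person receives a score of zero even though they carry two copies of a variant, simply because that variant has no estimated effect. This is one way a score built from a narrow discovery sample can understate or misstate risk in other populations.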
Findings from the current reference should not be assumed to apply to other populations. Although most variants are shared across populations, the regulatory variants that control their expression may differ across ancestry groups. Furthermore, populations differ in linkage disequilibrium, allele frequency, and genetic admixture, all of which would be neglected if findings from European samples were simply carried over. Inclusion of diverse populations is beneficial: it advances medicine by providing the same level of care to non-Europeans and sparks new medical developments through the discovery of variant associations. It will also lead to new genomic findings by limiting inaccurate and misleading inferences.
1. Auton, Adam, and Gonçalo R. Abecasis. “A Global Reference for Human Genetic Variation.” Nature, 30 Sept. 2015, www.nature.com/articles/nature15393. Accessed 22 Feb. 2021.
2. Bentley, Amy R., et al. “Diversity and Inclusion in Genomic Research: Why the Uneven Progress?” PubMed Central, NCBI, 18 July 2017, www.ncbi.nlm.nih.gov/pmc/articles/PMC5614884/. Accessed 22 Feb. 2021.
3. —. “Evaluating the Promise of Inclusion of African Ancestry Populations in Genomics.” NPJ Genomic Medicine, Springer Nature, 25 Feb. 2020, www.nature.com/articles/s41525-019-0111-x. Accessed 22 Feb. 2021.
4. Bethesda. “NIH Curriculum Supplement Series [Internet].” NCBI, 2007, www.ncbi.nlm.nih.gov/books/NBK20363/. Accessed 22 Feb. 2021.
5. Cohen, Jonathan, et al. “Low LDL Cholesterol in Individuals of African Descent Resulting from Frequent Nonsense Mutations in PCSK9.” Nature Genetics, 16 Jan. 2005, www.nature.com/articles/ng1509. Accessed 22 Feb. 2021.
6. Dopazo, Joaquín, et al. “267 Spanish Exomes Reveal Population-Specific Differences in Disease-Related Genetic Variation.” PubMed, National Library of Medicine, May 2016, pubmed.ncbi.nlm.nih.gov/26764160/. Accessed 22 Feb. 2021.
7. Hindorff, Lucia A. “Prioritizing Diversity in Human Genomics Research.” PubMed Central, NCBI, 10 Nov. 2017, www.ncbi.nlm.nih.gov/pmc/articles/PMC6532668/. Accessed 22 Feb. 2021.
8. Knutsen, Ashleen. “A New Human Reference Genome Represents the Most Common Sequences.” TheScientist, 23 Dec. 2020, www.the-scientist.com/news-opinion/a-new-human-reference-genome-represents-the-most-common-sequences-68308. Accessed 22 Feb. 2021.
9. “Lack of Diversity in Genomic Research Hinders Precision Medicine for Nonwhite Americans.” The University of North Carolina at Chapel Hill, 19 June 2019, www.unc.edu/posts/2019/06/19/lack-of-diversity-in-genomic-research-hinders-precision-medicine-for-nonwhite-americans/. Accessed 22 Feb. 2021.
10. Nongmaithem, Suraj S., et al. “GWAS Identifies Population-specific New Regulatory Variants in FUT6 Associated with Plasma B12 Concentrations in Indians.” PubMed, NCBI, 1 July 2017, pubmed.ncbi.nlm.nih.gov/28334792/. Accessed 22 Feb. 2021.
11. Sherman, Rachel M., et al. “Assembly of a Pan-genome from Deep Sequencing of 910 Humans of African Descent.” Nature Genetics, 19 Nov. 2018. Accessed 22 Feb. 2021.
12. Sirugo, Giorgio, et al. “The Missing Diversity in Human Genetic Studies.” Cell, 19 Mar. 2019, www.cell.com/fulltext/S0092-8674(19)30231-4. Accessed 22
Genome testing can be the future of our healthcare, allowing us to identify and diagnose diseases in individuals before symptoms show, find underlying causes for diseases, and create new treatments. Increased genomic data could give doctors broader knowledge of genomic variants, in turn allowing them to give more personalized healthcare to patients (3). However, genome testing requires data on a diverse set of people in order to diagnose them accurately. Currently, genome testing is not conducted on all races equally: of 753 polygenic scoring studies before 2017, 459 were on people of European ancestry and only 15 on people of African ancestry (4). Lack of representation of minority groups in genome studies not only widens the healthcare gap between these groups but also harms the healthcare of everyone, regardless of ancestry. Representing a wider range of ancestries could benefit genetic research and improve human health by allowing more in-depth studies of the underlying risks for diseases found predominantly in minority groups and by giving researchers a larger sample in which to find treatments.
Many researchers are currently studying different forms of genome testing, one of the most notable being genome-wide association studies (GWAS). These studies provide the data for polygenic risk scores (PRS), which are calculated from the presence or absence of risk variants that were located by studying other individuals. However, the lack of representation of people of African ancestry has left them with less accurate polygenic risk scores, because data derived from Africans give more accurate PRS for others of African ancestry (4). If PRS do come into clinical use, the already widening gap in healthcare between races will increase even further (4,6). Since widening that gap is not acceptable, PRS will probably not be used clinically until a more diverse database is developed, and until then people of all races will be unable to take advantage of the opportunities that PRS offer.
This lack of genomic studies has even further implications for our healthcare. One such example involves pharmacogenomics, the identification of genes that influence one’s response to drugs. Since there has been little pharmacogenomic research on minority groups, they have an increased risk of unanticipated drug responses (2). We have already seen the possible positive effects of pharmacogenomic testing, with examples such as Ethiopia’s ban on codeine, which was imposed because many Ethiopians carry gene variants that cause them to suffer respiratory problems or even death from the drug (1). Had pharmacogenomics not been used in this case, many Ethiopians could have died needlessly. If pharmacogenomic testing is not applied to a more diverse range of people, these benefits cannot be realized to their fullest extent.
Greater developments in testing can even improve healthcare and understanding of certain diseases for all people. For example, some rare forms of diseases are found more commonly in certain races, such as triple negative tumors in breast cancer (tumors lacking estrogen receptors, progesterone receptors, and the HER2 receptor), which are found more commonly in Black women. Because Black women are underrepresented in genome testing, scientists are unable to draw conclusions about external reasons for the development of triple negative tumors (5). The lack of research on minority groups therefore inhibits scientists’ ability to diagnose and understand these tumors and how they affect all races, reducing the ability of our healthcare system to provide treatments and lower the number of cases. It also affects the effectiveness of organ donations. For example, the greater prevalence of APOL1 risk variants in people of African ancestry has led to an increased risk of kidney disease. As a result, transplants from kidney donors of African descent have a lower chance of transplant survival, but when the donors of African descent tested negative for the APOL1 risk variants, the survival rate was the same as for other donors (5). Therefore, if genome testing were used for a wider range of people, organ transplants could be safer for all.
Generally, genome testing on a more diverse group of people would give us a greater ability to diagnose and treat people within our healthcare system. The current lack of representation lowers the quality of healthcare for all people, and the sole way we can improve this is by encouraging a more diverse population to engage in genomic research.
(1) Anberbir, Y. (2015, November 21). Ethiopia: Authority Issues Red Alert On Codeine Drug. Allafrica.Com; The Reporter. https://allafrica.com/stories/201511241318.html
(2) Mersha, T. B., & Abebe, T. (2015). Self-reported race/ethnicity in the age of genomic research: its potential impact on understanding health disparities. Human Genomics, 9(1). https://doi.org/10.1186/s40246-014-0023-x
(3) Bilkey, G. A., Burns, B. L., Coles, E. P., Bowman, F. L., Beilby, J. P., Pachter, N. S., Baynam, G., J. S. Dawkins, H., Nowak, K. J., & Weeramanthri, T. S. (2019). Genomic Testing for Human Health and Disease Across the Life Cycle: Applications and Ethical, Legal, and Social Challenges. Frontiers in Public Health, 7. https://doi.org/10.3389/fpubh.2019.00040
(4) Duncan, L., Shen, H., Gelaye, B., Meijsen, J., Ressler, K., Feldman, M., Peterson, R., & Domingue, B. (2019). Analysis of polygenic risk score usage and performance in diverse human populations. Nature Communications, 10(1), 1–9. https://doi.org/10.1038/s41467-019-11112-0
(5) Smith, C. E., Fullerton, S. M., Dookeran, K. A., Hampel, H., Tin, A., Maruthur, N. M., Schisler, J. C., Henderson, J. A., Tucker, K. L., & Ordovás, J. M. (2016). Using Genetic Technologies To Reduce, Rather Than Widen, Health Disparities. Health Affairs, 35(8), 1367–1373. https://doi.org/10.1377/hlthaff.2015.1476
(6) Hostetter, M., & Klein, S. (2018, September 27). In Focus: Reducing Racial Disparities in Health Care by Confronting Racism | Commonwealth Fund. Commonwealthfund.Org; commonwealthfund. https://www.commonwealthfund.org/publications/newsletter-article/2018/sep/focus-reducing-racial-disparities-health-care-confronting
Thirty years after the Human Genome Project began, in an era of CRISPR and of ancestry kits readily available from companies like 23andMe for $99, it is surprising that the DNA of all people worldwide is not well represented in genetic databases. Currently, this lack of diversity, with about 80% of genomes derived from people with European ancestry, biases genomic findings. Consequently, misstatements of disease risk, misdiagnosis of disease, and the design of therapeutics that do not work well in people of non-European, multi-ethnic, or admixed ancestry are all likely to worsen alongside existing health inequalities in these populations, closing off knowledge that could benefit future genetic research and improve health for all.
Genome-wide association studies (GWAS) are a critical method used in genetics research to associate genetic variations with diseases based on genomic data. Scientists use GWAS results to predict disease risk, develop clinical practice guidelines, and design effective treatments. Hence, the absence of sufficiently diverse genomes in GWAS reduces the efficacy of genomic studies, as “Only 2.4% of the individuals included in the GWAS catalog are of African ancestry” (Bentley et al., 2020, para. 3). This imbalance compounds existing health disparities, illustrated by the disproportionate number of deaths among African-American and Latino populations in the U.S. during the COVID-19 pandemic, which arose from both biomedical factors and social determinants of health (Tai et al., 2021). Using these skewed genomic findings to guide traditional and precision medicine can lead to incomplete or inaccurate clinical and public health practices (Sirugo et al., 2019).
Hence, the associations between genetic variants and disease that scientists aim to establish with GWAS may be inaccurate when translated to other populations. Researchers such as Kim et al. (2018) and Duncan et al. (2019) have demonstrated the need for caution in extrapolating GWAS results from one population to predict disease risks in another. Care is needed because disease risk scores derived from European ancestry data have low predictive value for non-Europeans, especially people of African descent, due to African populations displaying greater genetic diversity (Duncan et al., 2019). In contrast, GWAS results from African populations have high predictive value across global populations (Kim et al., 2018). Researchers could miss genetic variants critical for the health of individuals of non-European ancestry if those variants are of low frequency or absent in European samples, such as the variant in the TBC1D4 gene that scientists found to increase the susceptibility of Inuit populations to type 2 diabetes (Manousaki et al., 2016). Similarly, scientists have recently recognized that cystic fibrosis also affects individuals of African ancestry, exposing disparities in diagnosis and treatment (Stewart & Pepper, 2017; McGarry & McColley, 2021).
Including more ancestrally and geographically diverse populations in genetic and genomic studies will lead to the discovery of more genes and disease-variant associations, improving our understanding of the distribution of genetic variation and providing a more complete picture of the genetic and environmental factors that cause disease. The result would be more targeted prevention and treatment strategies, fulfilling the promise of precision medicine and avoiding “facilitating discoveries that will disproportionately benefit well-represented populations” (Bentley et al., 2017). The inclusion of African-Americans in genomic studies would have prevented the false positive diagnoses of hypertrophic cardiomyopathy that African-Americans received (Manrai et al., 2016). One of the most influential contributions to genetic and medical research came from a Black woman, Henrietta Lacks; scientists have used her cells (commonly known as HeLa cells) for cancer research and for mapping genes to chromosomes, work that led to the Human Genome Project (Samuel, 2017). Additionally, including globally representative populations will improve the predictive power of GWAS risk scores (Cavazos & Witte, 2021). Certain benefits are already being realized, such as identifying and characterizing novel loci for complex disease risk and secondary signals from known loci (Wojcik et al., 2019). Currently, genomic initiatives such as the All of Us study in the U.S., the U.K. Biobank, H3Africa, the African Genome Variation Project, and GenomeAsia 100k aim to recruit participants from understudied populations, showing the importance of increasing genetic diversity to improve the quality of healthcare for all.
Although scientific research utilizing information from underrepresented populations, such as HeLa cells, has led to many important discoveries, scientists have historically taken such information without consent. By avoiding such practices, we can build trust and unity between populations while mitigating health disparities by providing adequate disease prevention and treatment to underserved communities.
Bentley, A. R., Callier, S., & Rotimi, C. N. (2017). Diversity and inclusion in genomic research: why the uneven progress? Journal of Community Genetics, 8(4), 255–266. https://doi.org/10.1007/s12687-017-0316-6
Bentley, A.R., Callier, S.L. & Rotimi, C.N. (2020). Evaluating the promise of inclusion of African ancestry populations in genomics. npj Genomic Medicine 5, 5. https://doi.org/10.1038/s41525-019-0111-x
Cavazos, T. B., & Witte, J. S. (2021). Inclusion of variants discovered from diverse populations improves polygenic risk score transferability. HGG advances, 2(1), 100017. https://doi.org/10.1016/j.xhgg.2020.100017
Collins, F. S., Doudna, J. A., Lander, E. S., & Rotimi, C. N. (2021). Human Molecular Genetics and Genomics – Important Advances and Exciting Possibilities. The New England journal of medicine, 384(1), 1–4. https://doi.org/10.1056/NEJMp2030694
Duncan, L., Shen, H., Gelaye, B., Meijsen, J., Ressler, K., Feldman, M., Peterson, R., & Domingue, B. (2019). Analysis of polygenic risk score usage and performance in diverse human populations. Nature communications, 10(1), 3328. https://doi.org/10.1038/s41467-019-11112-0
Kim, M. S., Patel, K. P., Teng, A. K., Berens, A. J., & Lachance, J. (2018). Genetic disease risks can be misestimated across global populations. Genome biology, 19(1), 179. https://doi.org/10.1186/s13059-018-1561-7
Manousaki, D., Kent, J. W., Jr, Haack, K., Zhou, S., Xie, P., Greenwood, C. M., Brassard, P., Newman, D. E., Cole, S., Umans, J. G., Rouleau, G., Comuzzie, A. G., & Richards, J. B. (2016). Toward Precision Medicine: TBC1D4 Disruption Is Common Among the Inuit and Leads to Underdiagnosis of Type 2 Diabetes. Diabetes care, 39(11), 1889–1895. https://doi.org/10.2337/dc16-0769
Manrai, A. K., Funke, B. H., Rehm, H. L., Olesen, M. S., Maron, B. A., Szolovits, P., Margulies, D. M., Loscalzo, J., & Kohane, I. S. (2016). Genetic Misdiagnoses and the Potential for Health Disparities. The New England journal of medicine, 375(7), 655–665. https://doi.org/10.1056/NEJMsa1507092
McGarry, M. E., & McColley, S. A. (2021). Cystic fibrosis patients of minority race and ethnicity less likely eligible for CFTR modulators based on CFTR genotype. Pediatric pulmonology, 10.1002/ppul.25285. Advance online publication. https://doi.org/10.1002/ppul.25285
Morales, J., Welter, D., Bowler, E. H., Cerezo, M., Harris, L. W., McMahon, A. C., Hall, P., Junkins, H. A., Milano, A., Hastings, E., Malangone, C., Buniello, A., Burdett, T., Flicek, P., Parkinson, H., Cunningham, F., Hindorff, L. A., & MacArthur, J. (2018). A standardized framework for representation of ancestry data in genomics studies, with application to the NHGRI-EBI GWAS Catalog. Genome biology, 19(1), 21. https://doi.org/10.1186/s13059-018-1396-2
Prohaska, A., Racimo, F., Schork, A. J., Sikora, M., Stern, A. J., Ilardo, M., Allentoft, M. E., Folkersen, L., Buil, A., Moreno-Mayar, J. V., Korneliussen, T., Geschwind, D., Ingason, A., Werge, T., Nielsen, R., & Willerslev, E. (2019). Human Disease Variation in the Light of Population Genomics. Cell, 177(1), 115–131. https://doi.org/10.1016/j.cell.2019.01.052
Samuel, L. (2017, April 13). 5 important ways Henrietta Lacks changed medical science. STAT. https://www.statnews.com/2017/04/14/henrietta-lacks-hela-cells-science/comment-page-2/.
Sirugo, G., Williams, S. M., & Tishkoff, S. A. (2019). The Missing Diversity in Human Genetic Studies. Cell, 177(1), 26–31. https://doi.org/10.1016/j.cell.2019.02.048
Stewart, C., & Pepper, M. S. (2017). Cystic Fibrosis in the African Diaspora. Annals of the American Thoracic Society, 14(1), 1–7. https://doi.org/10.1513/AnnalsATS.201606-481FR
Tai, D., Shah, A., Doubeni, C. A., Sia, I. G., & Wieland, M. L. (2021). The Disproportionate Impact of COVID-19 on Racial and Ethnic Minorities in the United States. Clinical infectious diseases : an official publication of the Infectious Diseases Society of America, 72(4), 703–706. https://doi.org/10.1093/cid/ciaa815
Wojcik, G. L., Graff, M., Nishimura, K. K., Tao, R., Haessler, J., Gignoux, C. R., Highland, H. M., Patel, Y. M., Sorokin, E. P., Avery, C. L., Belbin, G. M., Bien, S. A., Cheng, I., Cullina, S., Hodonsky, C. J., Hu, Y., Huckins, L. M., Jeff, J., Justice, A. E., Kocarnik, J. M., … Carlson, C. S. (2019). Genetic analyses of diverse populations improves discovery for complex traits. Nature, 570(7762), 514–518. https://doi.org/10.1038/s41586-019-1310-4
Genomic research is revolutionizing medicine, but its potential to change lives is significantly hindered by the homogeneity of current data. The sequencing of more than three billion DNA base pairs during the Human Genome Project ushered in a radical new era of science—one that relies on identifying patterns across human genomes to preserve and predict population health. Specifically, genome-wide association studies (GWAS)—which search for genetic markers predictive of disease—have become a valuable tool for understanding the genetic underpinnings of many illnesses. The applicability of these studies, however, is hampered by their reliance on data that does not reflect the vast diversity of human populations (13). The lack of representation in genomic databases presents an imminent concern for groups of non-European ancestry; therefore, promoting greater inclusivity in research will allow genomic medicine to become a safer, more dependable reality for all.
As it stands, genomic research is heavily skewed towards those with European ancestry, because GWAS are typically calibrated using data from these individuals. A GWAS searches for genetic markers that consistently appear in people with a disease; because DNA bases that are close to one another are more likely to be inherited together, researchers can infer the location of disease-causing gene(s) based on their proximity to these markers. However, this approach can be problematic when searching for markers across ethnic groups, because the genomes of each population have changed due to migration, mutations, and evolution (6).
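As a rough illustration of how a GWAS flags a marker, the sketch below compares allele counts between cases and controls at a single marker using a Pearson chi-square statistic. The counts are hypothetical, and a real study tests a very large number of markers with much stricter significance thresholds and corrections for ancestry; this is only the core comparison at its simplest.

```python
def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square statistic (1 degree of freedom) for a 2x2 table of
    allele counts: rows are cases and controls, columns are the two alleles."""
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    row_totals = [sum(row) for row in table]
    col_totals = [case_alt + control_alt, case_ref + control_ref]
    total = sum(row_totals)
    # If any margin is empty, the marker carries no information in this sample.
    if 0 in row_totals or 0 in col_totals:
        return 0.0
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2


# Hypothetical marker that is more frequent in cases than in controls.
associated = allelic_chi_square(60, 40, 40, 60)
# Hypothetical marker with identical allele frequencies in both groups.
unassociated = allelic_chi_square(50, 50, 50, 50)
# Hypothetical marker absent from the sampled population altogether.
unobserved = allelic_chi_square(0, 100, 0, 100)
print(associated, unassociated, unobserved)
```

The third call shows why the composition of the sample matters: a marker that never appears among the people studied yields no signal, whatever its effect might be in a population where it is common.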
While genetic variation between humans is only a fraction of one percent, failing to use diverse datasets has far-reaching implications for people whose ethnicities are not represented equally. The absence of “diversity in human genomic studies means that our ability to translate genetic research into clinical practice… may be dangerously incomplete, or worse, mistaken” (12). Drawing conclusions based on uniform data can compromise the health of minority populations whose genetic profiles might seem abnormal, when really they are just underrepresented. A 2016 study highlighted this risk when it found that several people of African American ancestry were misdiagnosed with cardiomyopathy after participating in targeted genetic testing. Though the individuals who received false positives had only benign variants in their genome, the occurrence of certain mutations was still much greater in Black Americans relative to their White counterparts, leading the test, which was fed homogeneous data, to an erroneous diagnosis. Importantly, the study concluded that these misclassifications could have been averted by including Black Americans in the original cohorts used to design the test. The faulty results arising from genetic studies can cause irreparable harm, stress, and economic burdens for families, while also “engendering confusion and distrust [in the scientific community]” (9); therefore, diversifying genomic data is crucial for advancing health in all communities.
Additionally, broadening ethnic representation will “enhance the quality of gene-disease association research for everyone” (8), because diverse analyses lead researchers to novel patterns that would otherwise be indiscernible. For instance, one study in West African populations revealed that while certain APOL1 alleles increase the risk of renal disease, they likely also serve as a defense against sleeping sickness (1), explaining their increased prevalence in certain regions and ethnic groups. Studies like this demonstrate the importance of studying diverse populations to understand gene interactions in broader contexts.
Diversification in research has also driven life-saving innovations in medicine. After analyzing nonsense mutations in the genomes of African populations, for example, scientists observed that non-functional variants of the PCSK9 gene correspond to lower levels of LDL cholesterol; this, in turn, significantly reduces the risk of adverse cardiovascular events. This insight enabled the development of PCSK9 inhibitors, drugs that are now commonly prescribed to combat high cholesterol and heart disease. By including in this analysis communities that previous studies had neglected, researchers were able to grasp the “underlying biology” of a gene and translate that knowledge into a “drug with global utility” (6, 12).
As genome-wide studies begin to be implemented in clinical settings, equal representation is necessary to prevent healthcare disparities from widening, especially since “underrepresentation in genomic databases is paralleled by underuse of genetic services” (7). Moreover, it is a way for the scientific community to acknowledge and rectify the injustices that minority communities have endured for years, while building back the trust that allows science to transform lives for the better. The potential of genetic research is endless, and all individuals deserve to benefit from that potential equally. By making a more deliberate effort to increase diversity, we will truly reimagine medicine through genomics in an inclusive and equitable way.
Bentley, Amy R, et al. “Diversity and Inclusion in Genomic Research: Why the Uneven Progress?” Journal of Community Genetics, Springer Berlin Heidelberg, Oct. 2017, www.ncbi.nlm.nih.gov/pmc/articles/PMC5614884/.
“Bringing Diversity to Genomic Data.” AACC, www.aacc.org/cln/articles/2017/june/bringing-diversity-to-genomic-data-under-represented-ethnic-minorities.
Ganguly, Prabarna. “Putting Diversity Front and Center.” Genome.gov, 19 June 2019, www.genome.gov/news/news-release/Putting-diversity-front-and-center.
Korlach, Jonas. “We Need More Diversity in Genomic Databases.” Scientific American, Scientific American, 1 Mar. 2019, www.scientificamerican.com/article/we-need-more-diversity-in-genomic-databases/.
“Lack of Diversity in Genetic Research a Problem.” Fred Hutch, 19 June 2019, www.fredhutch.org/en/news/center-news/2019/06/lack-diversity-genetic-research-problem.html.
Lambert, Jonathan. “Human Genomics Research Has A Diversity Problem.” NPR, NPR, 21 Mar. 2019, www.npr.org/sections/health-shots/2019/03/21/705460986/human-genomics-research-has-a-diversity-problem.
Landry, Latrice G., et al. “Lack Of Diversity In Genomic Databases Is A Barrier To Translating Precision Medicine Research Into Practice.” Health Affairs, vol. 37, no. 5, 2018, pp. 780–785., doi:10.1377/hlthaff.2017.1595.
Manolio, Teri A. “Using the Data We Have: Improving Diversity in Genomic Research.” The American Journal of Human Genetics, vol. 105, no. 2, 2019, pp. 233–236., doi:10.1016/j.ajhg.2019.07.008.
Manrai, Arjun K, et al. “Genetic Misdiagnoses and the Potential for Health Disparities.” The New England Journal of Medicine, U.S. National Library of Medicine, 18 Aug. 2016, www.ncbi.nlm.nih.gov/pmc/articles/PMC5292722/.
News Center. “5 Questions: Genevieve Wojcik on the Need for Diversity in Genome-Based Studies.” News Center, www.med.stanford.edu/news/all-news/2019/06/genevieve-wojcik-on-the-need-for-diversity-in-genomic-studies.html.
“Opinion: Greater Diversity Is Needed in Human Genomic Data.” The Scientist Magazine®, www.the-scientist.com/critic-at-large/diversify-our-human-genomic-data-66308.
Sirugo, Giorgio, et al. “The Missing Diversity in Human Genetic Studies.” Cell, vol. 177, no. 1, 2019, pp. 26–31., doi:10.1016/j.cell.2019.02.048.
Wu, Katherine J. “Lack of Diversity in Genetic Research Could Be Costing Us Our Health.” PBS, Public Broadcasting Service, 21 Mar. 2019, www.pbs.org/wgbh/nova/article/lack-diversity-genetic-research-could-be-costing-us-our-health/.
Genome sequencing and genetic testing are becoming increasingly important for individuals looking to discover and overcome health issues. While practices like personalized medicine, which rely on genetic testing, are beginning to thrive, not all groups are benefiting equally. Lack of diverse population representation in genomic research limits findings, fails to represent all individuals, and restricts potential benefits of a comprehensive understanding of genetic variations leading to diseases.
Pharmacogenomics is a branch of personalized medicine that studies how genetic factors shape individual responses to drugs. Our DNA codes for proteins, including the enzymes that are integral to metabolizing drugs. Genetic differences may cause individuals to produce enzymes that metabolize drugs at varying rates. When a drug therapy proves ineffective, the genetic basis for that ineffectiveness is often unknown in minority populations. The antiplatelet drug clopidogrel, commonly used to help prevent heart attacks, strokes or severe chest pain, is not metabolized by, and is considered ineffective in, 75% of individuals of Pacific Island ancestry (5). In the Clopidogrel Versus Aspirin in Patients at Risk of Ischaemic Events (CAPRIE) study, 95% of participants were Caucasian. Within that group, the frequency of the CYP2C19*2 allele was only 10-20%, whereas in Pacific Islanders it is 40-77% (5). This variant is associated with slow or absent metabolism of clopidogrel. Because individuals carrying this allele were so poorly represented, the drug carried no warning about its ineffectiveness, leaving Pacific Islanders to suffer preventable outcomes including acute myocardial infarction (5).
Similarly, minority populations are not receiving proper dosages because of incomplete testing. Scientists develop genetic tests by first using biological data to identify genetic markers known as single nucleotide polymorphisms (SNPs) that contribute to certain diseases, and deeming them clinically significant. They then develop test panels that screen patients for those SNPs, which finally allows them to identify the individuals who carry them. Ancestry-associated gene variants cause individuals of African and European ancestry to metabolize warfarin, an anticoagulant, at different rates. Different variants of the CYP2C9 gene are highly prevalent in groups of different ancestries, and they are important in determining whether a patient responds effectively to warfarin. While the gene variants CYP2C9*2 and CYP2C9*3 are most common in Europeans, gene variants CYP2C9*5, *6, *8, and *11 are also significant, and more prevalent, in individuals of African ancestry. During the Clarification of Optimal Anticoagulation Through Genetics (COAG) trial, a genotyping panel was used in the algorithm for warfarin dosing (1,2). Clinicians genotyped subjects only for the known CYP2C9*2 and CYP2C9*3 variants, so the SNPs that determine metabolism rates in individuals of African ancestry were not considered when doses were determined (1,2). Although this trial was diverse, with 27% of participants being of African ancestry, the initial lack of information about the gene variants specifically significant to metabolism rates in this group left the genotyping panels effectively useless for them. African populations are known to have greater genetic variation than European populations, but this was not taken into account during the trial. The result was improper, potentially dangerous, dosing for individuals of African descent. The failure to account for the differences in genetic variants between ethnic populations meant that participants of African ancestry gained almost none of the benefits of this research. Lack of information about genetics in minority populations, again, proved to be harmful.
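The panel limitation described above can be illustrated with a minimal Python sketch. Only the star-allele names come from the text; the patients, their genotypes, and the function name are hypothetical and are not COAG trial data. The sketch assumes, for illustration, that each patient's true set of CYP2C9 variants is known, which a real panel could not know.

```python
# Star alleles are taken from the text; patient data and all names are hypothetical.
EUROPEAN_PANEL = {"*2", "*3"}                                 # variants genotyped, per the text
EXPANDED_PANEL = EUROPEAN_PANEL | {"*5", "*6", "*8", "*11"}   # adds variants named for African ancestry

# Invented patients: each maps to the set of CYP2C9 star alleles they carry.
patients = {
    "patient_a": {"*2"},
    "patient_b": {"*8"},
    "patient_c": {"*3", "*11"},
    "patient_d": set(),  # carries none of the listed variants
}


def missed_variants(carried, panel):
    """Return the carried CYP2C9 variants that the panel does not test for."""
    return carried - panel


for name, carried in patients.items():
    narrow = sorted(missed_variants(carried, EUROPEAN_PANEL))
    broad = sorted(missed_variants(carried, EXPANDED_PANEL))
    print(f"{name}: missed by *2/*3 panel: {narrow}; missed by expanded panel: {broad}")
```

In this toy data, a patient carrying only *8 appears to the narrower panel to have no relevant variant at all, which is the kind of gap the paragraph describes.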
In reality, America is becoming increasingly diverse; the “melting pot” metaphor is more representative now than ever. Genetic research limited to individuals of European ancestry will not allow for serious advancements in human health, drug therapies, or disease pathology, and will only continue to harm minority populations. Genetic databases and test panels need to be representative of the entire human population.
These disparities stem not from race itself but from a lack of awareness. Bias in medical textbooks, a lack of ancestry-aware thinking and diagnosis in medicine, and biased population representation in genetic studies are all holding us back from advancing medical treatments and from treating patients effectively regardless of their ancestry. Members of minority populations fear experiencing the same exploitation their communities have faced in the past; they may lack the financial means, time or information to get involved; and they do not always feel welcome in the world of medical research. A change is needed so that these individuals can trust their care and see results from it. Everyone needs to actively listen and recognize the disadvantages that people of color experience in order to bridge the racial divide in the field of medicine.
1. Bumpus, N. N. (2021). For better drugs, diversify clinical trials. Science, 371(6529), 570–571. https://doi.org/10.1126/science.abe2565
2. Hernandez, W., Gamazon, E., Aquino-Michaels, K., et al. (2014). Ethnicity-specific pharmacogenetics: the case of warfarin in African Americans. The Pharmacogenomics Journal, 14, 223–228. https://doi.org/10.1038/tpj.2013.34
3. Oh, S. S., Galanter, J., Thakur, N., Pino-Yanes, M., Barcelo, N. E., White, M. J., de Bruin, D. M., Greenblatt, R. M., Bibbins-Domingo, K., Wu, A. H., Borrell, L. N., Gunter, C., Powe, N. R., & Burchard, E. G. (2015). Diversity in Clinical and Biomedical Research: A Promise Yet to Be Fulfilled. PLoS medicine, 12(12), e1001918. https://doi.org/10.1371/journal.pmed.1001918
4. Ortega, V. E., & Meyers, D. A. (2014). Pharmacogenetics: implications of race and ethnicity on defining genetic profiles for personalized medicine. The Journal of allergy and clinical immunology, 133(1), 16–26. https://doi.org/10.1016/j.jaci.2013.10.040
5. Wu, A. H., White, M. J., Oh, S., & Burchard, E. (2015). The Hawaii clopidogrel lawsuit: the possible effect on clinical laboratory testing. Personalized medicine, 12(3), 179–181. https://doi.org/10.2217/pme.15.4
The completion of the Human Genome Project in 2003 set off a race to interpret genetic sequence variations and their role in diseases (10). This research promises a future of personalized responses to ailments, greater specificity in physicians’ prognoses, and timely interventions. However, the field is facing a major roadblock. Representing only 16% of the global population, individuals of European descent have accounted for 80% of all genome-wide association study (GWAS) participants in recent years, and limited action has been taken to remedy this trend (8). Not only has this left researchers lacking valuable insight into rare, population-specific sequence variants, but it has also prevented people of non-European descent from reaping the benefits of rapidly developing genomics technologies. Diverse samples in genomic research are essential to maximize our knowledge of genetics, ensure effective drug prescriptions, and combat existing disparities.
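The scale of this imbalance can be made concrete with a short calculation. The sketch below uses only the two figures quoted above (16% and 80%). It assumes, for illustration, that everyone not of European descent is treated as a single group holding the remaining shares; the function name is hypothetical.

```python
def representation_ratio(pct_of_participants, pct_of_population):
    """Ratio above 1 means over-represented; below 1 means under-represented."""
    return pct_of_participants / pct_of_population


# Figures from the text: 16% of the global population, 80% of GWAS participants.
european = representation_ratio(80, 16)

# Assumption: all other ancestries are grouped together with the remaining shares.
non_european = representation_ratio(100 - 80, 100 - 16)

print(f"European descent: {european:.2f}x their population share")
print(f"Non-European descent: {non_european:.2f}x their population share")
print(f"Relative gap: {european / non_european:.0f}-fold")
```

Under these assumptions, people of European descent appear in GWAS at five times their share of the world's population, while everyone else appears at roughly a quarter of theirs, a gap of about 21-fold.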
Eurocentric GWAS present a detrimental case of medical myopia. By producing false positives and missing critical variants, they hinder our understanding of genomics and disease (12). A series of GWAS identified strong associations between certain sequence variants and body mass index, type II diabetes, and lipid levels in European Americans (5). A later study found that 25% of the variants showed different strengths of association in a population of non-European ancestry (5). Researchers depend on these associations as stepping stones for causal investigations. They look to them to answer why a disease is occurring, what factors induce its onset, and how it can be stopped. The danger arises when variants that apply only to certain population groups are investigated as though they applied to everyone. The sequences initially believed to be strongly linked to body mass index and type II diabetes did not hold up universally, and it is possible that our grasp of other diseases has been limited by the same sample homogeneity.
Variants occurring at a low frequency in European populations are also often overlooked. When a genome-wide scan of a Greenlandic Inuit population took place in 2015, a single-nucleotide polymorphism with a frequency of 0.98 was identified as a strong height-determining factor (7). Its frequency of only 0.017 in Europeans is likely why GWAS previously disregarded its impact on height (7). These instances expose the shortcomings of Eurocentric studies: they place a cap on our understanding of genetic implications in health and disease. On the other hand, studies performed with diverse samples enable comparison between populations, as well as highly personalized insights and care.
Though humans share 99.9% of our DNA, that minuscule 0.1% of heterogeneity opens the door to drastically different responses to treatments (11). Certain genes have been identified as influential in the drug-metabolism process, and the frequencies of their alleles vary by population (6). An example of this is CYP2D6, which codes for an enzyme that metabolizes drugs ranging from antidepressants and antiarrhythmics to opioids (4). Over 100 variants of this highly polymorphic gene have been observed in different populations, each one uniquely impacting an individual’s ability to metabolize drugs (4). Without diversity in pharmacogenetic research, there is added risk every time prescriptions are filled for people of colour. If the impact of a drug is verified only on a sample of largely European ancestry, its effectiveness for other populations remains uncertain. This directly hinders the ability of the healthcare system to provide a high quality of care to everyone, regardless of race and ethnicity.
These shortcomings in GWAS exacerbate existing inequities in the medical world. Disparities in education and socioeconomic status overlap with disparities in genomic research and together worsen the quality of healthcare provided to non-European populations. In the United States, people of colour are less likely to have health insurance and have higher mortality rates for conditions like diabetes and heart disease (1,9). Furthermore, Hispanic, Black, and Native American nonelderly adults go without necessary care more frequently than White populations (3). The lack of diversity in genomic research widens this gap. Cultural groups that already face more obstacles when it comes to receiving high quality care are continuing to be left behind by studies that overlook the variants prevalent in their genes. Putting an end to these disparities in healthcare will take targeted genomic studies that investigate diseases, factors related to wellbeing, and drug metabolism in underrepresented populations.
Inclusion and representation in genome-wide association studies are essential to support sustainable health on a global scale. Until genomic research is conducted with diverse samples, the race to interpret genetic sequence variations and their role in disease will leave no victors.
1. Agency for Healthcare Research and Quality. (2019). 2018 National Healthcare Quality and Disparities Report. https://www.ahrq.gov/research/findings/nhqrdr/nhqdr18/index.html
2. Bentley, A. R., Callier, S., & Rotimi, C. N. (2017). Diversity and inclusion in genomic research: why the uneven progress? Journal of Community Genetics, 8(4), 255–266. https://doi.org/10.1007/s12687-017-0316-6
3. Artiga, S., & Orgera, K. (2019). Key Facts on Health and Health Care by Race and Ethnicity. Kaiser Family Foundation. https://www.kff.org/report-section/key-facts-on-health-and-health-care-by-race-and-ethnicity-introduction/
4. Bertilsson, L., Dahl, M.-L., Dalén, P., & Al-Shurbaji, A. (2002). Molecular genetics of CYP2D6: Clinical relevance with focus on psychotropic drugs. British Journal of Clinical Pharmacology, 53(2), 111–122. https://doi.org/10.1046/j.0306-5251.2001.01548.x
5. Carlson, C. S., Matise, T. C., North, K. E., Haiman, C. A., Fesinmeyer, M. D., Buyske, S., Schumacher, F. R., Peters, U., Franceschini, N., Ritchie, M. D., Duggan, D. J., Spencer, K. L., Dumitrescu, L., Eaton, C. B., Thomas, F., Young, A., Carty, C., Heiss, G., … Le Marchand, L. (2013). Generalization and Dilution of Association Results from European GWAS in Populations of Non-European Ancestry: The PAGE Study. PLoS Biology, 11(9), e1001661. https://doi.org/10.1371/journal.pbio.1001661
6. Desta, Z., Ward, B. A., Soukhova, N. V., & Flockhart, D. A. (2004). Comprehensive Evaluation of Tamoxifen Sequential Biotransformation by the Human Cytochrome P450 System in Vitro: Prominent Roles for CYP3A and CYP2D6. Journal of Pharmacology and Experimental Therapeutics, 310(3), 1062–1075. https://doi.org/10.1124/jpet.104.065607
7. Fumagalli, M., Moltke, I., Grarup, N., Racimo, F., Bjerregaard, P., Jorgensen, M. E., Korneliussen, T. S., Gerbault, P., Skotte, L., Linneberg, A., Christensen, C., Brandslund, I., Jorgensen, T., Huerta-Sanchez, E., Schmidt, E. B., Pedersen, O., Hansen, T., Albrechtsen, A., & Nielsen, R. (2015). Greenlandic Inuit show genetic signatures of diet and climate adaptation. Science, 349(6254), 1343–1347. https://doi.org/10.1126/science.aab2319
8. Genetics for all. (2019). Nature Genetics, 51(4), 579–579. https://doi.org/10.1038/s41588-019-0394-y
9. Institute of Medicine. (2012). How far have we come in reducing health disparities?: Progress since 2000: Workshop summary. National Academies Press. https://doi.org/10.17226/13383
10. National Human Genome Research Institute. (n.d.) Human Genome Project Results. https://www.genome.gov/human-genome-project/results
11. National Human Genome Research Institute. (n.d.) Genetics vs. Genomics Fact Sheet. https://www.genome.gov/about-genomics/fact-sheets/Genetics-vs-Genomics
12. Popejoy, A. B., & Fullerton, S. M. (2016). Genomics is failing on diversity. Nature, 538(7624), 161–164. https://doi.org/10.1038/538161a