
Unmasking The Human Genome

Authored by: Lauren Wilkes

Art by: Eileen Cho


As AI becomes further integrated into our society and, more specifically, into the healthcare industry, the question of how favorably AI will ultimately impact the medical field is becoming pressing. When it comes to genome sequencing in genetics and biomedical research, the latest AI software, AlphaGenome, shows considerable promise. AlphaGenome is the newest DNA sequence AI model, and it provides advanced functional predictions for genetic variants, including variants of uncertain significance (VUS) [1]. The human genome is vast, and while the biomedical field has progressed remarkably in its knowledge of the scope and contents of the genome, what function rare genetic variants actually serve, and what that means for health and disease, remains a major unknown that AlphaGenome can play a significant role in addressing. AlphaGenome takes a DNA sequence as input and predicts its molecular properties. It can also compare the predictions for a mutated sequence against those for the unmutated sequence to estimate the effect of a variant [1]. Together, these capabilities provide comprehensive predictions for poorly understood variants and offer promising assistance to the field of disease biology.
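To make the idea of comparing a mutated sequence to an unmutated one concrete, here is a minimal Python sketch. It is only an illustration and does not use AlphaGenome or its API: `apply_snv`, `toy_predict`, and `variant_effect` are hypothetical names, and `toy_predict` simply returns the fraction of G and C bases as a placeholder for whatever a trained model would predict.

```python
# Toy illustration only: this is NOT AlphaGenome's model or API.
# All function names here are hypothetical and exist just for this sketch.


def apply_snv(reference: str, position: int, alt_base: str) -> str:
    """Return a copy of `reference` with the base at 0-based `position` replaced."""
    if len(alt_base) != 1 or alt_base not in "ACGT":
        raise ValueError("alt_base must be one of A, C, G, T")
    if not 0 <= position < len(reference):
        raise IndexError("position is outside the sequence")
    return reference[:position] + alt_base + reference[position + 1:]


def toy_predict(sequence: str) -> float:
    """Placeholder for a trained model: returns the GC fraction of the sequence."""
    return sum(base in "GC" for base in sequence) / len(sequence)


def variant_effect(reference: str, position: int, alt_base: str) -> float:
    """Score a variant as prediction(mutated) minus prediction(reference)."""
    mutated = apply_snv(reference, position, alt_base)
    return toy_predict(mutated) - toy_predict(reference)


reference = "ATGCATGCAT"
effect = variant_effect(reference, position=0, alt_base="G")
print(f"{effect:+.2f}")  # prints +0.10
```

The design point is the subtraction: the same predictor is run on both versions of the sequence, so whatever the variant changes shows up as a difference in the output. A real model would replace the placeholder score with predictions of many molecular properties at once.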


Understanding the potential of AlphaGenome requires a general understanding of the function and role of the human genome. The human genome is the complete set of DNA in the human body, condensed and stored in the nucleus of our cells [2]. What is less well known is that the human genome does not consist solely of the “functional” or “coding” regions of our DNA. It comprises all genetic elements, including those that are “non-coding,” meaning that they do not code for a specific functional protein. Approximately 98% of the human genome is non-protein coding. This includes regulatory elements such as enhancers and promoters, structural elements such as telomeres, transposable elements, and non-coding RNAs, all of which contribute in various ways to regulating gene expression [3]. Consequently, understanding and predicting the function of every unique variant can be difficult, particularly when a VUS occurs in a non-coding region.


There have, however, been tremendous developments in researchers’ knowledge of the human genome thanks to work such as the NIH-led Human Genome Project (HGP), which produced the first sequence of the human genome [4]. From this strong launchpad, a multitude of avenues for analyzing variant functionality have grown, including high-throughput sequencing [5], single nucleotide polymorphism assays for identifying loci, and computational models for determining function [6]. These approaches are also made possible by reference genomes (such as those from the HGP), which are readily available for comparative analysis through resources such as the UCSC Genome Browser.


The main problem arises when a VUS is encountered. This happens when a variant has been identified in a sequence, yet there is insufficient evidence to determine its function or its effect on health. For patients, this means substantial uncertainty about whether a specific VUS is benign or pathogenic [7]. This is where AlphaGenome enters the picture. AlphaGenome takes a DNA sequence as input and predicts an extensive range of molecular and genetic properties. These predictions include where genes start and end in different cell types and tissues, the amount of RNA produced, and which DNA bases are accessible, close to one another, or bound by certain proteins [1]. Much of AlphaGenome’s capability rests on training data sourced from large public consortia such as ENCODE and FANTOM5, which have mapped many of the functional elements of the genome across human and model organism tissues and cells. Having been trained on these genomic datasets, the model first recognizes short patterns in the DNA sequence. It then shares information about those patterns across positions in the sequence to build comprehensive predictions of the functions encoded there.
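As a rough intuition for what recognizing short patterns in a DNA sequence means, the Python sketch below slides a window along a sequence and reports where a short motif appears. This is a deliberate simplification and not how AlphaGenome is implemented: a trained model learns its patterns from data instead of matching exact strings, and the function name `find_motif`, the example sequence, and the TATA-like motif are all chosen purely for illustration.

```python
# Toy illustration only: exact string matching stands in for the learned
# pattern detectors of a real model. `find_motif` is a hypothetical helper.


def find_motif(sequence: str, motif: str) -> list[int]:
    """Return the 0-based start positions where `motif` occurs (overlaps allowed)."""
    if not motif:
        raise ValueError("motif must not be empty")
    width = len(motif)
    return [
        start
        for start in range(len(sequence) - width + 1)
        if sequence[start:start + width] == motif
    ]


sequence = "GGCTATAAAGGCTATAAT"
hits = find_motif(sequence, "TATAA")
print(hits)  # prints [3, 12]
```

Each hit is a position where the local sequence matches the pattern. Combining many such local signals across a long stretch of DNA is, loosely speaking, the second step the paragraph above describes.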


One of the most promising facets of AlphaGenome lies in how it differs from an earlier DeepMind model, AlphaMissense. That model also predicts the effects of genetic variants, but solely for variants in the protein-coding regions of the genome [1]. While this was a meaningful advance, the vast majority of the genome consists of non-coding regions. The ability to identify harmful regulatory variants that are unknown or understudied in those non-coding regions is what makes AlphaGenome so promising for the field of medical genetics.


Patients deserve fewer unknowns when it comes to the pathogenicity of variants responsible for rare genetic diseases. Part of how medical professionals can achieve more transparency for patients is through a greater understanding of these VUS in the research domain. AlphaGenome can substantially aid researchers in their efforts to identify the effects and functional implications of rare VUS and, in turn, translate more information on pathogenicity into medical interactions with patients. The field of genetics research has advanced tremendously, and it is precisely this progress that makes AlphaGenome’s potential to improve predictions of both harmful and benign genetic variants especially impressive.



Works Cited

  1. AlphaGenome: AI for better understanding the genome. (2025, June 25). Google DeepMind. https://deepmind.google/discover/blog/alphagenome-ai-for-better-understanding-the-genome/

  2. Brown, T. A. (2002). The human genome. Genomes - NCBI Bookshelf. https://www.ncbi.nlm.nih.gov/books/NBK21134/

  3. What is noncoding DNA?: MedlinePlus Genetics. (n.d.). https://medlineplus.gov/genetics/understanding/basics/noncodingdna/

  4. The Human Genome Project. (n.d.). Genome.gov. https://www.genome.gov/human-genome-project

  5. Reuter, J. A., Spacek, D. V., & Snyder, M. P. (2015). High-throughput sequencing technologies. Molecular Cell, 58(4), 586–597. https://doi.org/10.1016/j.molcel.2015.05.004

  6. Satam, H., Joshi, K., Mangrolia, U., Waghoo, S., Zaidi, G., Rawool, S., Thakare, R. P., Banday, S., Mishra, A. K., Das, G., & Malonia, S. K. (2023). Next-Generation Sequencing Technology: Current Trends and Advancements. Biology, 12(7), 997. https://doi.org/10.3390/biology12070997

  7. Variant of Uncertain Significance (VUS). (n.d.). Genome.gov. https://www.genome.gov/genetics-glossary/Variant-of-Uncertain-Significance-VUS




©2023 by The Healthcare Review at Cornell University

This organization is a registered student organization of Cornell University.

Equal Education and Employment
