The third phase of the ENCODE project offers new insights into the organization and regulation of our genes and genomes.
The Encyclopedia of DNA Elements (ENCODE) project is a global effort to understand how the human genome works. With the completion of its latest phase, the ENCODE project has added millions of candidate DNA “switches” from the human and mouse genomes that appear to regulate when and where genes are activated, along with a new registry that assigns a portion of these DNA switches to useful biological categories. The project also offers new visualization tools to help researchers work with the large ENCODE datasets.
The latest results of the project were published in Nature, accompanied by 13 additional in-depth studies published in other major journals. ENCODE is funded by the National Human Genome Research Institute (NHGRI), which is part of the National Institutes of Health.
“A key priority of ENCODE 3 was to develop means to share data from thousands of ENCODE experiments with the wider research community to help broaden our understanding of genome function,” said NHGRI Director Eric Green, MD, PhD. “ENCODE 3 search and visualization tools make this data accessible, and this advances open science efforts.”
To evaluate the potential functions of different regions of DNA, ENCODE researchers studied biochemical processes that are typically associated with switches that regulate genes. This biochemical approach is an efficient way to quickly and comprehensively explore the entire genome. This method helps to locate regions in DNA that are “candidate functional elements”: regions of DNA that are predicted to be functional elements based on these biochemical properties. Candidates can be tested in other experiments to identify and characterize their functional roles in gene regulation.
“A key challenge for ENCODE is that different genes and functional regions are active in different cell types,” said Elise Feingold, PhD, scientific advisor for strategic implementation in the Division of Genome Sciences at NHGRI and a lead on ENCODE for the institute. “This means we have to test a large number and diversity of biological samples to work toward a catalog of candidate functional elements in the genome.”
Significant advances have been made in characterizing protein-coding genes, which comprise less than 2% of the human genome. Researchers know much less about the remaining 98% of the genome, including how much of it, and which parts, perform other functions. ENCODE helps to fill this important knowledge gap.
The human body is made up of trillions of cells, with thousands of cell types. Although all of these cells share a common set of DNA instructions, different cell types (e.g., heart, lung, and brain) perform different functions by using the information encoded in DNA differently. The regions of DNA that act as switches to activate or deactivate genes, or to adjust the exact levels of gene activity, help drive the formation of different cell types in the body and govern how they function in health and disease.
During the recently completed ENCODE phase, researchers performed nearly 6,000 experiments – 4,834 in humans and 1,158 in mice – to shed light on details of genes and their potential regulators in their respective genomes.
ENCODE 3 researchers studied the development of mouse embryonic tissues to understand the timeline of various genomic and biochemical changes that occur during mouse development. Mice, because of their genomic and biological similarity to humans, can help inform our understanding of human biology and disease.
These experiments in humans and mice were carried out in various biological contexts. The researchers looked at how chemical modifications of DNA, proteins that bind to DNA, and RNA (a sister molecule of DNA) interact to regulate genes. The results of ENCODE 3 also help to explain how variations in DNA sequences outside the protein-coding regions can influence gene expression, even for genes located very far from the variant itself.
“The data generated in ENCODE 3 significantly increases understanding of the human genome,” said Brenton Graveley, Ph.D., professor and chair of UConn Health’s Department of Genetics and Genome Sciences. “The project has added tremendous resolution and clarity to previous data types, such as DNA-binding proteins and chromatin markers, and new data types, such as long-range DNA interactions and protein-RNA interactions.”
As a new feature, ENCODE 3 researchers created a resource detailing different types of DNA regions and their corresponding candidate functions. A web-based tool called SCREEN allows users to view the data that support these interpretations.
The ENCODE Project began in 2003 and is a broad collaborative research effort involving groups from across the United States and internationally, comprising more than 500 scientists with extensive experience. It has benefited from, and built on, decades of research on gene regulation conducted by independent researchers around the world. ENCODE researchers have created a community resource to ensure that project data is accessible to any researcher for their studies. These open science efforts have resulted in more than 2,000 publications from non-ENCODE researchers who used data generated by the ENCODE Project.
“This shows that the encyclopedia is widely used, which is what we had always aimed for,” Dr. Feingold said. “Many of these publications are related to human diseases, which demonstrates the value of the resource for relating basic biological knowledge to health research.”
Reference: Encyclopedia of DNA Elements Project (ENCODE)