Two decades after the Human Genome Project delivered the first draft human genome sequence, scientists have published the first complete, gapless sequence of a human genome. Researchers say that a complete, gap-free sequence of the roughly 3 billion bases (or “letters”) in our DNA is crucial for understanding the full range of human genomic variation and the genetic contributions to particular diseases. The work was carried out by the Telomere to Telomere (T2T) consortium, led by researchers at the National Human Genome Research Institute (NHGRI), part of the National Institutes of Health; the University of California, Santa Cruz; and the University of Washington, Seattle. NHGRI was the study’s principal funder.
Analyses of the complete genome sequence will considerably improve our understanding of chromosomes, including more accurate maps of five chromosome arms, opening up new avenues of research. This helps answer fundamental biological questions about how chromosomes properly segregate and divide. The T2T consortium used the now-complete genome sequence as a reference to find more than 2 million additional variants in the human genome. These studies give more accurate information on the genomic variants found in 622 medically relevant genes.
“Generating a fully complete human genome sequence is a remarkable scientific achievement, delivering the first comprehensive picture of our DNA blueprint,” stated NHGRI director Eric Green, M.D., Ph.D. “This basic knowledge will boost the numerous continuing attempts to grasp all the functional subtleties of the human genome, empowering genetic investigations of human illness.”
The now-complete human genome sequence will be especially useful for research that aims to build a comprehensive view of human genomic variation, or how people’s DNA differs. Such insights are critical for understanding the genetic contributions to particular diseases and for the future use of genome sequencing as a routine part of clinical care. Many research groups have already begun using a pre-release version of the complete human genome sequence in their studies.
The complete sequence builds on the work of the Human Genome Project, which mapped around 92 percent of the genome, and on the research that followed. To decipher the complex sequence, thousands of researchers developed better laboratory tools, computational methods, and strategic approaches. Six papers covering the complete sequence are published in Science, along with companion studies in several other journals.
The remaining 8 percent contains numerous genes and repetitive DNA and is comparable in size to an entire chromosome. The complete genome sequence was generated using a special cell line that has two identical copies of each chromosome, unlike most human cells, which carry two slightly different copies. Most of the newly added DNA sequences lie near the repetitive telomeres (the long, trailing ends of each chromosome) and centromeres (the dense middle sections of each chromosome).
“Determining the precise sequence of complicated genomic areas has been difficult ever since we received the first draft human genome sequence,” said Evan Eichler, Ph.D., a researcher at the University of Washington School of Medicine and T2T consortium co-chair. “I am happy that we completed the task. The whole blueprint will change the way we think about human genetic diversity, illness, and evolution.”
The cost of sequencing a human genome with “short-read” technologies, which deliver several hundred bases of DNA sequence at a time, is now just a few hundred dollars, having dropped dramatically since the end of the Human Genome Project. However, using these short-read methods alone still leaves some gaps in assembled genome sequences. The steep drop in DNA sequencing costs has been accompanied by growing investment in new DNA sequencing technologies that generate longer sequence reads without sacrificing accuracy.
Over the past decade, two new DNA sequencing technologies emerged that produce substantially longer sequence reads. The Oxford Nanopore DNA sequencing method can read up to 1 million DNA letters in a single read with modest accuracy, while the PacBio HiFi DNA sequencing method can read roughly 20,000 letters with near-perfect accuracy. To obtain the complete human genome sequence, researchers in the T2T consortium used both DNA sequencing technologies.
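Why read length matters in repeat-rich regions can be illustrated with a small Python sketch. It is a toy example only, not the T2T consortium’s assembly method: the miniature “genome,” the flank and repeat sequences, and the helper name `count_placements` are all hypothetical choices made for illustration. The sketch counts how many places a read could fit by exact matching.

```python
def count_placements(genome: str, read: str) -> int:
    """Count every position (overlaps allowed) where `read` matches `genome` exactly."""
    count = 0
    pos = genome.find(read)
    while pos != -1:
        count += 1
        pos = genome.find(read, pos + 1)
    return count


# Hypothetical toy "chromosome": two distinct flanks around a tandem repeat.
LEFT_FLANK = "C" * 20
RIGHT_FLANK = "T" * 20
REPEAT = "GATTACA" * 40  # 280-base repetitive region
GENOME = LEFT_FLANK + REPEAT + RIGHT_FLANK

repeat_start = len(LEFT_FLANK)
repeat_end = repeat_start + len(REPEAT)

# A "short read" of 21 bases taken entirely from inside the repeat.
short_read = GENOME[repeat_start + 7 : repeat_start + 28]

# A "long read" covering the whole repeat plus 10 bases of each flank.
long_read = GENOME[repeat_start - 10 : repeat_end + 10]

short_hits = count_placements(GENOME, short_read)
long_hits = count_placements(GENOME, long_read)

print(f"short read ({len(short_read)} bases) fits at {short_hits} positions")
print(f"long read ({len(long_read)} bases) fits at {long_hits} position(s)")
```

In this toy setup the short read fits at many positions inside the repeat, so it cannot say how long the repeat is or where exactly it came from, whereas the long read reaches into both flanks and fits in only one place. Real reads also contain sequencing errors, which this sketch ignores.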
“We have achieved advancements in our knowledge of the most challenging, repeat-rich portions of the human genome using long-read approaches,” said Karen Miga, Ph.D., a co-chair of the T2T consortium whose research group at the University of California, Santa Cruz is funded by NHGRI. “This full human genome sequence has already yielded new insights into genome biology, and I eagerly await the next decade of findings regarding these newly disclosed areas.”
According to consortium co-chair Adam Phillippy, Ph.D., whose research group at NHGRI led the finishing effort, sequencing a person’s complete genome should become cheaper and easier in the coming years.
“In the future, when someone’s genome is sequenced, we’ll be able to detect all of the variations in their DNA and utilize that knowledge to better direct their treatment,” Phillippy said. “Completing the human genome sequence was like putting on a new pair of spectacles. We’re one step closer to comprehending what it all means now that we can see everything clearly.”
Many early-career researchers and trainees played critical roles, including those from Johns Hopkins University in Baltimore; the University of Connecticut in Storrs; the University of California, Davis; the Howard Hughes Medical Institute in Chevy Chase, Maryland; and the National Institute of Standards and Technology in Gaithersburg, Maryland. The achievement is reported in a package of six papers in today’s edition of Science, as well as in companion studies in several other journals.