Understanding extensive genomic variation and its contribution to certain diseases requires a complete, gap-free sequence. The Telomere-to-Telomere (T2T) Consortium presents the entire 3.055-billion-base-pair sequence of a human genome.
The consortium includes collaborations from scientists at the National Human Genome Research Institute (NHGRI), part of the National Institutes of Health; University of California, Santa Cruz; and the University of Washington, Seattle.
Examining the complete genome sequence will significantly enhance our understanding of chromosomes, including more accurate maps for five chromosome arms. This could reveal how chromosomes separate and divide.
Using the now-complete genome sequence as a reference, the T2T consortium has discovered more than 2 million additional variants in the human genome.
Eric Green, M.D., Ph.D., director of NHGRI, said, “Generating a truly complete human genome sequence represents an incredible scientific achievement, providing the first comprehensive view of our DNA blueprint. This foundational information will strengthen the many ongoing efforts to understand all the functional nuances of the human genome, which will empower genetic studies of human disease.”
This work builds on the Human Genome Project. Since then, scientists have developed better laboratory tools, computational methods, and strategic approaches for analyzing a complete genome sequence.
Six research papers describing the completed sequence appear in Science and other journals.
The final 8% of the sequence includes numerous genes and repetitive DNA and is comparable in size to an entire chromosome. Researchers generated the complete genome sequence using a human cell line with only one copy of each chromosome, unlike most human cells, which carry two copies of each chromosome.
Scientists noted, “Most of the newly added DNA sequences were near the repetitive telomeres (long, trailing ends of each chromosome) and centromeres (dense middle sections of each chromosome).”
Evan Eichler, Ph.D., a researcher at the University of Washington and T2T consortium co-chair, said, “Ever since we had the first draft human genome sequence, determining the exact sequence of complex genomic regions has been challenging. I am thrilled that we got the job done. The complete blueprint will revolutionize how we think about human genomic variation, disease, and evolution.”
To generate the complete human genome sequence, scientists used both the Oxford Nanopore and PacBio HiFi DNA sequencing methods. The Oxford Nanopore method can read up to 1 million DNA letters in a single read with modest accuracy, whereas the PacBio HiFi method can read about 20,000 letters with nearly perfect accuracy.
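To give a rough sense of what those read lengths mean in practice, the short Python sketch below estimates how many reads of each length would be needed to span the 3.055 billion base pair genome once. It is a back-of-the-envelope illustration only: the function name `reads_for_coverage` and the `coverage` parameter are hypothetical, the Nanopore figure is an upper bound rather than a typical read length, and real sequencing projects use many-fold coverage with reads of varying length.

```python
import math

# Genome size reported in the article, in base pairs.
GENOME_BP = 3_055_000_000

# Read lengths quoted in the article (illustrative only:
# "up to" 1 million for Nanopore, "about" 20,000 for HiFi).
READ_LENGTHS_BP = {
    "Oxford Nanopore (up to)": 1_000_000,
    "PacBio HiFi (about)": 20_000,
}


def reads_for_coverage(genome_bp, read_bp, coverage=1.0):
    """Minimum number of fixed-length reads needed for a given average coverage."""
    return math.ceil(genome_bp * coverage / read_bp)


for name, length in READ_LENGTHS_BP.items():
    n_reads = reads_for_coverage(GENOME_BP, length)
    print(f"{name}: {n_reads:,} reads to span the genome once")
```

The roughly fifty-fold difference in read count reflects the trade-off described above: longer reads bridge repetitive regions more easily, while shorter, more accurate reads pin down the exact letters.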
Karen Miga, Ph.D., a co-chair of the T2T consortium whose research group at the University of California, Santa Cruz is funded by NHGRI, said, “Using long-read methods, we have made breakthroughs in our understanding of the most difficult, repeat-rich parts of the human genome. This complete human genome sequence has already provided new insight into genome biology, and I look forward to the next decade of discoveries about these newly revealed regions.”
Consortium co-chair Adam Phillippy, Ph.D., whose research group at NHGRI led the finishing effort, said, “Sequencing a person’s entire genome should get less expensive and more straightforward in the coming years. In the future, when someone has their genome sequenced, we will be able to identify all of the variants in their DNA and use that information to guide their healthcare better. Truly finishing the human genome sequence was like putting on a new pair of glasses. Now that we can see everything, we are one step closer to understanding what it means.”
- Nurk et al. The complete sequence of a human genome. Science 376. DOI: 10.1126/science.abj6987 (2022)
- Gershman et al. Epigenetic patterns in a complete human genome. Science 376. DOI: 10.1126/science.abj5089 (2022)
- Vollger et al. Segmental duplications and their variation in a complete human genome. Science 376. DOI: 10.1126/science.abj6965 (2022)
- Hoyt et al. From telomere to telomere: The transcriptional and epigenetic state of human repeat elements. Science 376. DOI: 10.1126/science.abk3112 (2022)
- Aganezov et al. A complete reference genome improves analysis of human genetic variation. Science 376. DOI: 10.1126/science.abl3533 (2022)
- Altemose et al. Complete genomic and epigenetic maps of human centromeres. Science 376. DOI: 10.1126/science.abl4178 (2022)