First complete, gapless sequence of a human genome

This helps answer basic biology questions about how chromosomes properly segregate and divide.

Since its initial release in 2000, the human reference genome has covered only the euchromatic fraction of the genome, leaving critical heterochromatic regions unfinished.

Analyses of the complete genome sequence will significantly add to our knowledge of chromosomes, including more accurate maps for five chromosome arms, which opens new lines of research. This work helps answer basic biology questions about how chromosomes properly segregate and divide.

The Telomere to Telomere (T2T) consortium recently published the first complete, gapless sequence of a human genome. According to scientists, having a complete, gap-free sequence of the roughly 3 billion bases (or “letters”) in our DNA is critical for understanding both the full spectrum of human genomic variation and the genetic contributions to certain diseases.

The T2T consortium includes leadership from scientists at the National Human Genome Research Institute (NHGRI), part of the National Institutes of Health; University of California, Santa Cruz; and the University of Washington, Seattle.

This complete genome sequence has been used to discover more than 2 million additional variants in the human genome. Studies based on it provide more accurate information about the genomic variants within 622 medically relevant genes.

Eric Green, M.D., Ph.D., director of NHGRI, said, “Generating a truly complete human genome sequence represents an incredible scientific achievement, providing the first comprehensive view of our DNA blueprint. This foundational information will strengthen the many ongoing efforts to understand all the functional nuances of the human genome, which will empower genetic studies of human disease.”

This new human genome sequence is especially useful for studies that aim to establish comprehensive views of human genomic variation, that is, how people’s DNA differs. Such insights are important for understanding the genetic contributions to certain diseases and for using genome sequencing as a routine part of clinical care in the future.

Many research groups have already started using a pre-release version of the complete human genome sequence for their research.

In 2000, the Human Genome Project produced a sequence covering about 92% of the genome. Since then, thousands of researchers have developed better laboratory tools, computational methods, and strategic approaches to decipher the remaining complex sequences.

That last 8% includes numerous genes and repetitive DNA and is comparable in size to an entire chromosome. Scientists generated the complete genome sequence using a cell line that has two identical copies of each chromosome, unlike most human cells, which carry two slightly different copies.
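As a rough check on the scale involved, the 8% figure can be turned into a base count. The short sketch below assumes a round genome size of 3 billion bases, matching the approximate figure quoted earlier in this article; the function name `unsequenced_bases` is purely illustrative.

```python
def unsequenced_bases(genome_size: int, missing_percent: int) -> int:
    """Return how many bases fall in the missing fraction (integer arithmetic)."""
    return genome_size * missing_percent // 100


# Assumption: a round figure of 3 billion bases ("roughly 3 billion" in the article).
GENOME_SIZE = 3_000_000_000
# The portion of the genome left unfinished by the Human Genome Project.
MISSING_PERCENT = 8

missing = unsequenced_bases(GENOME_SIZE, MISSING_PERCENT)
print(f"{missing:,} bases")  # prints "240,000,000 bases"
```

Under that assumption, the gap comes to about 240 million bases, which is on the scale of one of the larger human chromosomes.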

Scientists noted, “Most of the newly added DNA sequences were near the repetitive telomeres (long, trailing ends of each chromosome) and centromeres (dense middle sections of each chromosome).”

Evan Eichler, Ph.D., a researcher at the University of Washington School of Medicine and T2T consortium co-chair, said, “Ever since we had the first draft human genome sequence, determining the exact sequence of complex genomic regions has been challenging. I am thrilled that we got the job done. The complete blueprint will revolutionize how we think about human genomic variation, disease, and evolution.”

Over the past decade, two new DNA sequencing technologies emerged: the Oxford Nanopore DNA sequencing method and the PacBio HiFi DNA sequencing method.

Both techniques produce much longer sequence reads than earlier methods. The Oxford Nanopore DNA sequencing method can read up to 1 million DNA letters in a single read with modest accuracy, while the PacBio HiFi DNA sequencing method can read about 20,000 letters accurately.

By combining both techniques, scientists were able to generate the complete human genome sequence.

Karen Miga, Ph.D., a co-chair of the T2T consortium whose research group at the University of California, Santa Cruz is funded by NHGRI, said, “Using long-read methods, we have made breakthroughs in our understanding of the most difficult, repeat-rich parts of the human genome. This complete human genome sequence has already provided new insight into genome biology, and I look forward to the next decade of discoveries about these newly revealed regions.”

According to consortium co-chair Adam Phillippy, Ph.D., whose research group at NHGRI led the finishing effort, sequencing a person’s entire genome should get less expensive and more straightforward in the coming years.

Phillippy said, “In the future, when someone has their genome sequenced, we will be able to identify all of the variants in their DNA and use that information to better guide their healthcare. Truly finishing the human genome sequence was like putting on a new pair of glasses. Now that we can see everything, we are one step closer to understanding what it all means.”

Journal References:

  1. Sergey Nurk, Sergey Koren, Arang Rhie et al. The complete sequence of a human genome. Science, 2022; 376 (6588): 44 DOI: 10.1126/science.abj6987
  2. Ariel Gershman, Michael E. G. Sauria et al. Epigenetic patterns in a complete human genome. Science, 2022; 376 (6588) DOI: 10.1126/science.abj5089
  3. Mitchell R. Vollger, Xavi Guitart, Philip C. Dishuck et al. Segmental duplications and their variation in a complete human genome. Science, 2022; 376 (6588) DOI: 10.1126/science.abj6965
  4. Savannah J. Hoyt, Jessica M. Storer, Gabrielle A. Hartley et al. From telomere to telomere: The transcriptional and epigenetic state of human repeat elements. Science, 2022; 376 (6588) DOI: 10.1126/science.abk3112
  5. Sergey Aganezov, Stephanie M. Yan et al. A complete reference genome improves analysis of human genetic variation. Science, 2022; 376 (6588) DOI: 10.1126/science.abl3533
  6. Nicolas Altemose, Glennis A. Logsdon et al. Complete genomic and epigenetic maps of human centromeres. Science, 2022; 376 (6588) DOI: 10.1126/science.abl4178
