Two decades after the Human Genome Project and Celera Genomics first sequenced the human genome, researchers have finally filled in the missing 8% of the genome, thanks to advances in HiFi sequencing technology. This milestone is the work of the Telomere-to-Telomere (T2T) Consortium, a global collaboration of over 30 institutions, which released its findings as a bioRxiv preprint on May 27, 2021.
The Breakthrough: T2T-CHM13
The newly completed genome, known as T2T-CHM13, adds approximately 200 million base pairs of novel sequence relative to the reference genome released in 2013 (GRCh38). This includes:
- 2,226 paralogous gene copies, 115 of which are predicted to be protein-coding.
- Complete sequences of centromeric satellite arrays.
- The short arms of all five acrocentric chromosomes.
This achievement unlocks previously inaccessible regions of the genome, enabling new studies on variation and function in these areas.
HiFi Sequencing: The Technology Behind the Discovery
The team used HiFi sequencing technology from PacBio (Pacific Biosciences, California) and a cell line derived from a complete hydatidiform mole. This rare tissue forms when a sperm fertilizes an egg that lacks a nucleus. The resulting cells carry only paternal DNA, and because both copies of each chromosome are essentially identical, the genome behaves as a single set of chromosomes. This eliminates the need to distinguish between maternal and paternal chromosomes, simplifying genome assembly.
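A rough way to picture this simplification: at any position, reads from a sample with a single haplotype should agree on one base apart from sequencing errors, while a diploid sample can legitimately show two. The following is a toy sketch only; `supported_alleles`, the 20% threshold, and the read columns are assumptions invented for illustration and are not part of the T2T pipeline.

```python
from collections import Counter


def supported_alleles(bases, min_fraction=0.2):
    """Return the bases seen in at least min_fraction of reads at one position."""
    if not bases:
        return []
    counts = Counter(bases)
    total = len(bases)
    return sorted(base for base, n in counts.items() if n / total >= min_fraction)


# Hypothetical read columns (one character per read covering the position).
haploid_column = "AAAAAAAAAT"   # nine reads agree, one likely sequencing error
diploid_column = "AAAAAGGGGG"   # two haplotypes, each seen in half of the reads

print(supported_alleles(haploid_column))
print(supported_alleles(diploid_column))
```

With a single haplotype, a minority base can be treated as noise; with two haplotypes, an assembler must also decide which variant belongs to which parental copy.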
HiFi sequencing stands out for its high accuracy and ability to produce long reads, which are essential for resolving complex genomic regions. This method was pivotal in producing a gap-free, haploid human genome assembly.
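To see why read length matters, consider a repeat array: a read can place the repeat unambiguously only when it covers the whole array plus some unique sequence on both sides. The sketch below is illustrative; the function name, the coordinates, and the 50 bp flank threshold are assumptions made for the example, not values from the study.

```python
def spans_repeat(read_start, read_end, repeat_start, repeat_end, min_flank=50):
    """Return True if a read covers the whole repeat plus min_flank bases on each side.

    Coordinates are 0-based, half-open intervals on a hypothetical reference.
    """
    left_flank = repeat_start - read_start
    right_flank = read_end - repeat_end
    return left_flank >= min_flank and right_flank >= min_flank


# Hypothetical 5,000 bp repeat located at positions 10,000-15,000.
repeat = (10_000, 15_000)

short_read = (12_000, 12_150)   # 150 bp read that falls entirely inside the repeat
long_read = (8_000, 23_000)     # 15,000 bp read covering the repeat and both flanks

short_ok = spans_repeat(*short_read, *repeat)
long_ok = spans_repeat(*long_read, *repeat)
print(short_ok, long_ok)
```

The short read could have come from anywhere inside the repeat, whereas the long read is anchored by unique sequence at both ends.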
Implications of the T2T-CHM13 Genome
- Enhanced Reference Genome:
- The T2T-CHM13 genome removes the gaps and errors present in the previous GRCh38 reference, which had left 8% of the genome inaccessible to sequence-based studies for over 20 years.
- This includes all centromeric regions and the short arms of the five acrocentric chromosomes, areas crucial for understanding structural variation and genome function.
- Advancing Genomic Research:
- With the newly completed genome, researchers can conduct more comprehensive studies on previously unexplored genomic regions.
- The data lays the groundwork for improved functional studies and genetic analyses.
- Future Challenges:
- Roughly 0.3% of the genome may still contain errors, concentrated in a few chromosomal regions where quality control is difficult.
- The T2T-CHM13 genome lacks the Y chromosome, which is critical for understanding male-specific development and genetic disorders.
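Gaps in a reference assembly are conventionally written as runs of `N` in the sequence file, so "gap-free" can be checked directly. The following minimal sketch scans a sequence string for such runs; the function names and the toy sequences are invented for illustration and are not real genome data.

```python
import re


def find_gaps(sequence):
    """Return (start, end) half-open intervals for every run of N/n in a sequence."""
    return [(m.start(), m.end()) for m in re.finditer(r"[Nn]+", sequence)]


def gap_fraction(sequence):
    """Return the fraction of positions that are N, or 0.0 for an empty sequence."""
    if not sequence:
        return 0.0
    gapped = sum(end - start for start, end in find_gaps(sequence))
    return gapped / len(sequence)


# Toy sequences, invented for illustration.
with_gaps = "ACGTNNNNACGTACNNGT"
gap_free = "ACGTACGTACGTACGTAC"

print(find_gaps(with_gaps), round(gap_fraction(with_gaps), 3))
print(find_gaps(gap_free), gap_fraction(gap_free))
```

On a real assembly the same scan would be applied per chromosome to a sequence read from a FASTA file, and a gap-free chromosome would return an empty list.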
Key Insights from the Study
- Innovative Approach: By using a hydatidiform mole-derived cell line, researchers simplified genome assembly and were able to resolve complex genomic regions.
- Limitations: Quality control remains a challenge in certain areas, and the absence of the Y chromosome limits the genome's applicability to male-specific studies.
- Future Prospects: Researchers aim to extend their work to include the Y chromosome and refine error-prone areas.
Conclusion
Completing the human genome is a monumental step forward for genomics, made possible by HiFi sequencing and the collaborative efforts of the T2T Consortium. While challenges remain, this achievement paves the way for more accurate reference genomes and deeper insights into the complexities of human biology. The T2T-CHM13 genome sets a new standard for genomic research, marking the beginning of an exciting new era in the field.
FAQ Section
1. What is HiFi sequencing?
HiFi sequencing is an advanced sequencing technology that produces highly accurate long DNA reads, making it ideal for resolving complex genomic regions.
2. What is the T2T-CHM13 genome?
T2T-CHM13 is the first gap-free, haploid human genome assembly, incorporating approximately 200 million previously missing base pairs.
3. Why was a hydatidiform mole used in the study?
A hydatidiform mole contains a single set of chromosomes, simplifying genome assembly by eliminating the need to differentiate between maternal and paternal DNA.
4. What are the limitations of the T2T-CHM13 genome?
The genome does not include the Y chromosome, and certain regions still present challenges in quality control and error resolution.
5. What are the implications of this achievement?
The T2T-CHM13 genome enables comprehensive studies on previously inaccessible regions, advancing our understanding of genetic variation and function.
Publication reference: Nurk, S. et al. Preprint at bioRxiv https://doi.org/10.1101/2021.05.26.445798 (2021).