Differences in the DNA sequence of individual genomes contribute to differences in many traits, such as appearance, physiology, and the risk for common diseases. An important group of these DNA variants influences how individual genes across the genome are turned on or off.
DNA variants that alter gene expression are an essential source of genetic variation for many traits, including diseases in humans.
These variants differ in their molecular mechanism, the types of genes they affect, and their distribution in natural populations. Approximately 20 percent of Neanderthal DNA survives in modern humans. One study has suggested that a group of genes that reduces the risk of developing severe COVID-19 by around 20% is inherited from Neanderthals.
Identifying DNA variants is essential for understanding a person's genetic risk for various health conditions.
Other algorithm-based tools have been developed to detect genetic variants, but they provide incomplete information, especially for VNTRs.
Scientists at the USC Dornsife College of Letters, Arts and Sciences have come up with a better way to identify elusive DNA variants responsible for genetic changes affecting cell functions and diseases. Using computational biology tools, they studied variable-number tandem repeats (VNTRs) in DNA. VNTRs are stretches of DNA made of a short pattern of nucleotides repeated over and over, like the pattern on a plaid shirt.
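To make the idea of a tandem repeat concrete, here is a minimal Python sketch that counts how many back-to-back copies of a short motif appear in a DNA string. The function name, the motif, and the two example alleles are invented for illustration and are not taken from the study.

```python
def longest_tandem_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of `motif` in `sequence`."""
    if not motif:
        raise ValueError("motif must be a non-empty string")
    best = 0
    for start in range(len(sequence)):
        copies = 0
        pos = start
        # Keep stepping forward one motif length while the motif repeats.
        while sequence.startswith(motif, pos):
            copies += 1
            pos += len(motif)
        best = max(best, copies)
    return best


# Hypothetical alleles: same flanking DNA, different numbers of motif copies.
flank_left, flank_right = "GGC", "TTA"
allele_a = flank_left + "ACGT" * 3 + flank_right
allele_b = flank_left + "ACGT" * 7 + flank_right

print(longest_tandem_run(allele_a, "ACGT"))  # 3
print(longest_tandem_run(allele_b, "ACGT"))  # 7
```

The two alleles share the same motif and flanking sequence and differ only in copy number, which is the kind of variation a VNTR represents. Real VNTR motifs are often imperfect copies of one another, so exact string matching like this is only a simplification.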
This repetitive DNA governs how some genes are encoded and how proteins are produced in a cell, and it accounts for most structural variation.
Mark Chaisson, assistant professor of quantitative and computational biology and corresponding author of the study, said, “This type of repetitive DNA has been called ‘dark matter of the human genome’ because it has been difficult to sequence and analyze how it varies. We showed that variation in the dark matter could have a substantial effect on cellular processes, so future studies may use this approach to understand the genetic basis of disease and ways to improve our health.”
The new method can detect VNTR variants that differ among human populations and that affect gene expression, which helps researchers discover links between VNTR variation and traits or disease.
The tool is built on a repeat-pangenome graph, a data structure that encodes both the population diversity and the repeat structure of VNTR locations on a chromosome, which allows it to identify more gene sequences with better accuracy.
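The following toy Python sketch illustrates one way such a graph could work; it is not the authors' implementation. It assumes the graph is a de Bruijn-style k-mer graph built from several known versions (haplotypes) of one VNTR locus, and that sequencing reads are scored by counting how many of their k-mers fall in that graph. All function names, sequences, and the choice of k are hypothetical.

```python
from collections import defaultdict


def kmers(seq: str, k: int):
    """Yield every substring of length k in seq, left to right."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def build_repeat_graph(haplotypes, k):
    """Merge all haplotypes into one graph: nodes are k-mers, and an edge
    links two k-mers that are adjacent in at least one haplotype."""
    edges = defaultdict(set)
    for hap in haplotypes:
        prev = None
        for kmer in kmers(hap, k):
            edges[kmer]  # touching the key registers the node
            if prev is not None:
                edges[prev].add(kmer)
            prev = kmer
    return dict(edges)


def count_graph_kmers(reads, graph, k):
    """Count read k-mers that are nodes of the graph (a rough length proxy)."""
    return sum(1 for read in reads for kmer in kmers(read, k) if kmer in graph)


# Hypothetical locus: two known haplotypes with 2 and 4 copies of the motif.
motif, k = "ACGGT", 4
haplotypes = ["TT" + motif * 2 + "GG", "TT" + motif * 4 + "GG"]
graph = build_repeat_graph(haplotypes, k)

# The repeat collapses into a cycle, so extra copies add no new nodes.
print(len(graph))  # 8

# Hypothetical reads from two samples with shorter and longer repeats.
short_sample = [motif * 3]
long_sample = [motif * 6]
print(count_graph_kmers(short_sample, graph, k))  # 12
print(count_graph_kmers(long_sample, graph, k))   # 27
```

In this sketch the graph stays the same size however many copies a haplotype carries, while the number of read k-mers landing on it grows with repeat length. That is the intuition behind using a graph of the locus, rather than a single reference sequence, to compare VNTR variation across samples.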
- Lu, T.-Y., The Human Genome Structural Variation Consortium & Chaisson, M. J. P. Profiling variable-number tandem repeat variation across populations using repeat-pangenome graphs. Nat Commun 12, 4250 (2021). DOI: 10.1038/s41467-021-24378-0