DNA sequencing technology is helping scientists unravel questions that humans have been asking about animals for centuries. By mapping out animal genomes, we now have a better idea of how the giraffe got its huge neck and why snakes are so long. Genome sequencing allows us to compare and contrast the DNA of different animals and work out how they evolved in their own unique ways.
But in some cases we’re faced with a mystery. Some animal genomes seem to be missing certain genes, ones that appear in other similar species and must be present to keep the animals alive. These apparently missing genes have been dubbed “dark DNA”. And its existence could change the way we think about evolution.
My colleagues and I first encountered this phenomenon when sequencing the genome of the sand rat (Psammomys obesus), a species of gerbil that lives in deserts. In particular we wanted to study the gerbil’s genes related to the production of insulin, to understand why this animal is particularly susceptible to type 2 diabetes.
But when we looked for a gene called Pdx1 that controls the secretion of insulin, we found it was missing, as were 87 other genes surrounding it. Some of these missing genes, including Pdx1, are essential and without them an animal cannot survive. So where are they?
The first clue was that, in several of the sand rat’s body tissues, we found the chemical products that cells would make by following the instructions in the ‘missing’ genes. This would only be possible if the genes were present somewhere in the genome, indicating that they weren’t really missing but just hidden.
The DNA sequences of these genes are very rich in G and C (guanine and cytosine), two of the four ‘base’ molecules that make up DNA.
We know GC-rich sequences cause problems for certain DNA-sequencing technologies. This makes it more likely that the genes we were looking for were hard to detect rather than missing. For this reason, we call the hidden sequence “dark DNA” as a reference to dark matter, the stuff that we think makes up about 25 percent of the universe but that we can’t detect directly.
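To make “GC-rich” concrete, here is a minimal Python sketch of how the proportion of G and C bases in a stretch of DNA can be measured, and how a sliding window can flag unusually GC-rich regions. The function names `gc_content` and `gc_windows`, the window size and the toy sequence are illustrative assumptions, not the method used in the sand rat study.

```python
def gc_content(sequence: str) -> float:
    """Return the fraction of A/C/G/T bases in `sequence` that are G or C."""
    seq = sequence.upper()
    # Count only unambiguous bases so that N or gap characters do not skew the result.
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / counted


def gc_windows(sequence: str, window: int, step: int) -> list[tuple[int, float]]:
    """Return (start position, GC fraction) for each full window along the sequence."""
    return [
        (start, gc_content(sequence[start:start + window]))
        for start in range(0, len(sequence) - window + 1, step)
    ]


# Toy sequence (made up for illustration): an AT-only stretch followed by a GC-only stretch.
example = "ATATATAT" + "GCGCGGCC"
print(gc_windows(example, window=8, step=8))  # prints [(0, 0.0), (8, 1.0)]
```

A region whose windows sit well above the genome-wide average would count as GC-rich in this sense. Real analyses use much larger windows and real sequence data.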
By studying the sand rat genome further, we found that one part of it in particular had many more mutations than are found in other rodent genomes.