The Incredible Shrinking Genome
Mass Extinctions and Genomics
The geological signs of mass extinctions are very distinct: the photo shows the boundary of the Cretaceous-Tertiary (KT) extinction, which happened ~65 million years ago (Mya) and killed some 70% of the species on Earth, most famously the dinosaurs. This was the last mass extinction, and its effects on Earth's life are clear and dramatic. Mammals evolved and spread (radiated is the term used in evolutionary biology) to occupy many of the ecological niches the dinosaurs left vacant. The dinosaurs that remained are now birds (yes, superficial explanation, I know, but basically true), while one mammalian group, the primates, evolved an intelligence which ultimately led to smartphones and blogging. Plant life has changed as well, with many more flowering plant species, and fewer ferns and conifers. The marks of the KT extinction are therefore found everywhere: in fossils, in geological records and in extant life. Everywhere? Can we also find marks of the KT extinction in genomes? A recently published study claims so. The study appeared in the first issue of a new open access journal, Genome Biology and Evolution, and comes from a group at Indiana University, Bloomington. To understand what they discovered, some background information is needed.
Mobile DNA Gain and Loss
Organisms can acquire DNA from other organisms by inserting bits of foreign DNA, known as mobile DNA, into the genome. One way this is done is by viral infections. Some viruses integrate genomic material of their own, and sometimes of other host organisms, into the hosts they infect. If those viruses happen to also infect germ cells (sperm or ova), those insertions, or retrotransposons, are passed on to subsequent generations. It is quite easy to identify these viral insertions: they are flanked by characteristic DNA stretches called Long Terminal Repeats, or LTRs. During the infection and insertion process, LTRs serve as “insertion hooks” if you will, allowing the virus to insert its genome, along with whatever other genomic elements from former hosts happened to hitch a ride with it. Once the LTRs are in place in the host's genome, they serve no purpose. Over generations, the left-side LTR and the right-side LTR acquire mutations and drift away from their original sequences. For evolutionary biologists, paired LTRs thus serve as a molecular clock. Since the LTRs were identical at the time of insertion, the amount of dissimilarity between the paired LTRs can tell us how long ago they were inserted. We can also look at the total number of LTRs in a species and see how many are newly acquired and how many are older. So we get a picture of LTR acquisition in the lineage leading to that species over millions of years. Of course, after a very long time, the paired LTRs will have diverged too far from each other to be recognizable as a pair. But we can recognize paired LTRs up to a divergence of 50%, which is a pretty high level of divergence.
Upon insertion:    CCCAAAGGG-------------------CCCAAAGGG
Generations later: CCGAATGGG-------------------CCCAGAGAG
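For the curious, here is a rough sketch of how the paired-LTR clock could be turned into a number, using the toy sequences above. This is only an illustration, not the procedure used in the study: the function names are made up, the Jukes-Cantor correction is one common textbook choice among several, and the substitution rate is a placeholder I picked for the example.

```python
import math


def p_distance(left: str, right: str) -> float:
    """Fraction of aligned positions at which the two LTR copies differ."""
    if not left or len(left) != len(right):
        raise ValueError("sequences must be non-empty and aligned to the same length")
    mismatches = sum(a != b for a, b in zip(left, right))
    return mismatches / len(left)


def jukes_cantor(p: float) -> float:
    """Correct the raw difference for repeated hits at the same site.

    Jukes-Cantor is a swapped-in textbook model, not necessarily
    the correction used by the authors.
    """
    if p >= 0.75:
        raise ValueError("sequences are too diverged for this correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def insertion_age_years(left: str, right: str, rate_per_site_per_year: float) -> float:
    """Estimate time since insertion from the divergence of a paired LTR."""
    k = jukes_cantor(p_distance(left, right))
    # Both copies have been mutating independently since insertion,
    # so the divergence accumulates at twice the per-copy rate.
    return k / (2.0 * rate_per_site_per_year)


# The "generations later" pair from the illustration above.
left_ltr = "CCGAATGGG"
right_ltr = "CCCAGAGAG"

# ASSUMPTION: a placeholder substitution rate, chosen only for illustration.
ASSUMED_RATE = 2e-9  # substitutions per site per year (hypothetical)

print("raw difference:", p_distance(left_ltr, right_ltr))
print("estimated age (years):", insertion_age_years(left_ltr, right_ltr, ASSUMED_RATE))
```

A nine-base toy is far too short to date anything for real, of course; actual LTRs run to hundreds of bases, and the answer depends heavily on the rate you assume.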
Therefore, looking at a complete genome, we can see old LTRs, young LTRs, and many in-between. It is like looking at an old house in which every new owner decided to change something when they moved in: this one added a patio, that one carved out a window, and a third installed a porch swing. We can tell the window is fairly old because of its design, while the porch swing is new because the brand name did not exist five years ago.
LTRs are also lost, not just gained. There are many mechanisms by which LTRs can be removed from a lineage: they may be lost by “fading out” through an accumulation of mutations, or by being excised from the genome when a section of it is lost. An LTR may also have a deleterious effect, such as increasing the possibility of cancer, or decreasing the viability of the immune system. After all, an LTR is an uninvited guest in the genome, and we all know that uninvited guests are not the most desirable ones... Those LTRs will therefore be selected against in a Darwinian fashion, as they reduce their host's fitness. Using the house analogy, the owners may change the house to revert to its original design in some places, fill in the pool the previous owner dug, or simply sell the porch swing.
LTR loss rate vs. gain rate can be modeled statistically, and from that model the expected distribution of LTR ages in the genome can be inferred. Basically, the model predicts many LTRs that were gained fairly recently, and fewer and fewer older ones. This is because the longer an LTR has sat in the genome, the more likely it is to have been lost already: the chance that any single LTR survives drops off roughly exponentially with its age. Indeed, the authors looked at fly (Drosophila), plant (Arabidopsis) and fish (Fugu) genomes, and found that the distribution of LTRs fit the expected statistical model quite well.
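To make the "many young, few old" expectation concrete, here is a minimal sketch of the simplest version of such a model. It assumes a constant gain rate and a constant per-element loss rate, which may well be simpler than what the paper actually fits; the function name and all the numbers are invented for illustration.

```python
import math


def expected_ltr_counts(gain_per_myr, loss_per_myr, bin_edges_myr):
    """Expected number of surviving LTRs in each age bin.

    ASSUMPTION: LTRs arrive at a constant rate and each one is lost at a
    constant rate, so the density of survivors of age t is
    gain * exp(-loss * t). Each bin count is that density integrated
    over the bin.
    """
    counts = []
    for start, end in zip(bin_edges_myr, bin_edges_myr[1:]):
        survivors = (gain_per_myr / loss_per_myr) * (
            math.exp(-loss_per_myr * start) - math.exp(-loss_per_myr * end)
        )
        counts.append(survivors)
    return counts


# Hypothetical parameters: 100 insertions per million years,
# each with a 5% chance of loss per million years.
edges = [0, 10, 20, 30, 40]
for (start, end), n in zip(zip(edges, edges[1:]), expected_ltr_counts(100.0, 0.05, edges)):
    print(f"{start:>2}-{end:<2} Myr old: {n:7.1f} LTRs expected")
```

With equal-width bins, every bin holds fewer elements than the one before it, which is the shape the fly, plant and fish genomes are said to follow.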
Loss of LTRs in Mammals
But when they looked at mammalian genomes, including those of primates, things became strange. Instead of seeing a predominantly young LTR population, they saw a middle-aged one. Young LTRs were few, while there was an unexpectedly high number of old LTRs. Was it because mammalian genomes have been gaining fewer LTRs lately? Or was it because old LTRs were being lost at an excessive rate? A combination of both? Something else? And why is this only in mammals? And not even all mammals at that, because we do not see this anomaly in rodent lineages, for example.
When they looked closely at how long ago the LTR population peaked, they discovered another weirdness: almost without exception, the peak (and subsequent decline) occurred just after the KT extinction. Of course, “just after” in evolutionary terms can mean one to five million years, but the association with the KT boundary was too clear to ignore. We know why there is an iridium-rich line in the hills of south Texas, with many dinosaur fossils below it but none above it: the iridium-rich meteor that devastated Earth left an indelible mark. But a sudden peak and subsequent decline of this mobile DNA element in mammals was puzzling.
To check whether this phenomenon was due primarily to an increased rate of LTR loss or to a decreased rate of gain, the scientists looked at another type of mobile DNA element, one that, unlike LTRs, does not fade and is rarely excised. This nuclear-mitochondrial gene, or numt, stems from the slow migration of genes from mitochondrial genomes to nuclear genomes. Mitochondria are organelles found in all animal and plant life that have their own, much reduced genomes: most of the genes encoding the proteins that are active in mitochondria were lost from the mitochondria and gained in the cell nucleus. But unlike most LTRs, these genes are essential; therefore mitochondrial gene loss from the nuclear genome is rare. Examining the mitochondrial gene insertions serves as a control: if mammals show more old mitochondrial elements than young ones, compared to non-mammals, then the general acquisition rate of mobile DNA elements in the genome is declining, and the LTR pattern is not simply due to elements being removed faster. They found that with mammalian numts, the rate of migration from the mitochondria to the nucleus has indeed slowed down.
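Why would a slowdown in gain alone, with no change in loss, produce a middle-aged peak? A toy calculation shows the idea. Everything here is an assumption made for illustration: the function names, the single abrupt drop in gain rate, and every number plugged in. It is not the model from the paper.

```python
import math


def age_histogram(gain_old, gain_recent, change_myr, loss_per_myr, bin_width_myr, n_bins):
    """Approximate expected LTR count per age bin when the gain rate drops.

    ASSUMPTION: elements older than `change_myr` were inserted at
    `gain_old` per Myr, younger ones at `gain_recent` per Myr, and every
    element is lost at the same constant rate. Each bin is approximated
    by the survivor density at its midpoint times the bin width.
    """
    counts = []
    for i in range(n_bins):
        midpoint = (i + 0.5) * bin_width_myr
        gain = gain_recent if midpoint < change_myr else gain_old
        counts.append(gain * math.exp(-loss_per_myr * midpoint) * bin_width_myr)
    return counts


def peak_bin_start(counts, bin_width_myr):
    """Age (in Myr) at which the fullest bin begins."""
    return counts.index(max(counts)) * bin_width_myr


# Hypothetical scenario: the gain rate fell fivefold 60 Myr ago.
counts = age_histogram(
    gain_old=100.0, gain_recent=20.0, change_myr=60.0,
    loss_per_myr=0.02, bin_width_myr=10.0, n_bins=10,
)
print("fullest bin starts at", peak_bin_start(counts, 10.0), "Myr")
```

In this made-up scenario the youngest bins are sparse and the fullest bin sits right at the moment the gain rate dropped, even though the loss rate never changed. Set the two gain rates equal and the peak moves back to the youngest bin, as in the constant-rate picture.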
So it seems that mammalian genomes have been purging themselves of mobile DNA elements since just around the KT boundary, give or take a couple of million years. (Or rather: not taking in new elements.) Why is that? One hypothesis is selective advantage: mobile DNA elements can disrupt the genome, decreasing a host's fitness. But mammals existed for millions of years before the KT extinction. According to the fossil record they were small carnivores: dinosaurs took up the large (and XXL) herbivore niches, and the large carnivore niches. Once the dinosaurs were gone, mammals started to radiate and fill those niches, and a whole new level of competition arose. The selective advantage of not having a genome encumbered by potentially damaging mobile DNA elements probably became critical at this “be ye fruitful and multiply; bring forth abundantly in the earth, and multiply therein” stage. In effect, the genomes of mammals have been shrinking by shedding mobile DNA elements since just after the KT boundary. And according to the model presented in this study, this process is still ongoing: mammalian genomes are not at an equilibrium size. Unlike flies, mammals are still cleaning up.
Mammals Rule: Time to Clean House Genome?
Many questions are left unanswered: why would this genome cleaning take place as a result of a sudden reshuffling of the evolutionary deck, and the opportunity that was given to mammals? (If indeed the shrinking genome is a consequence of the KT extinction). Why would mobile DNA elements become a detriment then, more than before the KT extinction when mammals lived in the shade of the dinosaurs? If this genome house cleaning and shrinking is a result of rapid speciation that followed the KT extinction, wouldn't we expect to see it in other groups besides mammals? To the authors' credit, the paper is written very cautiously, and the authors are very careful to present all caveats and controls they could muster. This makes it something of a long read, but a fascinating one at that.
Rho, M., Zhou, M., Gao, X., Kim, S., Tang, H., & Lynch, M. (2009). Independent Mammalian Genome Contractions Following the KT Boundary. Genome Biology and Evolution, 2009, 2-12. DOI: 10.1093/gbe/evp007