Now that’s a f***ing big genome!
It isn’t junk DNA: God just commented out a lot of crappy code as he rolled out releases.
— An old bioinformaticians’ joke
(Hey, I never said it was a funny joke…)
Why are some genomes so big? I mean, seriously. Why would the marbled lungfish, with a genome weighing 132.83 picograms (pg), need an estimated 130,000,000,000 bp? It may have to do with the fact that these fish undergo metamorphosis, and with the large amount of developmental coding this could entail: some amphibians also have big genomes... then again, some don’t. So the reason for the big lungfish genome is still a mystery.
Then there is the genome of Paris japonica, a rare plant whose genome weighs 152.23 pg, making it the largest known so far, at a whopping estimated 150,000,000,000 bp. (Humans have a genome size of about 3,000,000,000 bp by comparison. An earlier version of this post gave 10,000,000,000; thanks for catching this error, Jason.) Large genomes do not seem to confer an advantage: in fact, plants with large genomes are at greater risk of extinction, are less adapted to living in polluted soils and are less able to tolerate extreme environmental conditions. Their cell cycle is, of course, longer, so they grow more slowly than plants with a small genome, and perhaps more errors are introduced during mitosis and meiosis as well. The nucleus and, consequently, the cell are also bigger, at least in plants. But in the conclusions to their study, published in the Botanical Journal of the Linnean Society, the authors write that “We are still profoundly ignorant about why some genomes […] are so big and how they operate and function.”
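For the curious, here is a rough sketch of how a genome "weight" in picograms turns into a base-pair count. It assumes the commonly used approximation that 1 pg of double-stranded DNA corresponds to roughly 978 million bp; that factor is not from the studies cited in this post, and the function and variable names are made up for illustration.

```python
# Assumption: 1 pg of double-stranded DNA ~ 0.978e9 bp (a widely used
# approximation; not taken from the post or the cited papers).
BP_PER_PG = 0.978e9


def pg_to_bp(picograms: float) -> float:
    """Convert a genome mass in picograms to an approximate base-pair count."""
    return picograms * BP_PER_PG


# Genome masses quoted in the post, in picograms.
genomes_pg = {
    "marbled lungfish": 132.83,
    "Paris japonica": 152.23,
}

for name, pg in genomes_pg.items():
    gbp = pg_to_bp(pg) / 1e9  # billions of base pairs
    print(f"{name}: {pg} pg is roughly {gbp:.1f} billion bp")
```

Under that assumption the lungfish comes out at about 129.9 billion bp and Paris japonica at about 148.9 billion bp, which is where the rounded 130 and 150 billion figures come from.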
Finally, there are viruses. Not exactly alive, but getting more so as we discover viruses with genome sizes that rival those of bacteria and archaea. I have posted before about the Mimivirus: a virus infecting amoebas which is so large that it was misclassified as a bacterium for a decade. At 1,181,404 nucleotides its genome may not seem like much compared with those of Paris japonica and the marbled lungfish, but it is 100 to 1,000 times larger than that of most known viruses. Mimivirus also has tRNA genes, which are used in assembling proteins, and a viral parasite of its own, named Sputnik (“little companion”). All of which makes you wonder whether the working definition of viruses as non-living entities still holds.
This month, Matthias Fischer and his colleagues described a large marine virus with a genome of 730,000 bp of double-stranded DNA. The virus infects a unicellular eukaryotic bacteria-eater named Cafeteria roenbergensis. (Why the odd name for the host? “We found a new species of ciliate during a marine field course in Rønberg and named it Cafeteria roenbergensis because of its voracious and indiscriminate appetite after many dinner discussions in the local cafeteria.” Reminds me of Ali G saying that he will name his son after the place where he was conceived, which would be “Langley Village”, with the full name being ‘The bogs in KFC in Langley Village’.) Hence, the virus infecting this creatively named critter is the Cafeteria roenbergensis virus, or CroV. The virus has some 544 predicted protein-coding genes, with at least 274 of them expressed during infection. Among the goodies coded by CroV are transcription-related genes, DNA repair genes, promoters, and tRNA. Fairly atypical for known viruses. Which, again, raises the question: how much cellular machinery does a virus need to code in its genome to cross the border between non-life and life? Is that even a criterion, or should we also consider the lack of physiology? Still, the majority of genes in CroV, as in Mimivirus and in most known viruses, have no similarity to those in “true” living things. Go figure.
Fischer, M., Allen, M., Wilson, W., & Suttle, C. (2010). Giant virus with a remarkable complement of genes infects marine zooplankton. Proceedings of the National Academy of Sciences. DOI: 10.1073/pnas.1007615107
Pellicer, J., Fay, M., & Leitch, I. (2010). The largest eukaryotic genome of them all? Botanical Journal of the Linnean Society, 164 (1), 10-15. DOI: 10.1111/j.1095-8339.2010.01072.x