The Genomic Ark: 10,000 vertebrate genomes

The first bioinformatics meeting I went to was in 1996 at the Nachsholim resort, north of Tel Aviv. I received a fellowship for the duration, and shared a room with the brilliant Golan Yona, then a grad student at the Hebrew University. I was doing biochemistry at the time and knew next to nothing about bioinformatics, except that it seemed like an interesting thing to get into if you liked biology and programming. The meeting was great: Samuel Karlin, Pavel Pevzner, Dannie Durand, Temple Smith and Eugene Myers were there. There was lots of downtime on the beach and in the pub by the beach. I learned an incredible amount in four days, and by the time the meeting ended, I was hooked. I wrapped up my grad school work in biochemistry as a Master’s degree, and joined Hanah Margalit’s lab for a PhD in bioinformatics.

Dan Graur gave a talk at that meeting on The One True Phylogenetic Tree of Mammals. Dan’s talks are fast and funny. His tactic for building audience interest is to make people think they are missing something great if they dare blink while he is talking; it works. Dan was complaining that all genomic efforts were invested in inconsequential organisms such as humans, mice and Drosophila, and that no one was interested in the Aardvark or Sloth genomes. He bemoaned the situation, as he needed the Aardvark, and a few thousand other mammalian species, to get the “One True Tree”. Later that day, over dinner, Pavel Pevzner suggested sequencing the X chromosome from all mammals using the then-new DNA chip technology. The X chromosome is a “microgenome”, with no transposable elements from other chromosomes, which makes it a perfect candidate to serve as a proxy for a genome.

In 1996, capillary sequencing was well established but still quite expensive, usable only by large institutions and companies. DNA chips, however, were thought likely to become the next cheap sequencing technology, and many expected them to enable mass genomics. Chips turned out to be useful in many other applications, but not in mass sequencing. We had to wait almost 10 years for pyrosequencing and other cheap mass sequencing technologies to hit the scene.

The cost of sequencing is still dropping exponentially, so fulfilling Dan’s wishes is very much in the making now. We are getting closer to having the genomes, not only of all mammals, but of all vertebrates. The Genome 10K initiative was officially launched in April 2009. Today, the paper describing the project was published in the Journal of Heredity. The goal is to collect and systematically sequence 10,000 vertebrate (not just mammalian) genomes. 10,000 is a nice round number, but looking at the paper, their actual aim is 16,203. Wow! That includes some recently extinct species for which genomic material may still be obtained, like the Tasmanian Wolf.

Entry of the Animals Into Noah's Ark / Jan Breughel the Elder

Note that they do not plan to begin sequencing immediately. The cost of sequencing is still too high, and they are waiting for it to decrease to $2500 per genome, which is one hundred times cheaper than it is today. But at the rate costs are dropping, they estimate that mass sequencing can be started in a few years. In the meantime, they are soliciting samples from the community.
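For a feel of what "a few years" means, here is a back-of-the-envelope sketch in Python. The $2500 target and the hundredfold reduction come from the figures above; the halving times are purely hypothetical values I picked for illustration, not numbers from the paper. If the cost halves at a fixed interval, a hundredfold drop takes log2(100), or about 6.6, halvings.

```python
import math

TARGET_COST_USD = 2500   # per-genome target quoted above
FOLD_REDUCTION = 100     # "one hundred times cheaper", quoted above

# Per-genome cost today implied by those two figures.
implied_cost_today = TARGET_COST_USD * FOLD_REDUCTION


def years_to_target(halving_time_years, fold=FOLD_REDUCTION):
    """Years until cost falls `fold`-fold, assuming it halves at a fixed interval."""
    halvings_needed = math.log2(fold)
    return halvings_needed * halving_time_years


if __name__ == "__main__":
    print(f"Implied cost today: ${implied_cost_today:,} per genome")
    # Hypothetical halving times, for illustration only.
    for halving_time in (0.5, 1.0, 1.5):
        years = years_to_target(halving_time)
        print(f"Halving every {halving_time} years: about {years:.1f} years to target")
```

The wait scales linearly with the assumed halving time, so the estimate is only as good as that guess.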

A lot of effort for the True Tree… but it’s not only for that. It is the next logical step to take after completing the genomes of a few select organisms. The library of life. To achieve an understanding of animal evolution on a level that in 1996 we could only joke about. More information can be found on their site. Here is the closing paragraph from the article:

As the printing of the first book by Johannes Gutenberg altered the course of human history, so did the human genome project forever change the course of the life sciences with the publication of the first full vertebrate genome sequence. When Gutenberg’s success was followed by the publication of other books, libraries naturally emerged to hold the fruits of this new technology for the benefit of all who sought to imbibe the vast knowledge made available by the new print medium. We must now follow the human genome project with a library of vertebrate genome sequences, a genomic ark for thriving and threatened species alike, and a permanent digital record of countless molecular triumphs and stumbles across some 600 million years of evolutionary episodes that forged the “endless forms most beautiful” that make up our living world.

(2009). Genome 10K: A Proposal to Obtain Whole-Genome Sequence for 10 000 Vertebrate Species. Journal of Heredity. DOI: 10.1093/jhered/esp086

Check Hayden, E. (2009). 10,000 genomes to come. Nature, 462 (7269), 21. DOI: 10.1038/462021a
