John Smith’s genome sequenced

Springville University’s Genome Center, in collaboration with Prof. I. M. A. Bigschotte from IvyLeague University, has announced that the genome of Mr. John Smith from Centertown, USA, has been sequenced and is now available online. Dr. James Williams, director of the Center, said: “We were running out of things to sequence, but I still had […]

Why it’s hard to assemble repetitive DNA regions

So here are EssOh and OhOne assembling a rather frustrating puzzle containing cows. The same five or six cow “characters” are repeated, which is a perfect way to illustrate low-complexity DNA sequences and why they are hard to assemble, especially when the pieces are small, like those you get from some second-generation sequencers.
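The cow puzzle can be mimicked with a toy sequence. The sketch below is only an illustration, not how any real assembler works: the sequence, the `COW` repeat and the function name `branching_prefixes` are all made up. It counts the places where a piece of length k could be extended in more than one way.

```python
from collections import defaultdict


def branching_prefixes(genome: str, k: int) -> int:
    """Count (k-1)-mers that are followed by more than one distinct base.

    Each such prefix is a spot where someone who only sees pieces of
    length k cannot tell which way to extend the sequence.
    """
    successors = defaultdict(set)
    for i in range(len(genome) - k + 1):
        piece = genome[i:i + k]
        successors[piece[:-1]].add(piece[-1])
    return sum(1 for bases in successors.values() if len(bases) > 1)


# Toy data (invented): the same "cow" appears twice with different neighbours.
COW = "ATTACAGGATCA"
GENOME = "CCTG" + COW + "TTTAA" + COW + "GGCC"

if __name__ == "__main__":
    for k in (4, len(COW) + 1, len(GENOME)):
        print(k, branching_prefixes(GENOME, k))
```

As long as a piece is no longer than the repeated cow plus one base, at least one ambiguous extension remains. A single piece spanning the whole toy sequence leaves none.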

The Assemblathon

The Genome Center at the University of California, Davis, and researchers at UC Santa Cruz are organizing a genome assembly competition, which they call The Assemblathon. They have released two simulated genomes for competing groups to assemble as best they can. Assemblies are due February 6th, 2011. So there is still time, if you would like […]

Strawberries, Chocolate and Open Access Genomics

Nature Genetics seems to have taken a page from the Food Network Magazine by timing two publications to coincide with the annual obsession with festive foods, one shared by many, NG’s readership included. I am talking about the genomes of the strawberry and cocoa plants. Both are important crops; both are components of luxurious eating. Both papers are comprehensive […]

Personalized Medicine Poetry

The personal genomics company 23&me is hosting a poetry contest. The winner receives a free pass to the Personalized World Medicine Conference. Poems should include a bunch of keywords having to do with 23&me, personalized genomics and all that jazz. I’m no poet (and don’t you know it), so here is my Haiku non-entry: My genome was seq- […]

Making genomes less CAGI

cag·ey /ˈkājē/ (adjective): reluctant to give information owing to caution or suspicion. CAGI /ˈkājē/ (acronym): Critical Assessment of Genome Interpretation. For details, keep reading. The ability to sequence one’s genome adds a new dimension to the ancient maxim “know thyself”. What could be more revealing of one’s self than one’s own blueprint, explaining existing […]

Now that’s a f***ing big genome!

It isn’t junk DNA: God just commented out a lot of crappy code as he rolled out releases. — An old bioinformaticians’ joke (Hey, I never said it was a funny joke…) Why are some genomes so big? I mean, seriously. Why would the marbled lungfish with a genome weighing 132.83 picograms (pg) need an […]
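For a sense of scale, a genome mass in picograms can be turned into a base-pair count. This is a rough sketch, assuming the commonly used conversion of about 978 million base pairs per picogram of DNA; the constant and the function name are not from the post.

```python
# Assumption: 1 pg of double-stranded DNA is roughly 0.978e9 base pairs.
BP_PER_PG = 0.978e9


def picograms_to_gigabases(pg: float) -> float:
    """Convert a genome mass in picograms to billions of base pairs (Gbp)."""
    return pg * BP_PER_PG / 1e9


# Genome mass quoted for the marbled lungfish in the post.
lungfish_gbp = picograms_to_gigabases(132.83)

if __name__ == "__main__":
    print(f"{lungfish_gbp:.1f} Gbp")
```

Under that assumption, 132.83 pg works out to roughly 130 billion base pairs.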

Lake Arrowhead Microbial Genomics Conference

Quick post: I am at the Lake Arrowhead Microbial Genomics Conference. I’m a bad microblogger, but thankfully Jonathan Eisen and Ruchira Datta are doing a great job of covering this conference live. There is a friendfeed room. The Twitter hashtag is #LAMG10. The science, people, food and location are all great. My student, David Ream, is presenting […]

Protein Function: how do we know that we know what we know?

The trouble with genomic sequencing is that it is too cheap. Anyone who has a bit of extra cash lying around can scrape the bugs off their windshield, sequence them, and write a paper. Seriously? Yes, seriously now: as we sequence more and more genomes, our annotation tools cannot keep up with them. It’s […]

Celebromics? HeavyMetalomics? Advertomics? Anniversomics!

René Goscinny would probably have done a better job of naming the new trend among personal genomics (genomix?) companies of sequencing celebrities’ genomes. Heck, we might have even done Obelix’s and Asterix’s genomes to find out if Obelix can drink the magic potion without Getafix’s (Panoramix’s) admonishments that it might do him harm, or to […]

Computational Bridge to Experiments

A bit of background information: this is a meeting I am really happy to be part of, and I am even more honored to be a co-organizer. One of my main scientific interests is predicting the function of genes and proteins of unknown function. Some context: we have sequenced more than 1000 genomes […]

New poll: would you make your genome public?

Would you have your genome sequenced for free? Conditions: you must license it for all uses under a liberal, CC-no-attribution-like license that allows commercial use as well. Also, your genome will be made public along with many personal data such as age, height, sex, weight, ethnicity, personal status (we want to find the “money making […]

Ancient Greenlander’s DNA reveals ugly mullet

Seriously, this is what I first thought when I saw the cover of this week’s Nature, and the associated drawings in the press. The dude’s haircut seems like it was bad even in the ’80s… 2080 BCE that is, which is when his body is dated. Approximately. A large group of researchers were involved in […]

Filling in the evolutionary blanks, genome by genome

After hearing Jonathan Eisen and Nikos Kyrpides talk about GEBA in various meetings, it is great to see the paper finally come out, and under a CC license too. Good move for everyone. GEBA is the Genomic Encyclopedia of Bacteria and Archaea. The idea is simple: we have >1000 prokaryotic genomes in GenBank as of […]

Short Bioinformatics Hacks: Glimmer Splitter

Glimmer is a program that predicts ORFs in bacterial and archaeal genomes. The input is the assembled genome FASTA file; the output is a set of files holding the predictions at different stages. The final output file is the .predict file, which looks something like this: >NODE_1_length_38001_cov_935.551880 orf00001 481      362  -2     1.45 orf00002      451      567  +1     0.59 […]
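The excerpt stops before showing the actual hack, so the following is only one plausible reading of "splitting" a .predict file: group the ORF lines under the contig header they follow. The function name `split_predict` and the field interpretation (ORF id, start, end, frame, score) are assumptions based on the sample lines above, not the post's own code.

```python
def split_predict(text: str) -> dict:
    """Group .predict ORF lines by the '>' contig header they follow.

    Returns {contig_name: [(orf_id, start, end, frame, score), ...]}.
    The column meanings are assumed from the sample in the post.
    """
    contigs = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: the contig name is the first token after '>'.
            current = line[1:].split()[0]
            contigs[current] = []
        elif current is not None:
            orf_id, start, end, frame, score = line.split()[:5]
            contigs[current].append(
                (orf_id, int(start), int(end), int(frame), float(score))
            )
    return contigs


# The two ORF lines quoted in the post, one per line.
SAMPLE = """>NODE_1_length_38001_cov_935.551880
orf00001      481      362  -2     1.45
orf00002      451      567  +1     0.59
"""

if __name__ == "__main__":
    for contig, orfs in split_predict(SAMPLE).items():
        print(contig, len(orfs))
```

From here, each contig's list could be written to its own file or filtered by score, depending on what the downstream step needs.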