Going to GOA, pt. 2: children of a lesser GO

The source file associated with this post can be downloaded here. Last time, I talked about how to read a GOA gene_associations file into a Python dictionary. Our goal was to find all genes that are annotated as hydrolases in the GOA gene_associations file. The tricky part is, most enzymes are not […]
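A minimal sketch of the dictionary idea mentioned above, not the code from the post itself. It assumes the gene_associations file is tab-delimited in GAF layout, with comment lines starting with "!", the protein ID in column 2 and the GO ID in column 5. The function name and the sample rows are made up for illustration.

```python
from collections import defaultdict

def read_gene_associations(lines):
    """Map each GO ID to the set of protein IDs annotated with it.

    Assumes GAF-style rows: tab-delimited, '!' comment lines,
    protein ID in column 2 and GO ID in column 5.
    """
    go_to_proteins = defaultdict(set)
    for line in lines:
        if line.startswith("!") or not line.strip():
            continue  # skip header comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # skip malformed rows
        go_to_proteins[fields[4]].add(fields[1])
    return dict(go_to_proteins)

# Made-up rows for illustration only; PROT_A etc. are not real accessions.
sample = [
    "!gaf-version: 2.0",
    "UniProtKB\tPROT_A\tgeneA\t\tGO:0016787\tREF:1\tIEA\t\tF",
    "UniProtKB\tPROT_B\tgeneB\t\tGO:0016787\tREF:2\tIEA\t\tF",
    "UniProtKB\tPROT_A\tgeneA\t\tGO:0005515\tREF:3\tIPI\t\tF",
]
annotations = read_gene_associations(sample)
# GO:0016787 is the GO term for hydrolase activity.
hydrolases = annotations.get("GO:0016787", set())
```

With a real file, the same function can be fed an open file handle instead of the sample list. Note that this lookup only finds proteins annotated directly with the given term.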

Filling in the evolutionary blanks, genome by genome

After hearing Jonathan Eisen and Nikos Kyrpides talk about GEBA in various meetings, it is great to see the paper finally come out, and under a CC license too. Good move for everyone. GEBA is the Genomic Encyclopedia of Bacteria and Archaea. The idea is simple: we have >1000 prokaryotic genomes in GenBank as of […]

Gene and protein annotation: it’s worse than you thought

Sequencing centers keep pumping large amounts of sequence data into the omics-sphere (will I get a New Worst omics Word Award for this?). There is no way we can annotate even a small fraction of those experimentally, and indeed most annotations are automatic, done bioinformatically. Typically, function is inferred by homology: if the protein sequence […]

Going to GOA: pt. 1

GOA, the Gene Ontology Annotation, provides Gene Ontology annotation to proteins in UniProt. It also provides GO annotations to several genome projects: Chicken, Arabidopsis, Fly, Human, Mouse, Rat and Cow. Anyone who works on any of those genomes or on UniProt, and is interested in annotation, would most likely need to query GOA once in a […]

Photosynthesis, phages and structures: there’s treasure everywhere!

Here’s a really cool work, published this September in Nature. Why did I choose this work? Well, it’s a major discovery, and it’s all done using bioinformatics, and fairly simple bioinformatics at that. The power of metagenomics and bioinformatics: in a mass of data you just have to know what you are looking for, and […]

The Warren L. DeLano Memorial Award for Computational Biosciences

Warren DeLano passed away suddenly, at a young age, at his home on Nov 3, 2009. He was the author of PyMOL, a very popular molecular visualization program, and a strong advocate of open source software. The family of Warren Lyford DeLano has created an “In Memoriam” page and blog. Also, a memorial award is […]

The Genomic Ark: 10,000 vertebrate genomes

The first bioinformatics meeting I went to was in 1996 at the Nachsholim resort, north of Tel Aviv. I received a fellowship for the duration, and shared a room with the brilliant Golan Yona, then a grad student at the Hebrew University. I was doing biochemistry at the time and knew next to nothing about […]

Short Bioinformatics Hacks: Glimmer Splitter

Glimmer is a program that predicts ORFs in bacterial and archaeal genomes. The input is the assembled genome FASTA file; the output is several files of predictions at different stages. The final output file is the .predict file, which looks something like this:

>NODE_1_length_38001_cov_935.551880
orf00001      481      362  -2     1.45
orf00002      451      567  +1     0.59
[…]
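A minimal sketch of how a .predict file with the layout shown above could be read, grouping ORF rows under the ">" header they follow. This is not necessarily what the script in the post does. The function name and the dictionary keys are made up for illustration, and the column meanings (ORF ID, start, end, frame, score) are an assumption based on the excerpt.

```python
from collections import defaultdict

def parse_predict(lines):
    """Group .predict ORF rows by the '>' header line preceding them."""
    orfs_by_contig = defaultdict(list)
    contig = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: keep the first token as the contig name.
            contig = line[1:].split()[0]
            continue
        # Assumed columns: ORF ID, start, end, reading frame, score.
        orf_id, start, end, frame, score = line.split()[:5]
        orfs_by_contig[contig].append({
            "id": orf_id,
            "start": int(start),
            "end": int(end),
            "frame": int(frame),  # int() accepts "+1" and "-2"
            "score": float(score),
        })
    return dict(orfs_by_contig)

# The two rows shown in the excerpt above.
sample = [
    ">NODE_1_length_38001_cov_935.551880",
    "orf00001      481      362  -2     1.45",
    "orf00002      451      567  +1     0.59",
]
predictions = parse_predict(sample)
```

Once the rows are grouped by contig, writing one output file per contig is a short loop over the dictionary.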

A bioinformatician’s peeves (some of them)

As resident bioinformatician in many places over the years, I got many requests for help. Anything from a short blast run to a full-fledged collaboration. I love that. I always like learning about new problems, and those requests may blossom into full research collaborations. So yes, drop me an email or step into my […]

The medium-rare biosphere

All the roots hang down
Swing from town to town
They are marching around
Down under your boots
All the trucks unload
Beyond the gopher holes
There’s a world going on
Underground
Tom Waits, “Underground”

Our picture of the microbial biosphere is heavily skewed towards what we can see, culture, and are interested in. […]

It ain’t necessarily so

First, a short glossary. Homologous genes are descended from a common ancestral gene. There are two types of homology: Orthology is homology due to a speciation event. So if there is a gene A’ in humans and A” in mice, and they are obviously similar in sequence, we infer that they are homologous. We usually also […]

Weekly Poll: will you have your own genome sequenced?

CLARIFICATION: the events described here have not happened. Yet. We are a few years into the future. Whole human genomes can be sequenced relatively cheaply and accurately. Direct to Consumer Genomics companies offer true genomic analyses now, not just marker analyses. They BLAST* your sequence against known genotype & disease databases, looking for known genotypic […]

New: weekly poll

I will try to maintain a weekly poll on BsB, for matters biologick, bioinformatick, generally scientifick or otherick. As in any poll, if you read too much into its questions or answers, you should seriously chill. That being said, comments are most welcome. The poll is on the sidebar that’a’way.—> (Scroll a bit down if you […]

The new natural history

Before the 20th century, biology was, to a large extent, “Natural History”. It was an observational science rather than the experimental one it is considered to be today. At that time, the typical biologist, a natural historian, was going about the (European colonized) world, collecting specimens of new and fossilized species, classifying and recording them for […]

The Craigslist of Antibiotic Resistance

(Before we get going: this is the 100th post on Byte Size Biology. Happy Birthday to me!) Resistance to antibiotics is a huge clinical problem. In the US, more people die of methicillin-resistant Staphylococcus aureus (MRSA) infections (nearly 19,000 in 2006) than of AIDS (14,627). We know that antibiotic resistance is carried on mobile genetic […]