Displaying posts categorized under

Bioinformatics

Wikipedia pages on protein function prediction

I just received an email from Julian Gough, one of last year’s CAFA participants. He started a Wikipedia initiative on protein function prediction; the pages are barely stubs at the moment. EDIT: He alerted me to the fact that protein function prediction has virtually no presence on Wikipedia. So all you protein function predictors out there, please contribute. Yes, [...]

Circumcision, preventing fraud, and icky toilets. You know you’re going to read this.

In no particular order or ranking, recent and not-so-recent articles from PLoS-1. The common thread (if any): I thought they were pretty cool in one way or another. 1. Men don’t tell the truth about their penis. No kidding? But this is somewhat more serious. It has been accepted for some time that male [...]

Short bioinformatics hacks: reading mate-pairs from a fastq file

If you have a merged file of paired-end reads, here is a quick way to read them using Biopython:

    from Bio import SeqIO
    from itertools import izip_longest
    # Loop over pairs of reads
    readiter = SeqIO.parse(open(inpath), "fastq")
    for rec1, rec2 in izip_longest(readiter, readiter):
        print rec1.id # do something with rec1
        print rec2.id # do something [...]
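The pairing trick in this excerpt works because the same iterator is handed to the zip function twice, so each step consumes two consecutive records. Here is a minimal Python 3 sketch of that idea, where `izip_longest` is now called `zip_longest`. To keep it runnable with the standard library alone, it swaps Biopython's parser for a simplified stand-in: `read_fastq` and `iter_pairs` are hypothetical helper names, and the reader assumes exactly four lines per record.

```python
import io
from itertools import zip_longest  # Python 3 name for izip_longest


def read_fastq(handle):
    """Yield (read_id, sequence, quality) tuples from a FASTQ handle.

    Simplified stand-in for a real parser (assumption: four lines per record).
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file (or a blank line) stops the reader
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header[1:].split()[0], seq, qual


def iter_pairs(handle):
    """Yield (rec1, rec2) for consecutive records in an interleaved file."""
    records = read_fastq(handle)
    # The same iterator appears twice, so each step pulls two records.
    # If the file holds an odd number of reads, the last rec2 is None.
    return zip_longest(records, records)


# Tiny in-memory example standing in for a merged paired-end file
demo = io.StringIO(
    "@read1/1\nACGT\n+\nIIII\n"
    "@read1/2\nTTGA\n+\nIIII\n"
)
pairs = list(iter_pairs(demo))
for rec1, rec2 in pairs:
    print(rec1[0], rec2[0])
```

With Biopython installed, `SeqIO.parse(handle, "fastq")` would take the place of `read_fastq`, and the pairing line stays the same.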

The Friedberg Lab is Recruiting Graduate Students

The Friedberg Lab is recruiting graduate students, for both Master’s and Ph.D. WE ARE: A dynamic young lab interested in gene, gene cluster and genome evolution, understanding microbial communities and microbe-host interactions by metagenomic analyses, developing algorithms for understanding gene cluster evolution, and prediction of protein function from protein sequence and structure. YOU ARE: [...]

Friday fun story: extreme bug hunting on MIRA

MIRA is a really cool piece of sequence assembly software, developed and maintained by Bastien Chevreux. MIRA has a large and active community, led by the funny and gracious Bastien, for whom no problem is too small or too large. Recently MIRA seemed to have developed a stochastic bug, one of those which are a serious headache [...]

Postdoc positions available at Rutgers University

Postdoctoral Research Scientist, Rutgers University. Joint Appointment: Institute of Marine and Coastal Sciences, BioMaPS and Dept. of Biochemistry and Microbiology. Two 2-3 year Postdoctoral Research Scientist positions are available. We are looking for young scholars with experience in the areas of computational biology. In the scope of this project, we will uncover how the metal-containing [...]

Of Mice and Men or: Revisiting the Ortholog Conjecture

I have posted quite a few times before about the acquisition of new functions by genes. In many cases a gene is duplicated, and one of the duplicates acquires a new function. This is one basic evolutionary mechanism of acquiring new functions. Sometimes, gene duplication occurs within a species: part of the chromosome may be [...]

Short bioinformatics hacks: merging fastq files

So you received your mate-paired reads in two different files, and you need to merge them for your assembler. Here is a quick Python script to do that. You will need Biopython installed.   #!/usr/bin/env python from Bio import SeqIO import itertools import sys import os # Copyright(C) 2011 Iddo Friedberg # Released under Biopython [...]
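The script in this excerpt is cut off, so what follows is not the original. It is a minimal Python 3 sketch of the same task, interleaving two mate files so that each forward read is followed by its mate. It uses only the standard library rather than Biopython. `fastq_records` and `interleave_fastq` are hypothetical names, and the reader assumes four lines per record with the two files listing reads in the same order.

```python
import io
from itertools import zip_longest


def fastq_records(handle):
    """Yield each FASTQ record as a list of its four raw lines.

    Simplified (assumption): no multi-line sequences, file ends with a newline.
    """
    while True:
        lines = [handle.readline() for _ in range(4)]
        if not lines[0]:
            return  # end of file
        yield lines


def interleave_fastq(forward, reverse, out):
    """Write forward and reverse reads alternately; return the pair count."""
    n_pairs = 0
    pairs = zip_longest(fastq_records(forward), fastq_records(reverse))
    for rec1, rec2 in pairs:
        if rec1 is None or rec2 is None:
            raise ValueError("input files have different numbers of reads")
        out.writelines(rec1)
        out.writelines(rec2)
        n_pairs += 1
    return n_pairs


# In-memory stand-ins for the two mate files and the merged output
fwd = io.StringIO("@r1/1\nACGT\n+\nIIII\n@r2/1\nGGCC\n+\nIIII\n")
rev = io.StringIO("@r1/2\nTTGA\n+\nIIII\n@r2/2\nAATT\n+\nIIII\n")
merged = io.StringIO()
count = interleave_fastq(fwd, rev, merged)
print(count, "pairs written")
```

For real files, the three `io.StringIO` objects would be replaced by handles from `open(...)`. Raising an error on a length mismatch is a design choice here: a silently truncated merge would put every later mate out of step.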

Tweets from AFP/CAFA 2011

The AFP/CAFA 2011 meeting was held on July 15 and July 16. Yes, it was a huge success, and I’m not just saying that because I am one of the organizers. I will write up something more comprehensive soon; in the meantime, here are my tweets from the meeting. I am learning a lot about [...]

ISMB 2011 tweets

ISMB this year had quite a few twitterers. Hashtag: #ISMB. I tried to collect all the #ISMB tweets, so I wrote my own Twitter scavenger script, but it seems to go only 3 days back. I am not sure if this is a Twitter feature, or something with the library I am using (tweepy) or [...]

CAFA Update

Nearly a year ago, I posted about the Critical Assessment of Function prediction with which I am involved. The original post from July 22, 2010 is in the block quote. After that, an update about the meeting which will be held in exactly 2 weeks. The trouble with genomic sequencing is that it is too [...]

Crowdsourcing genomics

Miami University has joined the National Genomics Research Initiative (NGRI) offered by HHMI Science Education Alliance (SEA) in their Phage Genomics course. The students go directly into the lab, participating in an authentic research experience. In a full-year academic course they: isolate and characterize bacterial viruses from their local soil; prepare the viral DNA [...]

Bio-Linux. Now available in the Cloud

For some time now, NERC has been providing us with Bio-Linux. If you don’t want to be bothered with installing all the essential bioinformatic software for your Ubuntu box, you can install Bio-Linux, either as a Linux distro for installation from scratch, or as a set of packages for an already existing Debian or Ubuntu [...]

Function predictor? Submit your work to the CAFA meeting

  Last July I introduced CAFA: Critical Assessment of (Gene and Protein) Function Annotations. Recap: the number of genomic and metagenomic sequences is growing at a horrendous rate. We are inundated with sequence data, yet the fraction of useful information we can glean from these sequences is steadily decreasing. There are simply too many sequences, and they are [...]

You know your graduate student is frustrated when…

…you find this on the top of the paper pile on his desk: