Displaying posts tagged with


Mozilla does scientific matchmaking between programmers and researchers

Mozilla Science Lab is looking to pair programmers and scientists. If you are a scientist in need of a programmer, read the following, and then go to the website to see how to take it further. Thanks to Miami University’s Office for Advancement of Research and Scholarship for bringing this to my attention. Interdisciplinary Programming is […]

Friday Odds and Ends

So things have been busy in non-blog land. Putting together a tenure packet, some travel, teaching, and oh yes, even science. So no insightful post here, just some odds and ends I collected, in no particular order: There are quite a few species named after famous people: alive, dead, real or fictional.  Wikipedia has a […]

Wasting time with Google Trends

It seems like the forces of light have triumphed somewhere around September 2006:
…as have their evil counterparts, April 2009:
Bacteria are neck and neck with humans:
But they beat the largest creatures on Earth:
Of course, you can’t beat cats:

Stupid Python tricks, #3296: sorting a dictionary by its values

Suppose you have a dictionary mydict, with key:value pairs mydict = {'a':5, 'b':2, 'c':1, 'd':6}. You want to sort the keys by their values, keeping the key first in each tuple, so that the final list will be: [('c',1), ('b',2), ('a',5), ('d',6)]. Aaaand, the stupid Python trick involves a nested list comprehension: sorted_list […]
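The nested list comprehension itself is cut off in this excerpt, so the following is not that trick. It is a minimal sketch of one common alternative that produces the same list, using the built-in sorted() with a key function on the dictionary's items:

```python
mydict = {'a': 5, 'b': 2, 'c': 1, 'd': 6}

# items() yields (key, value) tuples; sort them by the value (index 1),
# which keeps the key first in each tuple.
sorted_list = sorted(mydict.items(), key=lambda kv: kv[1])

print(sorted_list)  # [('c', 1), ('b', 2), ('a', 5), ('d', 6)]
```

Because sorted() is stable, keys that share a value keep the order in which the dictionary yields them.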

A bit more on writing bioinformatic research code

There has been a lot of discussion recently on this blog and others on the need for robust scientific software. Most of the discussion I have been involved in comes from bioinformaticians, because, well, I am one. There has been plenty of talk about code robustness, sharing, and replicability vs. reproducibility. I do not want […]

ROSALIND: an addictive bioinformatics learning site

  I just learned about this one: ROSALIND  is a really cool concept in learning bioinformatics. You are given problems of increasing difficulty to solve. Start with nucleotide counting (trivial) and end with genome assembly (not so trivial). To solve a problem, you download a sample data set, write your code and debug it. Once […]

The genomics programming language

Genomics is a new and exciting programming language based on Brainfsck. Here are the commands:

g Move pointer to the right.
e Move pointer to the left.
n Increment the cell at the pointer.
o Decrement the cell at the pointer.
m Jump forward past the matching i if the cell at the current pointer […]

Short bioinformatics hacks: reading mate-pairs from a fastq file

If you have a merged file of paired-end reads, here is a quick way to read them using Biopython:

    from Bio import SeqIO
    from itertools import izip_longest
    # Loop over pairs of reads
    readiter = SeqIO.parse(open(inpath), "fastq")
    for rec1, rec2 in izip_longest(readiter, readiter):
        print rec1.id # do something with rec1
        print rec2.id # do something […]
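The heart of this hack is passing the same iterator twice, so that each step of the loop consumes two consecutive records. Here is a minimal sketch of that idiom in Python 3, where izip_longest is spelled zip_longest. To keep it self-contained, it swaps Biopython's SeqIO.parse for a small hypothetical stand-in, read_fastq, which assumes strict four-line FASTQ records; read_pairs and the sample reads are also made up for illustration.

```python
import io
from itertools import zip_longest


def read_fastq(handle):
    """Yield (id, sequence, quality) tuples from a four-line-per-record FASTQ handle.

    Hypothetical stand-in for SeqIO.parse(handle, "fastq"); assumes no wrapped lines.
    """
    while True:
        header = handle.readline().rstrip()
        if not header:
            return  # end of file
        seq = handle.readline().rstrip()
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip()
        yield header[1:], seq, qual  # drop the leading '@'


def read_pairs(handle):
    """Pair consecutive records by handing the same iterator to zip_longest twice."""
    records = read_fastq(handle)
    return zip_longest(records, records)


# Made-up interleaved data: mate 1 followed by mate 2.
example = io.StringIO("@r1/1\nACGT\n+\nIIII\n@r1/2\nTTGA\n+\nIIII\n")
pairs = list(read_pairs(example))
for rec1, rec2 in pairs:
    print(rec1[0], rec2[0])
```

If the file holds an odd number of records, zip_longest pads the last pair with None, so rec2 should be checked before use.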

Brainf**k while waiting for a flight

Warning: NSFW language. Brainfuck is a Turing-complete programming language consisting of eight commands, each of which is represented as a single character.

> Increment the pointer.
< Decrement the pointer.
+ Increment the cell at the pointer.
- Decrement the cell at the pointer.
. Output the ASCII value of the cell at the pointer.
[…]
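To make the command list concrete, here is a minimal sketch of an interpreter. The excerpt stops after five commands; the remaining three ("," for input, "[" and "]" for loops) are taken from the standard language definition rather than from this post. The 30,000-cell tape and the 8-bit wrap-around are common conventions that I am assuming here, and run_bf is a name made up for this sketch.

```python
def run_bf(program, input_text=""):
    """Run a Brainfuck program and return its output as a string."""
    # Pre-compute the position of each bracket's partner.
    jumps, stack = {}, []
    for pos, ch in enumerate(program):
        if ch == "[":
            stack.append(pos)
        elif ch == "]":
            start = stack.pop()
            jumps[start], jumps[pos] = pos, start

    tape = [0] * 30000      # assumed tape size
    ptr = pc = 0
    inp = iter(input_text)
    out = []
    while pc < len(program):
        ch = program[pc]
        if ch == ">":
            ptr += 1
        elif ch == "<":
            ptr -= 1
        elif ch == "+":
            tape[ptr] = (tape[ptr] + 1) % 256   # assumed 8-bit cells
        elif ch == "-":
            tape[ptr] = (tape[ptr] - 1) % 256
        elif ch == ".":
            out.append(chr(tape[ptr]))
        elif ch == ",":
            tape[ptr] = ord(next(inp, "\0"))    # store 0 once input runs out
        elif ch == "[" and tape[ptr] == 0:
            pc = jumps[pc]                      # skip the loop body
        elif ch == "]" and tape[ptr] != 0:
            pc = jumps[pc]                      # go back to the loop start
        pc += 1                                 # any other character is ignored
    return "".join(out)


# 8 * 8 = 64, plus one, is ASCII 65.
print(run_bf("++++++++[>++++++++<-]>+."))  # A
```

Unbalanced brackets are not handled gracefully in this sketch; a stray "]" raises an IndexError while the jump table is built.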

The Friedberg Lab is Recruiting Graduate Students

The Friedberg Lab is recruiting graduate students, for both Master’s and Ph.D. WE ARE: A dynamic young lab interested in gene, gene cluster and genome evolution, understanding microbial communities and microbe-host interactions by metagenomic analyses, developing algorithms for understanding gene cluster evolution, and prediction of protein function from protein sequence and structure. YOU ARE: […]

Short bioinformatics hacks: merging fastq files

So you received your mate-paired reads in two different files, and you need to merge them for your assembler. Here is a quick Python script to do that. You will need Biopython installed.

    #!/usr/bin/env python
    from Bio import SeqIO
    import itertools
    import sys
    import os
    # Copyright(C) 2011 Iddo Friedberg
    # Released under Biopython […]
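The script itself is cut off here, so what follows is not the Biopython version. It is a rough standard-library sketch of the same idea, assuming that "merge" means interleaving (read 1 of a pair, then read 2) and that every record occupies exactly four lines. The function name interleave_fastq and the sample reads are made up for illustration.

```python
import io
from itertools import islice


def interleave_fastq(handle1, handle2, out):
    """Write records alternately from two FASTQ handles; return the number of pairs."""
    n_pairs = 0
    while True:
        rec1 = list(islice(handle1, 4))  # next four lines from file 1
        rec2 = list(islice(handle2, 4))  # next four lines from file 2
        if not rec1 and not rec2:
            break                        # both files are exhausted
        if len(rec1) != 4 or len(rec2) != 4:
            raise ValueError("inputs do not hold the same number of complete records")
        out.writelines(rec1)
        out.writelines(rec2)
        n_pairs += 1
    return n_pairs


# Made-up data standing in for the two mate files.
mates1 = io.StringIO("@r1/1\nACGT\n+\nIIII\n@r2/1\nGGCC\n+\nIIII\n")
mates2 = io.StringIO("@r1/2\nTTGA\n+\nIIII\n@r2/2\nAATT\n+\nIIII\n")
merged = io.StringIO()
n_written = interleave_fastq(mates1, mates2, merged)
print(n_written, "pairs written")
```

With real files, the three handles would come from open() calls. The sketch does not check that the read IDs of each pair actually match, which is worth adding before trusting the output.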

You know your graduate student is frustrated when…

…you find this on the top of the paper pile on his desk:

The open source spammer: extracting email addresses from an openoffice.org document

I’m organizing a workshop later this month (see here, scroll to session V), and I have just received the attendee list from the main conference’s organizers. Since I need to spam, er, send the attendees an informative email about the specific workshop, I needed their email addresses. Here’s what I did. The file itself is MS Word […]
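The actual steps are cut off in this excerpt. As a rough sketch of the final stage only, assuming the document has already been converted to plain text by some other means, a regular expression can pull out anything that looks like an email address. The pattern below is deliberately simple and will not cover every valid address; extract_emails and the sample text are made up for illustration.

```python
import re

# Simplified pattern: local part, '@', domain, and a final dot-separated suffix.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(text):
    """Return the addresses found in text, de-duplicated case-insensitively, in order."""
    seen = set()
    result = []
    for match in EMAIL_RE.findall(text):
        key = match.lower()
        if key not in seen:
            seen.add(key)
            result.append(match)
    return result


sample = "Alice <alice@example.org>, bob@example.com; ALICE@example.org"
emails = extract_emails(sample)
print(emails)  # ['alice@example.org', 'bob@example.com']
```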

Short bioinformatic hacks: reading between the genes

In celebration of the biohackathon happening now in Tokyo, I am putting up a script that is oddly missing from many bioinformatic packages: extracting intergenic regions. This one was written together with my student, Ian. As for the biohackathon itself, I’m not there, but I am following the tweets and  Brad Chapman’s excellent posts: Day […]
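The script is not shown in this excerpt, so here is only a bare-bones sketch of the underlying idea: given the gene coordinates on one sequence, the intergenic regions are the gaps between them. It assumes 0-based, half-open (start, end) coordinates and ignores strand; the function name, the min_length parameter and the coordinates are all made up for illustration.

```python
def intergenic_regions(genes, seq_length, min_length=1):
    """Return the (start, end) gaps not covered by any gene.

    genes: iterable of (start, end) tuples, 0-based and half-open (an assumption).
    """
    regions = []
    prev_end = 0
    for start, end in sorted(genes):
        if start - prev_end >= min_length:
            regions.append((prev_end, start))
        prev_end = max(prev_end, end)   # handles overlapping or nested genes
    if seq_length - prev_end >= min_length:
        regions.append((prev_end, seq_length))
    return regions


# Made-up coordinates; the first two genes overlap.
genes = [(10, 50), (40, 80), (100, 120)]
gaps = intergenic_regions(genes, seq_length=150)
print(gaps)  # [(0, 10), (80, 100), (120, 150)]
```

With these coordinates, sequence[start:end] slices out each region directly. A real script would also need to deal with circular genomes and would probably record which genes flank each region.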

Real programmers use…

A nice take on the vi / emacs wars. Also, real programmers browse the web using the vimperator.