A bit of background information: this is a meeting I am really happy to be part of, and even more honored to be co-organizing. One of my main scientific interests is predicting the function of genes and proteins whose function is unknown. Some background information: we have sequenced more than 1000 genomes [...]
Geek alert: this post is for coders. So you sequenced your genome, reached an optimally small number of contigs, they look sane, and now you would like to see what you need for the finishing stage. Namely, how many gaps you have and what their sizes are. UPDATE: “might just be worth clarifying this is for [...]
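For readers who want something concrete, here is a minimal Python sketch of counting gaps and their sizes. It is not the script from the post. It assumes gaps are encoded as runs of `N` inside each FASTA sequence, which the excerpt does not state, and the names `read_fasta` and `find_gaps` are hypothetical.

```python
import re


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def find_gaps(sequence, gap_char="N", min_len=1):
    """Return (start, end, length) for each run of gap_char.

    Coordinates are 0-based and half-open. Assumption: a gap is a run
    of at least min_len gap characters, matched case-insensitively.
    """
    pattern = re.compile("%s{%d,}" % (re.escape(gap_char), min_len), re.IGNORECASE)
    return [(m.start(), m.end(), m.end() - m.start()) for m in pattern.finditer(sequence)]


if __name__ == "__main__":
    example = [">scaffold_1", "ACGTNNNN", "ACGTNNAC"]
    for header, seq in read_fasta(example):
        gaps = find_gaps(seq)
        print(header, len(gaps), [length for _, _, length in gaps])
```

Raising `min_len` skips isolated ambiguous bases, which are usually not real gaps.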
No, not the flesh-blood-and-feathers penguin, but rather Tux, the beloved mascot of the Linux operating system, compared with Escherichia coli, the model organism of choice for microbiologists. We refer to DNA as “the book of life”; some geeks refer to it as the “operating system of life”. Just like in a computer’s operating system, DNA [...]
Combrex is an exciting new project at Boston University to bridge computational and experimental techniques to functionally annotate proteins. They are hiring, see below: JOB POST We are seeking to hire a creative computational scientist for a transformative project: COMBREX: A Computational Bridge to Experiments. The work will involve building a novel resource that combines [...]
Yes! Why should the evolution people have all the fun with their blog carnival? (After all, it is only a theory.) It’s time for bioinformaticians to show what we are made of, and to have a carnival of our own. Bio::blogs had a good run some time ago. I decided to reconnect what is hopefully [...]
Byte Size Biology will be hosting the first edition of the bioinformatics blog carnival. All you bioinformatics bloggers, submit your entries by Mar 9, 2010 23:59:03 EST. Note the 3-second extension I have already given. There will be no more deadline extensions; I’ve been generous enough as it is. The carnival will be posted [...]
In celebration of the biohackathon happening now in Tokyo, I am putting up a script that is oddly missing from many bioinformatics packages: one that extracts intergenic regions. This one was written together with my student, Ian. As for the biohackathon itself, I’m not there, but I am following the tweets and Brad Chapman’s excellent posts: Day [...]
The source file associated with this post can be downloaded here. Last time, I talked about how to read a GOA gene_associations file into a Python dictionary data structure. Our goal was to find all genes that are annotated as hydrolases in the GOA gene_associations file. The tricky part is, most enzymes are not [...]
GOA, the Gene Ontology Annotation, provides Gene Ontology annotations for proteins in UniProt. It also provides GO annotations for several genome projects: Chicken, Arabidopsis, Fly, Human, Mouse, Rat and Cow. Anyone who works on any of those genomes, or on UniProt, and is interested in annotation would most likely need to query GOA once in a [...]
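As a rough illustration of querying GOA from Python, the sketch below reads a gene association file into a dictionary that maps each protein ID to its set of GO terms. It is not the code from the post. It assumes the standard tab-delimited GAF layout, with comment lines starting with `!`, the DB Object ID in column 2, and the GO ID in column 5. The function name `read_gene_associations` and the example records are hypothetical.

```python
from collections import defaultdict


def read_gene_associations(lines):
    """Map DB Object ID (GAF column 2) to a set of GO IDs (GAF column 5)."""
    annotations = defaultdict(set)
    for line in lines:
        # Skip blank lines and GAF comment/header lines, which start with "!".
        if not line.strip() or line.startswith("!"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # malformed record; ignored in this sketch
        annotations[fields[1]].add(fields[4])
    return annotations


if __name__ == "__main__":
    # Made-up records for illustration only.
    example = [
        "!gaf-version: 2.0\n",
        "UniProtKB\tP00001\tGENE1\t\tGO:0016787\tPMID:1\tIDA\t\tF\n",
        "UniProtKB\tP00001\tGENE1\t\tGO:0005737\tPMID:2\tIDA\t\tC\n",
        "UniProtKB\tP00002\tGENE2\t\tGO:0005737\tPMID:3\tIEA\t\tC\n",
    ]
    goa = read_gene_associations(example)
    # GO:0016787 is "hydrolase activity"; only direct annotations are found.
    print(sorted(pid for pid, terms in goa.items() if "GO:0016787" in terms))
```

A lookup like this only finds proteins annotated with the exact term. Proteins annotated with more specific child terms need a walk over the ontology as well.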
In no particular order or context. No personal stuff, and by no means a complete list: WordPress (like, duh). Wikipedia (default for looking up new stuff). Wikis in general (great lab management tool. Don’t need LIMS). Open Access Publishing and Creative Commons licensing. FLOSS licensing (90% of the software I use, and 100% of what [...]
Glimmer is a program that predicts ORFs in bacterial and archaeal genomes. The input is the assembled genome FASTA file; the output is several files of predictions at different stages. The final output file is the .predict file, which looks something like this:
>NODE_1_length_38001_cov_935.551880
orf00001 481 362 -2 1.45
orf00002 451 567 +1 0.59
[...]
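A file in that layout can be read with a few lines of Python. The sketch below is not the post's script. It assumes each `>` line names a contig and each following line holds an ORF ID, start, end, reading frame and score, as in the excerpt above. The names `Orf` and `parse_predict` are hypothetical.

```python
from typing import NamedTuple


class Orf(NamedTuple):
    contig: str
    orf_id: str
    start: int
    end: int
    frame: int    # signed frame as written in the file, e.g. -2 or +1
    score: float


def parse_predict(lines):
    """Parse the lines of a Glimmer .predict file into a list of Orf records."""
    contig = None
    orfs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            contig = line[1:].strip()  # header line names the current contig
            continue
        fields = line.split()
        if len(fields) < 5:
            continue  # not an ORF record; skipped in this sketch
        orf_id, start, end, frame, score = fields[:5]
        orfs.append(Orf(contig, orf_id, int(start), int(end), int(frame), float(score)))
    return orfs


if __name__ == "__main__":
    example = [
        ">NODE_1_length_38001_cov_935.551880",
        "orf00001 481 362 -2 1.45",
        "orf00002 451 567 +1 0.59",
    ]
    for orf in parse_predict(example):
        print(orf)
```

A negative frame marks an ORF on the reverse strand, and in the excerpt's first record the start coordinate is larger than the end.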
First, a short glossary. Homologous genes are descended from a common ancestral gene. There are two types of homology: Orthology is homology due to a speciation event. So if there is a gene A’ in humans and A” in mice, and they are obviously similar in sequence, we infer that they are homologous. We usually also [...]





