Displaying posts categorized under


Lively discussion: how to cross-validate?

So today’s group meeting got a bit heated as Nafiz, Ashley, and Xiao touched on the finer points of how to cross-validate. Machine learning people, your comments are welcome.

Why scripting is not as simple as… scripting

If you haven’t read the transcript of Sean Eddy’s recent talk “On High Throughput Sequencing for Neuroscience”, go ahead and read it. It’s full of observations and insights into the relationships between computational and “wet” biology, and it is very well written. I agree with many of his points, for example, that sequencing is not […]

A Simple Genome Annotator?

A question for the genome annotators out there. I need a simple genome annotator for annotating bacteriophage genomes in an undergraduate course. Until now we have used DNAMaster, but for various reasons I would like to move away from that. Here’s what I need for class: 1. Annotate a single assembled linear chromosome, about 50,000 bp, 80-120 genes, no […]

Mozilla does scientific matchmaking between programmers and researchers

Mozilla Science Labs are looking to pair programmers and scientists. If you are a scientist in need of a programmer, read the following, and then go to the website to see how to take it further. Thanks to Miami University’s Office for Advancement of Research and Scholarship for bringing this to my attention. Interdisciplinary Programming is […]

Friday Odds and Ends

So things have been busy in non-blog land. Putting together a tenure packet, some travel, teaching, and oh yes, even science. So no insightful post here, just some odds and ends I collected, in no particular order: There are quite a few species named after famous people: alive, dead, real or fictional.  Wikipedia has a […]

The Bio* projects: a history in graphs

Yesterday I received an email from Kristjan Liiva, a student at RWTH Aachen University, Germany. Kristjan has developed a really cool dashboard to analyze and visualize the development of collaborative OSS projects by mining their mailing lists and software repositories. (If the link doesn’t work, try again later; the project is under heavy development.) The […]

Squeezing DNA

The state of biology today: Our main problem is turning these DNA data into useful information. Finding genes and other functional genomic elements, characterizing them, understanding their function and their impact on Life – all these are challenges that will remain with us for a long time, and which have revolutionized biology into the […]

The allure of the superficial

A new paper from my lab and Patsy Babbitt’s lab at UCSF has recently been published in PLoS Computational Biology. It is something of a cautionary tale for quantitative biologists, especially bioinformaticians and systems biologists. Genomics has ushered biology into the data-rich sciences. Bioinformatics, developing alongside genomics, provided the tools necessary to decipher genomic […]

Automated Function Prediction: Submit your abstracts by Saturday

You have until Saturday, April 20th to submit your abstracts to the Automated Function Prediction meeting, an ISMB 2013 Special Interest Group and CAFA: Critical Assessment of Function Annotations. Keynote speakers: Patricia Babbitt, University of California, San Francisco: “Protein similarity networks: Identification of functional trends from the context of sequence similarity”; Alex Bateman, European Bioinformatics […]

Terrible advice from a great scientist

I am not inclined to write polemic posts. I generally like to leave that to others, while I take the admittedly easier route of waxing positive over various bits of cool science I find or hear about, and yes, occasionally do myself. But a WSJ editorial from E.O. Wilson has irked me so much that I have […]

Wasting time with Google Trends

It seems like the forces of light have triumphed somewhere around September 2006: …as have their evil counterparts, April 2009: Bacteria are neck and neck with humans: But they beat the largest creatures on Earth: Of course, you can’t beat cats:

Stupid Python tricks, #3296: sorting a dictionary by its values

Suppose you have a dictionary mydict, with key:value pairs: mydict = {'a':5, 'b':2, 'c':1, 'd':6}. You want to sort the keys by their values, keeping each key first in a list of (key, value) tuples, so that the final list will be: [('c',1), ('b',2), ('a',5), ('d',6)]. Aaaand, the stupid Python trick involves a nested list comprehension: sorted_list […]
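For comparison, here is a minimal sketch that produces the same list with the built-in sorted() and a key function. This is not the nested list comprehension the post goes on to describe (that part of the excerpt is cut off); the helper name sort_by_value is made up for illustration.

```python
# Example dictionary from the post.
mydict = {'a': 5, 'b': 2, 'c': 1, 'd': 6}


def sort_by_value(d):
    """Return a list of (key, value) tuples ordered by ascending value.

    Hypothetical helper: an alternative to the post's nested list
    comprehension, not a reconstruction of it.
    """
    # d.items() yields (key, value) pairs; the key function picks the value.
    return sorted(d.items(), key=lambda kv: kv[1])


sorted_list = sort_by_value(mydict)
print(sorted_list)  # [('c', 1), ('b', 2), ('a', 5), ('d', 6)]
```

Passing reverse=True to sorted() would give the descending order instead.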

A Belated Valentine’s Day Post

This is romantic! So listen up! A 3D heart shape may be drawn using the following implicit function: (x^2 + (9/4)y^2 + z^2 - 1)^3 - x^2 z^3 - (9/80) y^2 z^3 = 0. Or, in Python: def heart_3d(x,y,z): return (x**2+(9/4)*y**2+z**2-1)**3-x**2*z**3-(9/80)*y**2*z**3 Trouble is, there is no direct way of graphing implicit functions in Python. But anything can be found on Stack Overflow. Putting it all together: #!/usr/bin/env python from mpl_toolkits.mplot3d import […]
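Before plotting anything, it can help to check what the implicit function does. The sketch below is not the plotting script from the post (which is cut off here); it only evaluates heart_3d on a coarse grid in plain Python. It assumes the usual reading of an implicit surface: the heart is where the function equals zero, negative values lie inside and positive values outside. The float literals are a precaution, since under Python 2 the expressions 9/4 and 9/80 would be integer divisions. The grid size and the helper name grid_axis are arbitrary choices for illustration.

```python
from itertools import product


def heart_3d(x, y, z):
    """Implicit heart function; the surface is the set where this is 0."""
    # Float literals keep 9/4 and 9/80 from truncating under Python 2.
    return ((x**2 + (9.0 / 4.0) * y**2 + z**2 - 1)**3
            - x**2 * z**3
            - (9.0 / 80.0) * y**2 * z**3)


def grid_axis(start, stop, n):
    """Hypothetical helper: n evenly spaced floats from start to stop."""
    step = (stop - start) / float(n - 1)
    return [start + i * step for i in range(n)]


# Coarse 31 x 31 x 31 grid over the cube [-1.5, 1.5]^3 (arbitrary choice).
axis = grid_axis(-1.5, 1.5, 31)

# Count the grid points where the function is negative (inside the heart).
inside_count = sum(1 for x, y, z in product(axis, repeat=3)
                   if heart_3d(x, y, z) < 0)

print(heart_3d(0, 0, 0))  # -1.0, so the origin is inside
print(inside_count, "of", len(axis) ** 3, "grid points are inside")
```

A plotting routine would then draw the boundary between the negative and positive regions, for example by contouring the zero level slice by slice.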

A bit more on writing bioinformatic research code

There has been a lot of discussion recently on this blog and others on the need for robust scientific software. Most of the discussion I have been involved in comes from bioinformaticians, because, well, I am one. There has been plenty of talk about code robustness, sharing, and replicability vs. reproducibility. I do not want […]

DIGging into Images and Genomes

Our lab has a new project and website up. The project is BioDIG: Biological Database of Images and Genomes. BioDIG lets you combine image data and genome data of, well, just about anything that you can make images of and that has a genome, or partial genomic information. You can upload your image, annotate (tag) parts of […]