Lively discussion: how to cross-validate?

So today’s group meeting got a bit heated as Nafiz, Ashley, and Xiao touched on the finer points of how to cross-validate. Machine learning people, your comments are welcome.

Why scripting is not as simple as… scripting

If you haven’t read the transcript of Sean Eddy’s recent talk “On High Throughput Sequencing for Neuroscience”, go ahead and read it. It’s full of observations and insights into the relationships between computational and “wet” biology, and it is very well-written. I agree with many of his points, for example, that sequencing is not […]

Mozilla does scientific matchmaking between programmers and researchers

Mozilla Science Labs are looking to pair programmers and scientists. If you are a scientist in need of a programmer, read the following, and then go to the website to see how to take it further. Thanks to Miami University’s Office for Advancement of Research and Scholarship for bringing this to my attention. Interdisciplinary Programming is […]

Friday Odds and Ends

So things have been busy in non-blog land. Putting together a tenure packet, some travel, teaching, and oh yes, even science. So no insightful post here, just some odds and ends I collected, in no particular order: There are quite a few species named after famous people: alive, dead, real or fictional.  Wikipedia has a […]

The Bio* projects: a history in graphs

Yesterday I received an email from Kristjan Liiva, a student at RWTH Aachen University, Germany. Kristjan has developed a really cool dashboard to analyze and visualize the development of collaborative OSS projects by mining their mailing lists and software repositories. (If the link doesn’t work, try again later; the project is under heavy development.) The […]

Automated Function Prediction: Submit your abstracts by Saturday

You have until Saturday, April 20th to submit your abstracts to the Automated Function Prediction meeting, an ISMB 2013 Special Interest Group, and CAFA: Critical Assessment of Function Annotations. Keynote speakers: Patricia Babbitt, University of California, San Francisco: “Protein similarity networks: Identification of functional trends from the context of sequence similarity”; Alex Bateman, European Bioinformatics […]

Terrible advice from a great scientist

I am not inclined to write polemic posts. I generally like to leave that to others, while I take the admittedly easier route of waxing positive over various bits of cool science I find or hear about, and yes, occasionally do myself. But a WSJ editorial from E.O. Wilson has irked me so much, I have […]

Wasting time with Google Trends

It seems like the forces of light have triumphed somewhere around September 2006: …as have their evil counterparts, April 2009: Bacteria are neck and neck with humans: But they beat the largest creatures on Earth: Of course, you can’t beat cats:

Stupid Python tricks, #3296: sorting a dictionary by its values

Suppose you have a dictionary mydict, with key:value pairs: mydict = {'a':5, 'b':2, 'c':1, 'd':6}. You want to sort the keys by the values, keeping the key first in each tuple of a list of tuples, so that the final list will be: [('c',1), ('b',2), ('a',5), ('d',6)]. Aaaand, the stupid Python trick involves a nested list comprehension: sorted_list […]
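
The excerpt stops before showing the nested list comprehension, so what follows is not the post's own trick. It is a minimal sketch of two common ways to get the same list of (key, value) tuples ordered by value, assuming Python 3. The name `swapped` is only illustrative.

```python
from operator import itemgetter

mydict = {'a': 5, 'b': 2, 'c': 1, 'd': 6}

# Option 1: sort the (key, value) pairs using the value (index 1) as the sort key.
sorted_list = sorted(mydict.items(), key=itemgetter(1))

# Option 2: a comprehension-based variant. Sort (value, key) pairs,
# then swap each pair back so the key comes first again.
swapped = [(k, v) for v, k in sorted((v, k) for k, v in mydict.items())]

print(sorted_list)
print(swapped)
```

The two options differ when values tie: option 1 keeps tied keys in the dictionary's iteration order, while option 2 breaks ties by comparing the keys.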

A Belated Valentine’s Day Post

This is romantic! So listen up! A 3D heart shape may be drawn using the following implicit function: (x^2 + (9/4)y^2 + z^2 - 1)^3 - x^2 z^3 - (9/80) y^2 z^3 = 0. Or, in Python: def heart_3d(x,y,z): return (x**2+(9/4)*y**2+z**2-1)**3-x**2*z**3-(9/80)*y**2*z**3 Trouble is, there is no direct way of graphing implicit functions in Python. But anything can be found on Stack Overflow. Putting it all together: #!/usr/bin/env python from mpl_toolkits.mplot3d import […]
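
The full plotting script is cut off in this excerpt, so here is a rough, standard-library-only sketch of the underlying idea rather than the post's matplotlib code. An implicit function can be "graphed" by sampling it on a grid and checking its sign. The helper `heart_slice` and its parameters are hypothetical names introduced for illustration, and Python 3 is assumed so that `9/4` and `9/80` are true divisions.

```python
def heart_3d(x, y, z):
    """Implicit heart function: negative inside, zero on the surface, positive outside."""
    return ((x**2 + (9 / 4) * y**2 + z**2 - 1) ** 3
            - x**2 * z**3
            - (9 / 80) * y**2 * z**3)


def heart_slice(y=0.0, steps=31, extent=1.5):
    """Sample the plane at a fixed y on a steps-by-steps grid.

    Returns a list of strings; '#' marks grid points where heart_3d <= 0.
    """
    rows = []
    for i in range(steps):
        z = extent - 2 * extent * i / (steps - 1)  # first row is z = +extent
        row = ""
        for j in range(steps):
            x = -extent + 2 * extent * j / (steps - 1)
            row += "#" if heart_3d(x, y, z) <= 0 else "."
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Print a crude text rendering of the y = 0 cross-section.
    print("\n".join(heart_slice()))
```

A 3D plot works on the same principle: evaluate the function on a grid and draw where it crosses zero.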

A bit more on writing bioinformatic research code

There has been a lot of discussion recently on this blog and others on the need for robust scientific software. Most of the discussion I have been involved in comes from bioinformaticians, because, well, I am one. There has been plenty of talk about code robustness, sharing, and replicability vs. reproducibility. I do not want […]

DIGging into Images and Genomes

Our lab has a new project and website up. The project is BioDIG: Biological Database of Images and Genomes. BioDIG lets you combine image data and genome data of, well, just about anything of which you can make images and which has a genome, or partial genomic information. You can upload your image, annotate (tag) parts of […]

A Synopsis of Career Paths in Bioinformatics

My previous post on ROSALIND, a bioinformatics learning site, got picked up by the Slashdot community. A discussion came up on careers in bioinformatics, and the Slashdot user rockmulle made some interesting observations on career paths in bioinformatics, which I have copied here. While brief and therefore omitting many important details (research at a university […]

ROSALIND: an addictive bioinformatics learning site

I just learned about this one: ROSALIND is a really cool concept in learning bioinformatics. You are given problems of increasing difficulty to solve. Start with nucleotide counting (trivial) and end with genome assembly (not so trivial). To solve a problem, you download a sample data set, write your code and debug it. Once […]

Short note on getting students busy

I recently read this post about lacunae in  Bioinformatics.  One complaint was: I know that documentation is a thankless task. But some parts of the Bio[Java|Perl|Python] libraries are described only as an API? This became apparent to me when I had to teach the libraries to students. What does this module do and why does it do […]