Stupid Python tricks, #3296: sorting a dictionary by its values

Suppose you have a dictionary mydict, with key:value pairs

mydict = {'a':5, 'b':2, 'c':1, 'd':6}

You want to sort the items by their values, keeping the key first in each tuple, so that the final list will be:

[('c',1), ('b',2), ('a',5), ('d',6)]

aaaand, the stupid Python trick involves a nested list comprehension:

sorted_list = [(k,v) for v,k in sorted(
                 [(v,k) for k,v in mydict.items()]
                 )
              ]

To get a reverse sorted list:

[('d',6), ('a',5), ('b',2), ('c',1)]

pass reverse=True to sorted():

[(k,v) for v,k in sorted(
   [(v,k) for k,v in mydict.items()], reverse=True
   )
]
Crikey. That's a stupid python if I ever held one!
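
For comparison, here is a less stupid route: sorted() accepts a key function, so the values never need to be swapped to the front and back again. This is a minimal sketch, assuming Python 3 and the same mydict as above; the names by_value and by_value_desc are made up for illustration.

```python
from operator import itemgetter

mydict = {'a': 5, 'b': 2, 'c': 1, 'd': 6}

# Sort the (key, value) pairs on the value, i.e. element 1 of each pair.
by_value = sorted(mydict.items(), key=itemgetter(1))

# Same thing, largest value first.
by_value_desc = sorted(mydict.items(), key=itemgetter(1), reverse=True)

print(by_value)       # [('c', 1), ('b', 2), ('a', 5), ('d', 6)]
print(by_value_desc)  # [('d', 6), ('a', 5), ('b', 2), ('c', 1)]
```

One difference worth knowing: when two values tie, the swap trick orders the tied items by key, while the key= version leaves them in the dictionary's own order, since sorted() is stable.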



The power of single-cell genomics: the mysterious SR1 bacteria have a unique genetic code

ResearchBlogging.org

Thanks to Mitch Balish for calling my attention to this one.

SR1 bacteria are not exactly a household name, even among microbiologists. They were first discovered in contaminated aquifers, and have since been found in animal and insect guts, as well as in human mouths. They are even suspected of being a cause of periodontal disease. I should probably say here that SR1 is a whole phylum of bacteria, and not a single genus or species. The reason they are not that well known is that their discovery was fairly recent.

Also, no one has ever actually seen or grown SR1.

 

All we know is that they are called SR1


 

Continue reading The power of single-cell genomics: the mysterious SR1 bacteria have a unique genetic code →


Minor revisions only

 

A new journal, Molecular Metabolism, has the following policies: one week for reviews, and only three possible outcomes: Reject, Accept, or Minor Revision. Good for them on both decisions. Bonus: your editors are Mr. Blonde, Mr. Blue, Mr. Brown, Mr. Orange and Mr. Pink. And they are professionals (although they may not tip).

 


Announcement: Automated Protein Function Prediction Meeting

Automated Function Prediction (AFP), an ISMB 2013 Special Interest Group meeting, and CAFA: Critical Assessment of Function Annotations. July 20, 2013, Berlin

Keynote speakers

  • Patricia Babbitt, University of California, San Francisco
  • Alex Bateman, European Bioinformatics Institute
  • Anna Tramontano, “La Sapienza” University, Rome

Key dates:

  • April 20, 2013: Deadline for submitting extended abstracts for posters & talks
  • May 9, 2013: Notifications for accepted abstracts e-mailed to corresponding authors
  • May 16, 2013: Deadline for presenters to confirm acceptance of invitation to speak
  • July 20, 2013: AFP SIG preceding ISMB/ECCB 2013, Berlin

 

Sequence and structure genomics have generated a wealth of data, but extracting meaningful information from genomic data is becoming an increasingly difficult challenge. Both the number and the diversity of discovered sequences are increasing, and the fraction of genes whose function is known is decreasing. In addition, there is a need for standardized annotation that can be incorporated into function annotation on a large scale. Finally, there is a need to assess the quality of the available function prediction software.

For these reasons and many more, automated protein function prediction is rapidly gaining interest among computational biologists in academia and industry.

The Automated Function Prediction Special Interest Group (AFP SIG) has been part of ISMB since 2005. We call upon all researchers involved in gene and protein function prediction and annotation, both computational and experimental, to submit an abstract to the AFP meeting. Authors of select abstracts will be invited to give a talk and/or present a poster.

We will also be discussing the upcoming second Critical Assessment of Function Annotations, or CAFA. CAFA 1 was a highly successful experiment, engaging 30 groups worldwide, and has resulted in 16 peer-reviewed papers in Nature Methods and BMC Bioinformatics:

http://www.nature.com/nmeth/journal/v10/n3/full/nmeth.2340.html

http://www.biomedcentral.com/bmcbioinformatics/supplements/14/S3

 

We are looking forward to a new and expanded CAFA 2 in 2013-2014, which will include a cellular component prediction track, and a human-specific track.

 

For further instructions on AFP 2013, please go here: http://BioFunctionPrediction.org

We are looking forward to seeing you in Berlin!

Iddo Friedberg, co-chair, on behalf of the AFP 2013 organizing committee

For continuing information, please subscribe to the following Google Group:  https://groups.google.com/forum/?fromgroups#!forum/afp-cafa

Contact: afp.cafa.2013@gmail.com


Some omics words we would like to see

Advertisomics: environmental sequencing aimed at obtaining popular press coverage with little or no scientific value. Samples obtained from an environment otherwise not of microbiological interest. “Hey, did you hear they swabbed the car wheels in the building’s parking lot and found that the microbes all cluster by tire brand name?”

Celebromics: sequencing the genome or microbiome of a celebrity. Generally the sequence is not even published, but just the act of sequencing it provides publicity for the lucky lab, the celeb, and maybe even a microbial species or two. “They sequenced the genome of Keith Richards, and found a duplicated set of multiple drug resistance genes.”

Contaminomics: sequencing results published prematurely, whose major finding is later discovered to be the result of contamination.

DuhOmics: unsurprising results from a genomic study. Usually confirming common knowledge that did not require a genomic study in the first place. 

Lazarusomics: sequencing the genome of an extinct animal, including hominids, with the implicit or explicit promise that we will be able, very soon, to reverse the extinction.

Shockomics: related to advertisomics. Sequencing for shock value and pop publicity. Usually involving human or animal bodily secretions or parts you’d rather not have known about.

TooMuchInformationOmics: A result of the personal genomics and microbiome industry. No, I am not interested in that heel spur gene that you got from your grandmother, nor am I interested in the novelty of the chlamydia strain they found in your partner’s microbiome.

ZZZomics: an omics paper that makes you fall asleep halfway through the introduction.


 

 


Critical Assessment of Genome Interpretation, 2013

From the organizers of CAGI 2013. I attended CAGI in 2010 and 2011, and even participated as an assessor. It’s a fun meeting, and if your work involves prediction of phenotypes from genotypes, there is still time (just about) to accept some of the challenges.

The Critical Assessment of Genome Interpretation (CAGI) is a community
experiment to assess computational methods for predicting the
phenotypic impacts of genomic variation. The current CAGI experiment
has eight open challenges, available on the CAGI website:
https://genomeinterpretation.org/

In the CAGI experiment, participants are provided genetic variants and
make predictions of resulting phenotypes. Independent assessors then
evaluate these predictions against experimental characterizations.
The primary goals of the experiment are to establish the current
state of the art, identify bottlenecks in genome interpretation,
inform critical areas of future research, and connect researchers
from diverse disciplines whose expertise is essential for advancing
methods for interpreting genomic variation.

The deadline for current CAGI predictions is 28 March 2013.
Anonymous submissions, with limitations, are allowed this year.
https://genomeinterpretation.org/content/anonymity-policy
We encourage use of both established methods and experimental
approaches, and we welcome predictors of all backgrounds.

The current CAGI experiment will culminate in a conference in Berlin,
on 17-18 July 2013, immediately before the ISMB SIGs. An NHGRI R13
grant will help support travel and participation in the meeting.
https://genomeinterpretation.org/content/cagi-2012-conference

Previous CAGI experiments have highlighted striking breakthroughs
as well as disappointing failures. Publications from the previous
CAGI are underway; slide and poster presentations about CAGI may
be found at:
https://genomeinterpretation.org/content/cagi-presentations
The results from the current CAGI challenge will be published as well.

The currently open CAGI challenges are:

+ Seventy-seven PGP genomes (provided by George Church).
Challenge: Predict clinical phenotypes from genome data, and match
individuals to their health records.
https://genomeinterpretation.org/content/PGP2012

+ Exomes of Crohn’s disease patients and healthy individuals (provided
by Andre Franke). Challenge: predict which individuals have Crohn’s.
https://genomeinterpretation.org/content/new-crohns-dataset

+ Exomes from two families with lipid metabolism disorders (provided
by John Kane and Pui-Yan Kwok). Challenge: predict lipid profiles
and a causative variant.
https://genomeinterpretation.org/content/FCH
https://genomeinterpretation.org/content/HA

+ Variants in DNA double-strand break repair genes (provided by Sean
Tavtigian). Challenge: predict probability of each variant occurring
in a breast cancer case versus healthy control.
https://genomeinterpretation.org/content/MRN

+ Mutations in p53 gene exons affecting mRNA splicing (provided by
Jeremy Sanford). Challenge: predict how variants impact splicing.
https://genomeinterpretation.org/content/Splicing-2012

+ Variants of a p16 tumor suppressor protein (provided by Silvio
Tosatto). Challenge: predict how well variants inhibit cell
proliferation.
https://genomeinterpretation.org/content/p16_2012

+ Shewanella oneidensis MR-1 gene disruptions (provided by Adam Arkin).
Challenge: Predict impact of microbial gene disruptions on cell
growth under stress conditions
https://genomeinterpretation.org/content/MR-1_2012

+ riskSNPs disease-associated loci (provided by John Moult). Challenge:
identify potential causative SNPs.
https://genomeinterpretation.org/content/risksnps2012

We are also soliciting challenges for the next CAGI. Please contact us
at cagi@genomeinterpretation.org with proposals for suitable datasets.

In order to access the current challenges and submit predictions for CAGI,
please register at https://genomeinterpretation.org/.

Registered users also have access to presentations from the previous
CAGI conferences, as well as posters and talk slides that summarize
the results.

Sincerely,

Daniel Barsky, CAGI 2012 Organizer
Steven E. Brenner, CAGI Chair
John Moult, CAGI Chair

cagi ‘at’ genomeinterpretation ‘dot’ org

 

 


Adding supplementary tables and figures in LaTeX

Here is a problem I just ran into while revising a paper and adding a supplement at the editor’s request: how do I number my tables and figures as Table S1, S2, etc.? A solution was posted on Stack Exchange, but the syntax did not work with my version of LaTeX, and I don’t like \makeatletter (here’s why). Here is a working solution to supplementary figure and table numbering. Place this bit in your document preamble:

\newcommand{\beginsupplement}{%
        \setcounter{table}{0}%
        \renewcommand{\thetable}{S\arabic{table}}%
        \setcounter{figure}{0}%
        \renewcommand{\thefigure}{S\arabic{figure}}%
     }

Then, when your supplement starts, just add the line:

\beginsupplement

Voila!  Instant “Table S1” and “Figure S1”. Enjoy.


The Black Queen Hypothesis

ResearchBlogging.org

“Well, in our country,” said Alice, still panting a little, “you’d generally get to somewhere else — if you run very fast for a long time, as we’ve been doing.”

“A slow sort of country!” said the Queen. “Now, here, you see, it takes all the running you can do, to keep in the same place. If you want to get somewhere else, you must run at least twice as fast as that!”

Through the Looking-Glass, and What Alice Found There, Lewis Carroll

The Red Queen hypothesis is well-accepted in evolutionary biology. Organisms evolve and adapt not to gain an evolutionary advantage, but simply to not fall behind competing organisms that evolve and adapt. Hence, everyone has to “run as fast as they can” (evolve) to “stay in the same place” (reproduce).  It’s a nice hypothesis, and has been shown to be fairly descriptive when dealing with close competitors, such as host-parasite or predator-prey relationships.

Which is why the title of this paper published in mBio has piqued my interest: “The Black Queen Hypothesis: Evolution of Dependencies through Adaptive Gene Loss”. What is the Black Queen hypothesis?

The Red Queen in Alice was a chess piece (and not, as the authors say in the paper, a card). The Black Queen is from a card game: namely, the Queen of Spades in the game of Hearts. Hearts is a card game for three to five players, and the idea is to avoid taking tricks containing certain cards. Anything in the hearts suit is bad, with one penalty point per card. But the Queen of Spades is particularly horrible, with 13 penalty points. Thus, the goal is to avoid taking hearts or the Queen of Spades.

Are you kidding? I spent three weeks at Camp Winiwinaia on Lake George the summer I was twelve. YMCA camp — poor kids’ camp my mother called it. It rained practically every day, and all we did was play Hearts and hunt The Bitch.

Hearts in Atlantis, Stephen King

The authors of the paper use Hearts to set up a model explaining reductive evolution in bacteria. Why would some lineages of free-living bacteria lose genes? How does a trick-avoidance card game tie into evolution?

In the context of evolution, the BQH (Black Queen Hypothesis IF) posits that certain genes, or more broadly, biological functions, are analogous to the queen of spades. Such functions are costly and therefore undesirable, leading to a selective advantage for organisms that stop performing them. At the same time, the function must provide an indispensable public good, necessitating its retention by at least a subset of the individuals in the community—after all, one cannot play Hearts without a queen of spades.

One such Black Queen card is the catalase-peroxidase gene, katG. katG protects against hydrogen peroxide (H2O2), a toxic byproduct of marine photosynthesis. The catalase-peroxidase protein is iron dependent, and its synthesis can be expensive, especially in an iron-poor environment. Two common marine cyanobacteria are Synechococcus and Prochlorococcus, which are typically found in the same communities. Most Prochlorococcus lack the katG gene in their genome, while Synechococcus do have it. It seems that in ocean-surface communities, Synechococcus is holding the katG Black Queen gene in the game, while Prochlorococcus has elegantly avoided taking that costly card. Synechococcus is the workhorse that reduces the toxic H2O2, while the katG-deficient bacteria enjoy the benefit shared by all. So it is best to be a member of a lineage that avoids having katG, while living in close proximity to the lineages that have it. The figure below shows that the entire Prochlorococcus clade (green) lacks katG, but (presumably) living in a community with Synechococcus allows it to benefit from the katG gene carried by the latter.

Comparison between the phylogenies of the catalase-peroxidase and small subunit rRNA genes for cyanobacteria with sequenced genomes. Although there are some differences in branching order between the two trees, the marine Synechococcus KatG proteins form a well-supported monophyletic clade, implying that this protein was present in the clade’s ancestor and was subsequently lost in several lineages (indicated by red dots on the rRNA tree), including Prochlorococcus. Green, representatives of the Prochlorococcus clade; orange, marine Synechococcus clade; cyan, other Cyanobacteria. Bootstrap values less than 75% are omitted. Only the tree topologies are shown; branch lengths do not represent genetic distances.


 

Leaky Functions

The authors talk about “leaky functions”: functions that benefit the community in a way that is unintentionally altruistic. If an organism has the ability to protect against H2O2 extracellularly, and that organism lives in a community, others will benefit. However, the BQH model predicts that lineages will continue to lose leaky functions, as long as at least one lineage maintains them, benefiting the community. Should that lineage lose the leaky function, or be removed from the community, the effects could be devastating to the community now lacking that leaky function. In other words, leaky-function species eventually become keystone species of their ecosystem.

My two cents’ worth: I like the model. Like any good model, it provides us with testable hypotheses, and if it works well it will offer predictive power for evolutionary changes in microbial communities. It can explain the rarity of some essential functions in a microbial community, and possibly why so many microbes fail to grow in pure culture. Time will tell how well this model will work. My only problem is that I am not sure I agree with the title the authors gave their model. Getting the Queen of Spades in Hearts is devastating to your hand (you basically lose). That would be the genetic equivalent of a cell going apoptotic (killing itself) following a cancer mutation or a viral infection. The BQH model is more subtle: an evolutionary cost-benefit model via the leaky function mechanism. Maybe the “volunteer fire-brigade hypothesis”? Or a generic “it’s a dirty job but somebody’s got to do it” hypothesis?

Morris, J., Lenski, R., & Zinser, E. (2012). The Black Queen Hypothesis: Evolution of Dependencies through Adaptive Gene Loss mBio, 3 (2) DOI: 10.1128/mBio.00036-12


A Belated Valentine’s Day Post

This is romantic!  So listen up!

A 3D heart shape may be drawn using the following implicit function:

$$\left(x^2 + \frac{9}{4}y^2 + z^2 - 1\right)^3 - x^2 z^3 - \frac{9}{80}y^2 z^3 = 0$$

Or, in Python:

def heart_3d(x,y,z):
    # float constants: under Python 2, 9/4 and 9/80 would be truncated to 2 and 0
    return (x**2+(9.0/4)*y**2+z**2-1)**3-x**2*z**3-(9.0/80)*y**2*z**3

Trouble is, there is no direct way of graphing implicit functions in Python. But anything can be found on Stack Overflow.
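
Since the surface is just the set of points where heart_3d returns zero, a quick numerical sanity check is possible before any plotting. The sketch below assumes only NumPy; the grid size and the names axis and inside are arbitrary choices made for illustration, not part of the Stack Overflow recipe.

```python
import numpy as np

def heart_3d(x, y, z):
    # Same implicit function as above, written with float constants so
    # that 9/4 and 9/80 are not truncated by Python 2 integer division.
    return ((x**2 + (9.0 / 4.0) * y**2 + z**2 - 1)**3
            - x**2 * z**3 - (9.0 / 80.0) * y**2 * z**3)

# Points that lie on the surface evaluate to zero.
print(heart_3d(1.0, 0.0, 0.0))
print(heart_3d(0.0, 0.0, 1.0))

# The sign tells inside from outside.
print(heart_3d(0.0, 0.0, 0.0))  # negative: inside the heart
print(heart_3d(2.0, 0.0, 0.0))  # positive: outside the heart

# Evaluate on a coarse 3D grid over the same bounding box used for plotting
# and mark which grid points fall inside the heart.
axis = np.linspace(-1.5, 1.5, 61)
X, Y, Z = np.meshgrid(axis, axis, axis, indexing="ij")
inside = heart_3d(X, Y, Z) < 0
print("fraction of the box inside the heart:", inside.mean())
```

The plotting approach works on the same principle: it slices the box into planes and draws, in each plane, the contour where the function crosses zero.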

Putting it all together:

#!/usr/bin/env python
from mpl_toolkits.mplot3d import Axes3D  # registers the '3d' projection on older matplotlib
import matplotlib.pyplot as plt
import numpy as np

def heart_3d(x,y,z):
    # float constants: under Python 2, 9/4 and 9/80 would be truncated to 2 and 0
    return (x**2+(9.0/4)*y**2+z**2-1)**3-x**2*z**3-(9.0/80)*y**2*z**3

def plot_implicit(fn, bbox=(-1.5,1.5)):
    ''' create a plot of an implicit function
    fn  ...implicit function (plot where fn==0)
    bbox ..the x,y,and z limits of plotted interval'''
    xmin, xmax, ymin, ymax, zmin, zmax = bbox*3
    fig = plt.figure()
    ax = fig.add_subplot(111, projection='3d')
    A = np.linspace(xmin, xmax, 100) # resolution of the contour
    B = np.linspace(xmin, xmax, 40) # number of slices
    A1,A2 = np.meshgrid(A,A) # grid on which the contour is plotted

    for z in B: # plot contours in the XY plane
        X,Y = A1,A2
        Z = fn(X,Y,z)
        cset = ax.contour(X, Y, Z+z, [z], zdir='z',colors=('r',))
        # [z] defines the only level to plot for this contour for this value of z

    for y in B: # plot contours in the XZ plane
        X,Z = A1,A2
        Y = fn(X,y,Z)
        cset = ax.contour(X, Y+y, Z, [y], zdir='y',colors=('red',))

    for x in B: # plot contours in the YZ plane
        Y,Z = A1,A2
        X = fn(x,Y,Z)
        cset = ax.contour(X+x, Y, Z, [x], zdir='x',colors=('red',))

    # must set plot limits because the contour will likely extend
    # way beyond the displayed level.  Otherwise matplotlib extends the plot limits
    # to encompass all values in the contour.
    ax.set_zlim3d(zmin,zmax)
    ax.set_xlim3d(xmin,xmax)
    ax.set_ylim3d(ymin,ymax)

    plt.show()

if __name__ == '__main__':
    plot_implicit(heart_3d)

Show this to your date on the next Valentine’s Day, because it is too late for this one. Trust me, results are guaranteed. Not sure what kind of results though.


“The thing that doesn’t fit is the most interesting”

Richard Feynman passed away twenty-five years ago today. His legacy lies not only in physics but, for more people perhaps, in his ability to communicate science and the love of science. One of my favorite Feynman moments is in this video. I show it to students in a course section I teach about the scientific method:

Richard Feynman, May 11, 1918 – February 15, 1988


The scientific process

Found on 9gag.com.

EDIT: as pointed out by Jason McDermott, hypothesis should probably be used here instead of theory.

From 9gag.com. Click image for original



Life Stands on the shoulders of Giants (Viruses)

ResearchBlogging.org

Back to ancient life: what exactly defines life, and where does life end and non-life begin? It is one of my favorite subjects, and the one about which I am the least knowledgeable. Doesn’t stop me writing about it though.

Viruses are… well… not really life. Or so says common wisdom. They have some elements of life: a genome, the ability to reproduce, and being subject to evolution by natural selection. But they cannot reproduce independently: they need to hijack the reproductive machinery of an actual living cell to do that. They do not have a metabolism: they are basically syringes with DNA or RNA, equipped with basic sensors that help them lock onto cells and use them to reproduce, usually destroying their hosts in the process. One thought is that viruses evolved as segments of DNA and RNA that managed to mobilize themselves between different cells. This is the escape hypothesis of viral evolution. Evidence for this hypothesis, at least for certain types of viruses, lies in the very small number of genes many viruses have: HIV only has 4 genes (or more like 10, depending on how you count). HIV and similar retroviruses appear to originate from mobile RNA elements or retrotransposons, which use a small number of genes to replicate themselves within genomes. Indeed, it is estimated that 42% of the human genome is composed of some kind of retrotransposon, and in wheat retrotransposons constitute 90%(!) of the genome. Retroviruses like HIV may be retrotransposons that managed to escape the confines of a single organism, using a rudimentary protein vehicle to transport themselves. They still replicate by integrating into the genome of their host.

Continue reading Life Stands on the shoulders of Giants (Viruses) →


A bit more on writing bioinformatic research code

There has been a lot of discussion recently on this blog and others on the need for robust scientific software. Most of the discussion I have been involved in comes from bioinformaticians, because, well, I am one. There has been plenty of talk about code robustness, sharing, and replicability vs. reproducibility. I do not want to get into this whole debate again, although it is a worthy debate with plenty of interesting things coming out of it. I do want to talk a bit about how to write research software the way I understand it: mostly short scripts with a short lifespan. Those are written in the lab for analyzing data. They are not written as a software product to be used by other labs (although they may eventually be, and you should keep that in mind as you code along). Also, they are not written as computer-science-y software aimed at trying out a new algorithm or optimizing it. So I am not talking about optimizing a new assembly algorithm or such.

I am writing about the software you write to answer a biological question using the sequence data, structure data and/or metadata that you have. Those are the programs that your lab writes in Python, Perl, R or Matlab to classify, analyze, quantify and otherwise shake-and-bake the raw pink data you have until it is nicely and evenly cooked with a fine brown glaze of information coating.

http://umami.typepad.com/umami/images/2007/09/03/oscars_grilled_duck.jpg

Publication-ready.

FSM help you if you write in Perl, though. (Kidding, or maybe not. This post, like most of my blog, science, and life is heavily Python-biased.)

I am also not writing about the code you should have ready when you are publishing: this will come up in a followup post. I am talking about starting out and moving along in your project. And the list is quite partial, with obvious lacunae listed at the end.

Continue reading A bit more on writing bioinformatic research code →


Reproducible research software: some more thoughts

So there was a lot written over the blogosphere, twittersphere and what-have-you-sphere about the need to publish code in scientific research. The latest volley was fired from a post at biostars.org by “JermDemo”, which also mentioned my post on making accountable research software by forming a volunteer “Bioinformatics Testing Consortium”. (My post, not my idea.) I won’t get into its content, since it is not the main point of this post. You can read it if you like; there are some interesting points there.

Anyhow, Leighton Pritchard commented on his ambivalence about “reproducible research software” on his Facebook page. I thought I’d reproduce it (ha!) here (almost) verbatim, as I share many of Leighton’s thoughts and qualms. I especially share his ambivalence, so to me it is refreshing to have a non-dogmatic position voicing some concerns from both sides of the argument. Thanks for letting me post your thoughts, Leighton.

Continue reading Reproducible research software: some more thoughts →

Group review, continued

I love it when other people use my ideas, especially before I think them up. After my previous post advocating group review of scientific articles, it was pointed out to me that two journals are already using group reviews to referee their papers. One is Frontiers (which is a collection of journals, rather than a single journal), the other is eLife. I have written about Frontiers in a late correction to the previous post. Frontiers is a bit like PLoS-One in terms of the criteria it uses for accepting papers: the science has to be solid, but impact is not considered. They do have a post-publication tiering system, where the articles of higher impact, novelty, and interest “climb” up.

@iddux – sounds like what you are describing is close to the process used by @frontiersin journals – worth taking a look at their system

— Casey Bergman (@caseybergman) December 23, 2012

 

eLife is the recently launched, long-awaited “superjournal”, published under the auspices of HHMI and Max-Planck Institutes, which proposes to be a top-tier, highly selective journal with a fast peer-review system. (As an aside, Leighton Pritchard wrote a thoughtful post on the implications of such a journal when it was first announced.) Apparently, group review is part of their review process. Here is eLife’s video explaining their pipeline.

And here is Idan Segev, editor in chief of Frontiers Neuroscience, explaining the motivation behind Frontiers Journals peer-review system:

It seems like group peer-review is taking off.  We’ll probably find some snags along the way, but I believe that overall it’s a good thing.

 
