Computational Bridge to Experiments
This is a meeting I am really happy to be part of, and even more honored to co-organize. One of my main scientific interests is predicting the function of genes and proteins whose function is unknown.
Some background information: we have sequenced more than 1000 microbial genomes, and hundreds of plant and animal genomes. Additionally, we have millions of partial DNA sequences, RNA sequences, proteins, genomic fragments and millions of genes sequenced from metagenomic data. The problem: for most of these sequenced genes, we do not know what they do. That’s right: most of the sequence data that we have is just that: data. Not information. We are amassing an ever-growing collection of books written in a mostly incomprehensible language. We know (or make an educated guess) where the words in those books (the genes) are located, because we have sequence signals that indicate where the bits of DNA that code for genes are. For some of the words, we know the meaning. But in many cases (and by some estimates in most cases) we fail to understand the meaning of the words (genes) in those books (genomes).
Drawing further on the book<–>genome and word<–>gene metaphor, we sometimes know one meaning of a word, but words in human languages can hold different meanings depending on context. “Whatever floats your boat” can be read literally, but more often this particular collection of words in this order is a figure of speech. The same goes for genes: a gene may code for a certain enzyme that catalyzes a simple chemical reaction. But in another context, it may perform a developmental function for the whole organism, which has different implications from those at the biochemical level alone.
We can’t rely on computational means alone to find out what’s doing what. Bioinformatics can help us annotate genes that are similar to those already discovered, and in some cases give us new insights into the function of unknown genes. But for truly novel functions, and to know whether our boat is real or a metaphor for “what works best”, we may need to run experiments. And we need a good collaboration between those who do the computational work and those who do the experimental work, to identify which are the most important books to look at and which words in them we need to decipher first.
The COMBREX meeting aims to start this large-scale and long-term decoding, a collaboration between experimentalists and computational biologists.
Note that the COMBREX workshop is part of the larger Microbial Genomics meeting at Lake Arrowhead, California.
Here is the announcement. Feel free to cut & paste and forward:
Announcing the first COMBREX Workshop for Computational and Experimental Determination of Protein Function. September 15, 2010, Lake Arrowhead, California, USA
COMBREX (Computational Bridge to Experiments) is a new NIH-funded effort that aims to increase the pace of experimental determination of the function of large and high-priority gene families in bacterial genomes. The principal investigators are Richard Roberts (New England Biolabs), Simon Kasif (Boston University) and Martin Steffen (Boston University). This effort will form a consortium of experimental and computational biologists who will collaborate directly to test the predicted functions or specificity of high-priority genes.
Central to this effort will be the creation of a community web-based database that will allow computational and experimental scientists to communicate easily, and will assist experimentalists in identifying high-priority genes with high-quality computational predictions. Experimentalists will be able to submit bids (proposals) to validate individual predictions, and if successful, will receive modest funding from COMBREX to perform the validation.
The website can be found at http://combrex.bu.edu/ .
A workshop to discuss issues related to the formation and operation of COMBREX will take place on Wednesday, September 15, 2010, as part of the 18th Annual International Meeting on Microbial Genomics at Lake Arrowhead, CA, outside of Los Angeles. A preliminary program can be found at http://www.mimg.ucla.edu/arrowhead2010/program.html (COMBREX was formerly SciBay). Confirmed speakers include Richard Roberts, Simon Kasif, Manuel Ferrer (CSIC, Madrid), Patricia Babbitt (UCSF), John Gerlt (Illinois), Peter Karp (SRI), Alexander Yakunin (Toronto), Steven Brenner (UC Berkeley) and Bruno Sobral (Virginia Tech).
The morning session will provide an overview of COMBREX, including both the experimental and computational challenges, related talks, and a description of topics to be discussed by breakout groups. These groups will convene in the afternoon to discuss the topics and prepare a short summary for presentation to the entire workshop after dinner.
Topics to be discussed by the breakout groups will roughly divide into the following areas: (1) whole genome annotation, (2) assessment of computational predictions, (3) use of structure to predict function, and (4) infrastructure for function annotation. General topics to be discussed include:
1. How to prioritize predictions?
2. How to evaluate experimental bids?
3. How to handle non-enzymatic proteins?
4. How best to handle predictions/phenotypes from high-throughput experimentation?
A key desired outcome of the workshop is the identification of opportunities for, and the catalysis of, collaborations between computational and experimental biologists.
We hope you will be able to join us for this event. You can register at: http://www.mimg.ucla.edu/arrowhead2010/registration.html
For further information please contact the organizers:
Co-chairs: Martin Steffen, Boston University, steffen ‘at’ bu ‘dot’ edu
Iddo Friedberg, Miami University, i.friedberg ‘at’ muohio ‘dot’ edu
Steering Committee: Simon Kasif and Richard J. Roberts