CACAO: Community Assessment of Community Annotation with Ontologies
I’m at College Station airport, Texas, waiting for my delayed flight and hoping that the weather in Dallas lets up within the hour. A good time to take a break and blog. College Station is the home of Texas A&M University, which is a place I am always happy to visit. The scientists here are full of creative energy and great ideas. I met some old colleagues, and some new ones.
One really cool project is run by Dr. Jim Hu, Dr. Debby Siegele, Dr. Adrienne Zweifel and Dr. Brenley McIntosh. CACAO is an annotation competition and an educational effort. Students around the world form teams which compete to correctly annotate a given set of genes. The gold standard is the experimental literature: there are many papers that have the correct experimental data about gene and gene product function, but they are impossible to text-mine automatically. This is where competing teams of students come in: they receive a set of genes, and employ all means to look them up and properly annotate them. The genes the students are given are initially annotated by homology only: each gene was assigned a function based on the similarity of its sequence to another gene that is annotated with a function. IEA, or Inferred from Electronic Annotation, is the most common method by which annotations are assigned to genes. It is also the least reliable, as it is not scrutinized by any human, and errors can and often do creep in. The students look for research papers that report experimental evidence for the genes, and correct or validate their annotation.
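For the programmers among you, here is a minimal sketch of how one might pick out genes whose only annotations carry the IEA evidence code. It assumes the annotations come as tab-separated GAF lines, where the third column is the gene symbol and the seventh is the evidence code. The function name and the sample records are made up for illustration, and this is not necessarily how the CACAO organizers choose their gene sets.

```python
from collections import defaultdict

def genes_with_only_iea(gaf_lines):
    """Return gene symbols whose annotations all carry the IEA evidence code."""
    codes = defaultdict(set)
    for line in gaf_lines:
        # Skip blank lines and GAF header/comment lines, which start with "!"
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 7:
            continue  # malformed record, ignore it
        symbol, evidence = cols[2], cols[6]
        codes[symbol].add(evidence)
    return sorted(gene for gene, seen in codes.items() if seen == {"IEA"})

# Made-up records for illustration only (not real annotations)
sample = [
    "!gaf-version: 2.2",
    "\t".join(["ExampleDB", "X0001", "geneA", "enables", "GO:0003677", "REF:1", "IEA"]),
    "\t".join(["ExampleDB", "X0002", "geneB", "enables", "GO:0003677", "REF:2", "IEA"]),
    "\t".join(["ExampleDB", "X0002", "geneB", "enables", "GO:0003677", "REF:3", "IDA"]),
]

print(genes_with_only_iea(sample))  # ['geneA']
```

Here geneB is left out because one of its annotations already has experimental support (IDA, Inferred from Direct Assay), so only geneA still needs a human to check it against the literature.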
The scoring system is deliciously cut-throat: after a team posts its own annotations, the other teams can look at them. And if team B finds an error in team A’s annotations, team A’s points are transferred to team B.
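The point-transfer rule can be sketched in a few lines of Python. This is only a toy model of the rule as described here: the `Scoreboard` class, the 10-point value, and the gene identifier are all hypothetical, and the real competition surely has more detailed rules.

```python
from collections import defaultdict

POINTS_PER_ANNOTATION = 10  # hypothetical value, not the real CACAO scoring

class Scoreboard:
    def __init__(self):
        self.scores = defaultdict(int)
        self.holder = {}  # annotation id -> team currently holding its points

    def post(self, team, annotation_id):
        """A team posts an annotation and earns points for it."""
        self.holder[annotation_id] = team
        self.scores[team] += POINTS_PER_ANNOTATION

    def challenge(self, challenger, annotation_id, error_confirmed):
        """Move the points to the challenger if the reported error is confirmed."""
        current = self.holder[annotation_id]
        if not error_confirmed or challenger == current:
            return False
        self.scores[current] -= POINTS_PER_ANNOTATION
        self.scores[challenger] += POINTS_PER_ANNOTATION
        self.holder[annotation_id] = challenger
        return True

board = Scoreboard()
board.post("team A", "gene1")
board.challenge("team B", "gene1", error_confirmed=True)
print(dict(board.scores))  # {'team A': 0, 'team B': 10}
```

A challenge that is not confirmed leaves the scores untouched, so in this model only a confirmed error moves points.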
CACAO is primarily an educational effort: it teaches students to explore the scientific literature in depth and to learn how scientists discover how genes work. And because the teams have an incentive to be critical of each other’s annotations, the annotations are of very good quality.
So if you are a life-science teacher with a good group of students who would be interested in this challenge, Hu & gang are looking for CACAO competitors for the spring semester. Also, if you are a student, see if you can talk to the faculty in your program about participating in CACAO and getting some academic credit for it.
More on CACAO, including contact information, at the CACAO site.
Also, here is a presentation, given by Prof. Hu at the 2010 GO CAMP.