Crowdsourcing Genomics II: Unveiling HINdeR and Phrux

About this time last year, I posted about a new course I was going to teach, Phage Genomics. Briefly:

Phage isolation, electron microscopy, DNA sequencing in the first semester, annotation and comparative genomics in the second. And I get to teach the bioinformatics bit: annotation and comparative genomics. Woo-hoo! The great thing about this course is that, unlike most lab courses, the students (and faculty) will be setting up experiments intended not only to teach, but also to discover something new. Also, the results of the research are meaningful. Genomics data generated by student participants will be used by other researchers to answer medical, ecological, and evolutionary scientific questions.

The students isolated, sequenced and annotated two previously unknown mycobacteriophages, HINdeR and Phrux. The links are to the Mycobacteriophage Database, phagesdb.org, where the sequences and associated metadata (where and when HINdeR and Phrux were found and isolated) can be found. The annotations will be there shortly.

I had a great time teaching this course, together with Mitch Balish from my department, who is not only a great teacher, but shares my vice for keeping the students guessing when we are being serious and when we are kidding.  Mitch is the guy with the goatee in the short sleeved shirt; I’m the one in the black sweatshirt. Here’s what the students had to say about the course (original site at Miami University). Mitch starts talking at 2:57, I’m at 4:08, Gary Janssen (who taught the first semester) is at 5:08:


Repost: the Scope(s) of Substance

This tweet from Neil deGrasse Tyson jolted me from a pleasant rest before tomorrow’s race:

 

…which led to the (in)famous Scopes Trial. On May 5, 1925, John Scopes was charged and subsequently tried, found guilty, and fined $100 for teaching evolution, a violation of Tennessee’s Butler Act. The trial became a battleground for science vs. religion, evolution vs. creationism, and the interpretation of the Establishment Clause and Freedom of Speech in the US Constitution.

I published a blog post two years ago, on the 85th anniversary of the trial, July 2010. Today  marks the 87th anniversary of the arrest, so it seems like a good occasion to repost. Especially since there is still some work needed in the area of teaching evolution:

Source Wikimedia Commons. Credit: John D. Croft. Based on: New Scientist Magazine 2006 191:2565 p11

 

To follow is the original post: “The Scope(s) of Substance”,  from July 29, 2010. Still relevant, I believe:


 

Bora Zivkovic, the BUCA (Best Universal Common Ancestor) of science bloggers, has tagged this blog with a Blog of Substance award. As a grateful recipient of this award I am obligated to do two things:
1. Sum up my blogging motivation, philosophy and experience in exactly 10 words.
2. Pass this award on to 10 other blogs.

Of course, I never do anything without researching it first, because I am such an awesome scientist, or detail-oriented !@#*^, depending on whether you ask me or my students. So I looked up “substance” in the Merriam-Webster dictionary. Here is what I found:

Main Entry: sub·stance
Pronunciation: \ˈsəb-stən(t)s\
Function: noun
Etymology: Middle English, from Anglo-French, from Latin substantia, from substant-, substans, present participle of substare to stand under, from sub- + stare to stand — more at stand
Date: 14th century

1 a : essential nature : essence b : a fundamental or characteristic part or quality c Christian Science : god 1b
2 a : ultimate reality that underlies all outward manifestations and change b : practical importance : meaning, usefulness
3 a : physical material from which something is made or which has discrete existence b : matter of particular or definite chemical constitution c : something (as drugs or alcoholic beverages) deemed harmful and usually subject to legal restriction

4 : material possessions : property

Hmmm… 2a and 2b seem to be relevant. Perhaps 3c should be too, as my blogging could be construed as harmful to other more productive activities, which I am obviously not engaged in at this moment. Actually you, gentle reader, are not engaged in more productive activities either right now. Be that as it may, the word substance does seem to have an air of permanence about it, which is contrary to the perceived ephemeral nature of blogging. Bora is actually one of the people who are doing something about making blogs less ephemeral by publishing The Open Laboratory collection (full disclosure: I’m published in the 2009 book) and by supporting science bloggers, blogging and activities wherever they may be. This makes me so happy to be among Bora’s chosen 10 (OK, 11, he cheated a bit) among the hundreds of blogs he must be reading. Thanks Bora!

I do wonder though, eighty-five years from now, how many of us science bloggers would be remembered for our blogging? Well, maybe not as individuals, but what kind of impact are we having now, and how much will it remain 85 years from now? Hopefully as a collective, science bloggers are impacting the understanding of science, which is one of the reasons I am blogging. Hopefully, we do have substance, as a group if not as individuals.

Why eighty-five years? Well, the answer to that brings me to the main topic (substance?) part of this post, which is the anniversary of the Scopes trial. This month, 85 years ago, a schoolteacher in Tennessee was convicted of a high misdemeanor for violating the State of Tennessee’s Butler Act which prohibited the teaching of evolution in any of the state’s public schools and universities. He was fined $100.

PUBLIC ACTS

OF THE

STATE OF TENNESSEE

PASSED BY THE

SIXTY – FOURTH GENERAL ASSEMBLY

1925

________

CHAPTER NO. 27

House Bill No. 185

(By Mr. Butler)

AN ACT prohibiting the teaching of the Evolution Theory in all the Universities, Normals and all other public schools of Tennessee, which are supported in whole or in part by the public school funds of the State, and to provide penalties for the violations thereof.

Section 1. Be it enacted by the General Assembly of the State of Tennessee, That it shall be unlawful for any teacher in any of the Universities, Normals and all other public schools of the State which are supported in whole or in part by the public school funds of the State, to teach any theory that denies the story of the Divine Creation of man as taught in the Bible, and to teach instead that man has descended from a lower order of animals.

Section 2. Be it further enacted, That any teacher found guilty of the violation of this Act, Shall be guilty of a misdemeanor and upon conviction, shall be fined not less than One Hundred ($ 100.00) Dollars nor more than Five Hundred ($ 500.00) Dollars for each offense.

Section 3. Be it further enacted, That this Act take effect from and after its passage, the public welfare requiring it.

Passed March 13, 1925

W. F. Barry,

Speaker of the House of Representatives

L. D. Hill,

Speaker of the Senate

Approved March 21, 1925.

Austin Peay,

Governor.

Seems incredible in this day and age… or maybe not so incredible given recent events in Louisiana.

William Jennings Bryan, counsel for the prosecution, attacking evolution

The city of Dayton as the organ grinder profiting from the Scopes trial

The trial, which originated as something of a publicity affair for the town of Dayton, Tennessee, quickly became a battleground for evolution vs. creation. In the short term, the trial actually increased the number of anti-evolution bills proposed in different state legislatures in the US. In the long term, however, Tennessee vs. Scopes is seen as a watershed moment in the teaching and public acceptance of evolution, and has had long-term ramifications in the US and internationally. Scopes himself spoke only once at the trial, was not called to testify, and only had this to say when granted a statement after sentence was passed:

Your honor, I feel that I have been convicted of violating an unjust statute. I will continue in the future, as I have in the past, to oppose this law in any way I can. Any other action would be in violation of my ideal of academic freedom — that is, to teach the truth as guaranteed in our constitution, of personal and religious freedom. I think the fine is unjust.

Now that is substance.

Back to the award; I still have some conditions to fulfill:

1. Sum up your blogging motivation, philosophy and experience in exactly 10 words.

¹Blogging ²motivation, ³philosophy ⁴and ⁵experience ⁶cannot ⁷be ⁸summed ⁹in ¹⁰ten ¹¹words.

2. Pass this award on to 10 other blogs

Given the 10^n growth rate of tagged blogs, chain-letter fashion, I wonder how this Blogging with Substance award originated. Search engines were no help, as so many blogs are now tagged with the Blogging with Substance award. If someone has an answer, let me know. Anyhow, here are my 10 tags, based on what I am reading nowadays, ephemerality of blogging substance, and all that jazz. Tough choices though, so many good blogs out there:

1. Blue Collar Bioinformatics

2. Sandwalk

3. Thoughtomics

4. The Loom

5. Mike the Mad Biologist

6. Genomics, Evolution and Pseudoscience

7. Circle of Complexity

8. Buried Treasure

9. The Tree of Life

10. Mystery Rays from Outer Space

Final word: if this post seems a bit confused, and you are not sure that you are “getting it”, well, that’s this post’s substance.


The Inside Poop

It’s pretty much common knowledge that mother’s milk is the healthiest food for infants, and that it bestows health benefits upon mother and baby that formula feeding cannot match. The unique combination of lipids, sugars, proteins and antibodies is not even close to being rivaled by baby formula manufacturers. With few exceptions, such as when there is a concern that the mother is contagious and may infect the baby, breastmilk is the recommended diet for infants.

As I am interested in things microbiological, I have been especially interested in the effect of breastmilk on the baby gut and gut microbiota. There have actually been quite a few studies on that, but most of these studies were about the gut microbiota only. However, we can’t really separate our gut from the microbes that reside in it. The bacteria in the human gut affect the gut (and, in turn, the entire body) and are affected by it. The gut is really a superorgan, composed of a minority of human cells, and 10^14 bacterial cells. Most of the gut is actually bacteria, not human, but the part that is human is important, since, well, it’s “us”. (Well, kinda hard to tell now which “us” is “us” and which “us” is “the bacteria that live in us”.) To understand what goes on there we need to study both bacterial and human cells. While adult microbiota+gut systems have been studied, mostly for the effect of probiotics, there have not been studies of baby guts because you cannot perform consented invasive procedures on babies. In other words, you cannot scrape their colons for gut lining, or epithelial, cells. So there has not been much of an opportunity to study the gut epithelium+microbiome in human infants.

The opportunity came with Robert (“Robb”) Chapkin from Texas A&M University, and Sharon Donovan from the University of Illinois at Urbana-Champaign. Robb has developed a system to isolate gut epithelial cells from the feces. We shed millions of cells from our gut when we defecate, and Robb’s lab has a way to fish those gut lining cells out of the stool. Thus, we can sequence the mRNA, and find out which genes are transcribed in the baby gut. At the same time, we can analyze the baby’s microbiome. Enter Sharon Donovan’s lab, which has studied 12 babies: six were breastfed and six were formula fed.

This is where Robb contacted me, and generously invited me to College Station, Texas about a year and a half ago. Aside from enjoying Texan hospitality (big steaks) and meeting people, Robb brought me into this fascinating study. They needed a bioinformatician to help analyze the gut transcriptome and gut metagenome data. I am very glad they contacted me, since this started a very enjoyable collaboration and a scientific journey whose results are published this week in Genome Biology. I was put in touch with two great statisticians, Ivan Ivanov and Scott Schwartz, also at Texas A&M. We put our heads together, and came up with a strategy.

Analysis flowchart. Reproduced from Genome Biology 2012, 13:R32 doi:10.1186/gb-2012-13-4-r32 under BMC CC2.0 license.

 


First, we analyzed the microbiome data, using several standard pipelines, like MG-RAST for function analysis (thanks to the folks at Argonne National Lab for making MG-RAST happen and for all their support), and PhymmBL and GreenGenes for taxonomic analysis. The gut transcriptome data were already available, as part of a previous study. Our next step was to look at the distribution of bacterial phyla in the babies, and at whether the types of bacteria they had in their guts had anything to do with their diet.

So here is what we found. First, most breastfed babies had a greater variety of bacterial phyla between them than formula-fed babies. Probably because the formula babies were all fed the same diet, whereas breastmilk composition varies between women. Second, the breastfed babies were richer in gram negative bacteria. Those are bacteria with a thin cell wall, a double cell membrane, and which have certain features that the gram positives (thick cell wall, single membrane) do not have.  Also,  almost all breastfed babies had a richer gut ecosystem.

 

Firmicutes and Actinobacteria are gram+; Proteobacteria and Bacteroidetes are gram-. FF-formula fed babies, BF-breastfed babies. Genome Biology 2012, 13:R32 doi:10.1186/gb-2012-13-4-r32

We then moved on to look at the genetic potential of the gut microbiome: how do the microbial communities differ between the breastfed and bottle-fed babies in terms of what they can do? The strongest difference between breastfed babies and bottle-fed babies was in the presence of virulence genes, and mostly those typical of gram-negatives: Type III & IV secretion systems. There were other differences, such as in carbohydrate processing enzymes. But the kicker was that the differences in the frequency of virulence genes in the microbiome also correlated well with the expression of immunity-related genes in the infant gut epithelial cells.

Reproduced from Genome Biology 2012, 13:R32 doi:10.1186/gb-2012-13-4-r32

Reproduced from Genome Biology 2012, 13:R32 doi:10.1186/gb-2012-13-4-r32

 

We observed the following: 1. Certain gram negative bacteria are dominant in the breastfed babies. 2. Bacterial genes having to do with virulence were more abundant in the bacterial communities of breastfed babies. 3. When looking closely at those genes, we saw that most of them were the virulence factors typical of gram negative bacteria (OK, not surprising given point [1] above, but a good verification). 4. At the same time, the breastfed babies expressed genes that had to do with immunity in their gut lining (epithelial) cells. The presence of virulence genes, and the expression of immunity genes in the gut epithelium correlated quite strongly (see B, below).
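To make "correlated quite strongly" a little more concrete, here is a minimal sketch of a plain Pearson correlation between two per-baby measurements. This is only an illustration: the numbers are invented toy values, not data from the study, and the statistical analysis in the paper itself may well differ from a simple Pearson coefficient.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical toy values, one pair per baby (NOT from the paper):
# relative abundance of virulence genes in the microbiome, and
# expression level of an immunity-related gene in the gut epithelium.
virulence_abundance = [0.8, 1.1, 1.9, 2.4, 3.0, 3.6]
immunity_expression = [5.1, 5.9, 7.2, 7.8, 9.1, 9.9]

r = pearson(virulence_abundance, immunity_expression)
print(f"Pearson r = {r:.2f}")
```

A value of r close to +1 means the two measurements rise together across babies; it says nothing by itself about which one drives the other.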

 

Reproduced from Genome Biology 2012, 13:R32 doi:10.1186/gb-2012-13-4-r32

 

Taken together, this tells us that the following scenario may apply: mother’s milk tends to enrich certain types of gram negative bacteria, and those, in turn, stimulate the babies’ immune system. It’s as if the mother’s milk is setting up an immunity boot camp for the breastfed babies.

We got all sorts of feedback and even a bit of media coverage on this study. I was really happy when this study hit Reddit. Reddit is an aggregation site where anyone can submit any kind of story, and the “redditors” vote it up or down. Highly voted submissions are more visible, and get discussed more on the site. Generally, having a submission receive many “upvotes”, in Reddit parlance, shows an interest. (Well, the highest upvotes tend to go to pictures of funny kittens, but still.) The story made it to the top of the r/science category (also known as a “subreddit”) with over 1300 upvotes. I logged in using my real name, and referred people to another subreddit, called IAmA (“I am a…”). In this case: “I am a scientist who worked on this study, ask me anything”. There were quite a few questions, and it was a very interesting engagement with people about this work. Hopefully, good PR :) and science communication.

 


Schwartz, S., Friedberg, I., Ivanov, I., Davidson, L., Goldsby, J., Dahl, D., Herman, D., Wang, M., Donovan, S., & Chapkin, R. (2012). A metagenomic study of diet-dependent interaction between gut microbiota and host in infants reveals differences in immune response. Genome Biology, 13 (4) DOI: 10.1186/gb-2012-13-4-r32

 


It’s a smORF world, after all?


Here is a study that looked for a type of gene that the authors felt was neglected by classic genomic annotation. The research shows how they employed concepts in molecular evolution to validate the existence of these genes.

Some background: the first question we ask after assembling a genome is: “where are the genes”? Not an easy question to answer, since a gene is classically defined as a unit of heredity. It may code for RNA, protein, or sometimes, nothing at all. The actual implementation of the “unit of heredity” can take several physical forms, each one of them different. Therefore, the algorithms for finding genes would depend on exactly which type of gene one is looking for.

A somewhat more tractable question is: “where are the open reading frames”? Open reading frames or ORFs are those stretches of DNA that code for proteins. Indeed, most gene calling software actually identifies ORFs. There are many attributes that go into an ORF calling algorithm: the frequency of the bases (or k-mers of bases) in the suspected coding regions, the signals for the beginning and ends of introns, the existence of non-coding regions that aid transcription such as promoters and enhancers, the location on the chromosome with relation to other ORFs, and the length of the final product. The latter criterion is actually quite important, as many ORF-calling algorithms will discount anything coding for a protein that is shorter than 100 amino acids as being “too short”. The reason for employing this length cutoff is that the number of false positives increases dramatically when ORFs coding for proteins shorter than 100aa (or 300 nucleotides) are called. Therefore, most gene-callers would just tend to discard any short peptides.
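To illustrate just the length criterion, here is a deliberately naive sketch of an ORF scanner: it reports intron-less stretches that run from an ATG to the next in-frame stop codon on either strand, and keeps only those within a length window. Real gene callers use many more signals, as described above; the function name, parameters and toy sequence are my own inventions for illustration.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def find_orfs(seq, min_nt=300, max_nt=None):
    """Return (strand, start, end, orf) for each ATG...stop stretch.

    Length is counted in nucleotides, start and stop codons included.
    Coordinates are 0-based on the strand that was scanned, so hits on
    the "-" strand refer to the reverse-complemented sequence.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i                      # open a reading frame
                elif start is not None and codon in STOPS:
                    length = i + 3 - start
                    if length >= min_nt and (max_nt is None or length <= max_nt):
                        hits.append((strand, start, i + 3, s[start:i + 3]))
                    start = None                   # close it again
    return hits

# A toy sequence holding one 36-nucleotide ORF.
toy = "CC" + "ATG" + "GCT" * 10 + "TAA" + "GG"
print(find_orfs(toy))                         # default 300 nt cutoff
print(find_orfs(toy, min_nt=30, max_nt=300))  # a small-ORF window
```

With the default 300-nucleotide cutoff the short ORF in the toy sequence is silently dropped; lowering the cutoff recovers it, at the price (on real genomes) of many more chance hits.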

But throwing away the baby with the bathwater is not a good solution, since short peptides are known to be responsible for many of life’s activities: mating pheromones, small compound transporters, hormones, neurotransmitters and regulation of other proteins’ activities, to name a few. Many of these short peptides are the result of the cleavage of larger proteins, which means that the ORFs encoding them are originally longer than 300bp. But some may actually have their own ORFs, coding only for them. How can we find those small ORFs, or smORFs? How many of them are there? Is the number of smORFs large enough to make it worth re-annotating genomes?

Gene Structure. Source: Wikimedia Commons. Credit: Forluvoft

Emmanuel Ladoukakis from the University of Crete and colleagues from the University of Sussex, UK have set up a bioinformatic pipeline to look for smORFs in the Drosophila melanogaster genome. Bear with me, there are a few steps in this pipeline. But there’s a lot to learn about genomics just from looking at what they did, and why they took those steps.

Here’s what they did:

1) Find smORF candidates: they looked for all potential smORFs (starting with a start codon and ending with an in-frame stop codon, 30-300bp long) in those parts of D. melanogaster’s genome that were annotated as non-coding. To keep things simple, they looked only for intron-less smORFs: smORFs that are encoded consecutively in the DNA. They found 593,586 potential sequences.

2) Remove transposons: they then removed all those that had a similarity to transposons. Transposons are DNA elements that multiply in the chromosome: something like an internal virus, only usually benign. They may carry bits of other genes they “grab” on the way, but they are not functional. They were left with 556,554 sequences.

3) Big step: look for homologs in another fly species. They then looked for smORFs with similar translated amino-acid sequences in D. pseudoobscura, which diverged from D. melanogaster 25 to 55 million years ago. The reason they looked for similar amino-acid sequences was that if there is selection to conserve a smORF, it would be on the protein, and not at the DNA level. This step reduced the number of smORF candidates by about 92%: from 556,554 down to 43,210.

4) Another big step: keep only global alignments. By requiring alignments of whole smORF sequences, not only partial local similarities, they were left with 4,561 smORF candidates. This reduced the number of candidates by about 89% from step (3). We are now down to 0.8% of the original 593,586 smORF candidates.

Quite a filtering process. Note the huge elimination: 99.2% of all initial smORF candidates are gone. I believe that they decided to sacrifice sensitivity in favor of specificity.
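For those who like to check the arithmetic, here is a small sketch that recomputes the attrition at each of the first four steps. It uses only the candidate counts quoted above; the step labels are my own shorthand, and the percentages it prints are derived from those counts rather than taken from the paper.

```python
# Candidate counts after each filter, as quoted in the post (steps 1-4).
funnel = [
    ("all candidate smORFs", 593_586),
    ("transposon-like removed", 556_554),
    ("homolog in D. pseudoobscura", 43_210),
    ("global alignment", 4_561),
]

def summarize(steps):
    """Return (name, count, % removed at this step, % of initial remaining)."""
    rows = []
    initial = steps[0][1]
    previous = initial
    for name, count in steps:
        removed_pct = 100 * (1 - count / previous)
        remaining_pct = 100 * count / initial
        rows.append((name, count, removed_pct, remaining_pct))
        previous = count
    return rows

for name, count, removed, remaining in summarize(funnel):
    print(f"{name:30s} {count:>8,d}  "
          f"removed at this step: {removed:5.1f}%  "
          f"left overall: {remaining:5.1f}%")
```

The last column is the number to watch: it shows how little of the initial candidate pool survives once conservation is required.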

So they had 4,561 smORF candidates conserved between two flies. Still, how many ORFs got in by chance? Hard to know, but they continued to rely on evolutionary conservation as a guideline. There may be smORFs that appeared independently in melanogaster and pseudoobscura after they separated 25 to 55 million years ago, but the main evidence for true smORFs would be their evolutionary conservation between the two fly species.

To get even more specific, they now 5) looked for shared synteny:  conservation not only of sequence, but also of the genomic context: the sequences surrounding it. That brought the number down to 3,314.

OK, so they looked for conservation based on homology and based on synteny. Anything more? Well, yes. The next step would be to 6) look for evolutionarily selected smORFs. The two evolutionary criteria they used until now were homology and synteny. Now comes a third: selection. If smORF candidates are actually coding, they will be subject to purifying selection, that is, to selection that eliminates deleterious mutations. This is evident in a low rate of non-synonymous vs. synonymous substitutions, or a Ka/Ks ratio of << 1. (Read about Ka/Ks ratios also here.) Finally, 7) by looking at what actually gets transcribed in Drosophila (using transcriptome data), the number was whittled down to a final 401.
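To give a feel for the Ka/Ks criterion in step 6, here is a simplified sketch in the spirit of the Nei-Gojobori counting method with a Jukes-Cantor correction. It assumes two already-aligned, gap-free coding sequences and, to stay short, skips codons that differ at more than one position. It is not the procedure used in the paper, and the two toy sequences are made up.

```python
import math
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codon -> one-letter amino acid ('*' = stop).
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def syn_sites(codon):
    """Number of synonymous sites in a codon (between 0 and 3)."""
    total = 0.0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                total += 1 / 3
    return total

def jukes_cantor(p):
    """Correct an observed proportion of differences for multiple hits."""
    return -0.75 * math.log(1 - 4 * p / 3)

def ka_ks(seq1, seq2):
    """Return (Ka, Ks) for two aligned, gap-free coding sequences."""
    if len(seq1) != len(seq2) or len(seq1) % 3:
        raise ValueError("sequences must be aligned and a multiple of 3 long")
    S = N = syn_diffs = nonsyn_diffs = 0.0
    for i in range(0, len(seq1), 3):
        c1, c2 = seq1[i:i + 3], seq2[i:i + 3]
        s = (syn_sites(c1) + syn_sites(c2)) / 2
        S += s
        N += 3 - s
        if sum(a != b for a, b in zip(c1, c2)) != 1:
            continue  # identical, or multi-hit codon (skipped in this sketch)
        if CODON_TABLE[c1] == CODON_TABLE[c2]:
            syn_diffs += 1
        else:
            nonsyn_diffs += 1
    if S == 0 or N == 0:
        raise ValueError("no synonymous or no non-synonymous sites")
    return jukes_cantor(nonsyn_diffs / N), jukes_cantor(syn_diffs / S)

# Hypothetical toy alignment: two synonymous changes, one non-synonymous.
a = "ATGGCTGGTCTGAAAGAT"
b = "ATGGCCGGACTGAAAGAA"
ka, ks = ka_ks(a, b)
print(f"Ka = {ka:.3f}, Ks = {ks:.3f}, Ka/Ks = {ka / ks:.2f}")
```

A ratio well below 1 means amino-acid-changing substitutions are rarer than silent ones once the number of available sites is accounted for, which is the signature of purifying selection. On sequences as short as smORFs the estimate is noisy, so a sketch like this is only a first look.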

Search pipeline for Drosophila smORFs. Diagram of the smORF search pipeline followed in this study. The percentages of smORFs passing each filter are indicated. For full details, see Results and Materials and methods. CDS, coding DNA sequence; Dm, Drosophila melanogaster; Dp, Drosophila pseudoobscura; Ka/Ks, ratio of non-synonymous (Ka) to synonymous (Ks) nucleotide substitution. Ladoukakis et al. Genome Biology 2011 12:R118 doi:10.1186/gb-2011-12-11-r118

So the chosen 401 smORFs are evolutionarily conserved, both in sequence and in synteny, subject to purifying selection (by Ka/Ks ratio) and produce a transcript. The authors obviously went for specificity over sensitivity: they looked for “good bet” smORFs rather than a large number of candidates. What I like about this study is the way that the authors used a large number of evolutionary traits as attributes for identifying smORFs. They also were careful to rule out, as much as possible, that these smORFs may be a result of a larger transcript. This is a really nice piece of molecular evolution work. There is no experimental evidence yet of the functionality of these smORFs: that is left to future proteomic studies and fly geneticists. But the idea of a small(er) world of genes, hiding in plain sight among the more familiar large ones, does have its appeal, and may yield some surprises about how our genomes are structured.

Finally, for the evolutionary biologists: read the paper; there is quite a lot more to it than what I wrote. I just gave the highlights.

 


Ladoukakis, E., Pereira, V., Magny, E., Eyre-Walker, A., & Couso, J. (2011). Hundreds of putatively functional small open reading frames in Drosophila. Genome Biology, 12 (11) DOI: 10.1186/gb-2011-12-11-r118

 

http://genomebiology.com/2011/12/11/R118/abstract

 

 


And I should go because?

Found this in my inbox:

Dear Dr.Iddo Friedberg,    

Greeting from OMICS Group!

I came across your contribution entitled “Biopython: freely available Python tools for computational molecular biology and bioinformatics” published in the Journal of Bioinformatics and thought your expertise would be an excellent fit for Toxicology-2012 Conference that OMICS Group is hosting.

 

I’m just wondering how many legitimate calls for participation I am missing due to the increasing amounts of conference spam in my inbox.


Biocuration 2012

 

Great meeting:  Biocuration 2012, Georgetown University, DC.  When I leave a meeting with my head exploding with new ideas and a need to try them all out at once, I know I got my money’s worth, and then some. Even a three hour flight delay followed by discovering my car with a dead battery at 1am at the deserted Dayton Airport parking lot did not dampen my enthusiasm upon return. I will make sure my dome light is off before I leave my car  the next time though. To follow are bits and pieces from the meeting I enjoyed. I’m doing this mostly from memory, two days later, so I may have an addendum once I get my notes together.

What is biocuration? Well, anything that has to do with annotating, labeling, indexing, identifying biological entities. Almost exclusively genes in this conference. Genome databases, especially those of model organisms, employ curators to annotate, check and re-annotate the genomic data. Here’s a more elaborate explanation, taken from the website of the International Society for Biocuration:

Biocuration involves the translation and integration of information relevant to biology into a database or resource that enables integration of the scientific literature as well as large data sets. Accurate and comprehensive representation of biological knowledge, as well as easy access to this data for working scientists and a basis for computational analysis, are primary goals of biocuration.

The goals of biocuration are achieved thanks to the convergent endeavors of biocurators, software developers and researchers in bioinformatics. Biocurators provide essential resources to the biological community such that databases have become an integral part of the tools researchers use on a daily basis for their work.

 

Day 1 started off with many community annotation tools. I thought that the Wikipedia model for annotation was dead, but maybe I’m wrong. Many community efforts use a large number of experts, as opposed to a huge number of non-experts, which is what the speakers at the first session were discussing. Pombase (whose title drew some chuckles from the French speakers at my table), the Tetrahymena Genome Database Wiki and the Gene Wiki were presented. The Gene Wiki, presented by Andrew Su from TSRI, is a bona-fide crowdsourcing approach, not just Wikipedia-like but actually comprised of a set of 10,000 gene definition stubs folded into Wikipedia. Jennifer Harrow from Sanger presented a poster with an accession model of annotations: the “blessed annotator” who has been trained for 3 months and has the run of the wiki, and the “gatekeeper”, who has been trained in a 2-day workshop, and whose contributions need to be monitored. Lots of talks about trusted annotators, etc. Perhaps we should look to cryptography’s “circles of trust” to enable trusted annotations yet increase the number of curators. (I use “curation” and “annotation” interchangeably throughout.)

An afternoon workshop discussed who biocurators are. If you are a biocurator, there’s a good probability you are 31-50 years young (80%), female (60%), with a PhD (76%), been through the academic mill and found it to be a bad fit for one reason or the other. You like your work, you rarely burn out, it is challenging and stimulating, you are not in it for the money. (Few people in non-industry science are.) Actually, since non-profit science is run on soft money, funding is a serious concern, and your job may have a shorter half-life than you would care for it to have, as you are probably employed on a 3-5 year contract. Your boss is rarely a biocurator her/himself, which may mean that your job description may sometimes be ill-defined.

After  that, there was a  whole session devoted to curation workflows and tools. If  you are setting up your own genomic database, check these out: WebApollo,  CvManGO and the Reactome. Attila Csordas from EBI presented PRIDE, a tool for curating proteomic data. While proteomic data are growing, there are few choices of software tools to annotate them. So PRIDE is a welcome player in the field.

 Day 2 had a “Genomics, metagenomics comparative genomics” session, only without the metagenomics. :(  What I really liked was the ViralZone resource for viral genomes, out of SIB. High time someone did this for the most abundant biological particle on Earth, and the one responsible for most diversity in life.

The breakout sessions were my favorite, getting a chance to interact with like-minded people interested in similar questions. (That is, those that share my prejudices.) I went to the one organized by Marc Robinson-Rechavi and Frederic Bastian, which dealt with the question of quality in gene annotation. Here is the problem: when we annotate a gene with a function (or functions), we also need to say what evidence brought us to think that this gene does what it does. The most popular vocabulary for annotating genes is the Gene Ontology or GO. GO provides us with evidence codes which allow the curator to say what the evidence is for the function they assign to a gene. Those range from experimental evidence codes such as “inferred from mutant phenotype”, which are always entered by a human curator, to “Inferred from Electronic Annotation”, which has no human oversight. These evidence codes are used as a proxy for quality: people generally tend to accept that evidence from an experiment may be stronger evidence that a gene does what it does than an electronic one. That may not necessarily be true. Take, for example, high-throughput experiments that result in many genes getting assigned annotations wholesale. Even with an (uncharacteristically low) 5% error rate, a single paper used as a source from which 5,000 genes are annotated would result in 250 wrongly annotated genes. In addition, these types of experiments supply annotations that are not very specific, such as “protein binding” or “embryonic development”, terms that in many cases are too general to be useful. On the other hand, Nives Škunca of ETH Zurich has shown a beautiful study about how fully automated annotations may not be as inferior to human-curated ones as most people think, with some caveats. (Note: Nives also showed her work in a poster that won the best poster award at the meeting, and this work has just been accepted to PLoS Computational Biology. I will try to blog more about it once it’s published, it’s really brilliant.) The discussion revolved around how we should ascertain the quality of annotations, what would be considered a useful annotation, and how we can establish trustworthiness. Seems like there is quite a bit of work to be done, as people are only beginning to realize that this is a more complex problem than we thought. A major player in this will be the Evidence Ontology or ECO, an elaborate ontology in the making describing lines of evidence for gene annotation.

Day 3: Atilla Csordas, whom I mentioned earlier, organized an unconference session early in the morning. A few of us gave brief talks there. Ben Good from Andrew Su’s lab talked about biocuration through games. The idea is to do for biocuration what fold.it has done for protein folding. The Dizeez game quizzes you about diseases related to genes, and scores you according to how well you link genes to diseases. But as Andrew says on his blog:

 Generally, the gene-disease links in structured databases will be reasonably correct (though likely not at all complete). When we analyze the game logs in aggregate, we expect that players’ answers will generally reinforce what’s already known. But given enough game player data, also expect that we’ll see multiple instances of gene-disease links that aren’t reflected in current annotation databases. And these are candidate novel annotations.
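The aggregate analysis described in that quote can be sketched in a few lines of Python. This is only an illustration of the idea, not the Su lab's actual pipeline: the log format, the placeholder names `GENE_X`, `GENE_Z`, and `disease_y`, the function `candidate_novel_links`, and the `min_votes` threshold are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical "structured database": links that are already annotated.
known_links = {("BRCA1", "breast cancer"), ("CFTR", "cystic fibrosis")}

# Hypothetical game log: one (gene, disease) pair per player answer.
game_log = [
    ("BRCA1", "breast cancer"),
    ("BRCA1", "breast cancer"),
    ("CFTR", "cystic fibrosis"),
    ("GENE_X", "disease_y"),
    ("GENE_X", "disease_y"),
    ("GENE_X", "disease_y"),
    ("GENE_Z", "disease_y"),
]

def candidate_novel_links(log, known, min_votes=2):
    """Return (link, votes) for links that players proposed at least
    min_votes times but that are absent from the known set."""
    counts = Counter(log)
    return [
        (link, n)
        for link, n in counts.most_common()
        if n >= min_votes and link not in known
    ]

candidates = candidate_novel_links(game_log, known_links)
print(candidates)
```

Answers that match the database simply reinforce what is known; only links that recur across players and are missing from the database come out as candidates. The single `GENE_Z` answer is dropped as probable noise.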

So there may be something there, although it is not the “wisdom of the crowds” that is being exploited, since I imagine that only people with advanced degrees in their field can contribute to Dizeez. You can see games from the Su lab on genegames.org. Sean Mooney from Buck talked about the Statistical Tracking of Ontological Phrases (STOP) project. The idea here is to automatically enrich GO annotation of genes with other ontologies, to get a more comprehensive description of their function, especially when it comes to disease.  I talked about the Critical Assessment of Function Annotations (we finally submitted the paper, yay!).  Atilla talked about annotating proteomic data.

Great meeting. A big thank you to the organizers, it went without a hitch. Logistics, food, coffee were all fantastic. Looking forward to Cambridge next year! EDIT: a virtual special issue of Database has been published for this meeting. Some of the talks are there as papers. Open Access, of course.

Finally, my favorite promotional item from the meeting:

 


You. Want. This. Job.

NSF grant funded, woohoo! Now I am hiring a programmer. So if you want to be part of a dynamic, growing lab, do lots of interesting stuff and upgrade yourself from just a great bioinformatician to a super-bioinformatician, this job’s for you.  You’ll be working primarily on microbial genome evolution, including setting up a kick-butt multi-genome database, and all sorts of interesting distractions.  See below for the nitty-gritty. Original ad here: https://www.miamiujobs.com, job posting number: 0001377 . Pass on to interested parties. Three year position, renewable annually.

Microbiology: Scientific Programmer/Specialist to implement and maintain a genomic database web site; implement data management tools including relational database management applications for efficient storage and retrieval of genomic data; perform other duties as related to the position such as data and project management to ensure data are being processed in an efficient and timely manner; contribute to writing scientific manuscripts.

Required qualifications: BS or BA in Computer Science, bioinformatics, or a related discipline; demonstrated programming experience, particularly in Python and SQL databases; demonstrated web programming experience; knowledge of Linux/Unix; excellent spoken and written communication and documentation skills.

Preferred qualifications: Advanced degree (M.Sc. or Ph.D) or equivalent in Computer Science, Bioinformatics, Molecular Biology or a related discipline; experience in development of bioinformatic algorithms; knowledge of R programming; experience in development of or contribution to open source projects; experience in collaborative software development such as the use of version control software, writing and following software specifications, participation in code review; knowledge of basic molecular biology; experience with genomic browser programming, such as GMOD or equivalent.

Candidates should send a CV or resume and have three letters of reference sent separately to Dr. Iddo Friedberg at Friedberg.lab.jobs ‘at’ gmail ‘dot’ com. Screening of applications begins April 14, 2012 and will continue until the position is filled.

Miami University is an affirmative action/equal opportunity employer with smoke-free campuses. Consumer Information http://www.miami.muohio.edu/about-miami/publications-and-policies/student-consumer-info/. Hard copy upon request.

Ad in PDF.


Dirty Genomics


Repost: a very loose and circular association to Pi Day

ResearchBlogging.org

(Originally published March 14, 2009)

Happy Pi (π) Day! Americans write dates in the MM/DD/YYYY format instead of the DD/MM/YYYY format used by the rest of the world.  Usually a rather painful and confusing format if you did not grow up with it, causing checks to bounce and leases to expire for those who recently moved to the US, but it has a few benefits: you can take the numeric representation of March 14, and you have the first three digits of Pi. This coincidence is good enough to celebrate a day around the uber-celebrity of numbers. (Heh, I said  “around”). Everybody’s welcome.

This is the day all geeky bloggers come out and try to: (1) show how smart they are; (2) connect Pi, usually in some improbable and tenuous fashion, to whatever theme they have in their blogs and (3) try to make an original observation of pi no one else has made before. So that is exactly what I am going to do today.

Sort of.

Well,  probably not.

Smarts

Well, I remembered Pi day, didn’t I? OK, that does not show I’m smart, just shows my brain is a repository of useless trivia. Look at the time of publication of this post: March 14, 1:59am, which is 3.14159. Hey, five digit time stamp, that’s smart! (Not very original though; also, I’m actually up at this time finishing a grant proposal.)

Human Rhinovirus capsid. Not a perfect sphere, but close connection to blog theme

A post with a less than tenuous connection to Pi

Some virus capsids are icosahedral. Not really spherical but sort-of. Bacteria have flagella motors that are circular. Micelles are usually spherical.  Microvesicles are spherical. All these are a good start for pi-topics.

Well, too bad. I actually want to write about circular proteins. Only “circular” in this case does not mean “circle shaped”:  hence, we are chucking Pi out the window right now. Stick around though, these proteins are really cool.

Formation of a peptide bond

You were probably taught that proteins are linear chains of amino acids that fold into a shape that produces their function. The links connecting the chains are peptide bonds. But there is no real reason why the carboxy terminus (right side) and amino terminus (left side) could not bond to each other. It had just never been observed, or looked for. Well, they do. And some proteins are circular, like a snake biting its own tail.

Structure and sequence of the cyclotide kalata B1

These cyclotides are very robust. For one, they are almost immune to proteases: enzymes that break up proteins. Many proteases attack the ends of the protein (exoproteases, because they start from the “outside”), but there are no ends to attack here. Their disulfide bonds and short length make them resistant to endoproteases as well as to heat, pH, etc.

What do cyclotides do?

They protect the organism that produces them. All kingdoms of life produce cyclotides, everything from bacteria to Rhesus monkeys. (Actually, I am not sure about Archaea.) Cyclotides seem to act through different mechanisms: some form holes in the membrane of the attacking microbe; plant cyclotides stunt the growth of feeding caterpillars. Interestingly, the same plant peptide, Kalata B1, induces uterine contractions in mammals. This is how it was discovered: a physician working in the Democratic Republic of Congo noticed that laboring women were drinking tea made from Oldenlandia affinis to induce childbirth. The active ingredient was the first cyclotide to be discovered. Since then, cyclotides have been shown to be antibiotic, antiviral and insecticidal.

Do humans produce cyclotides?

I could not find anything about that in the literature. So I took the amino acid sequence of a recently discovered monkey cyclotide, rhesus theta defensin 1 (RTD1), and BLASTed it (TBLASTN: protein vs. nucleotide) against the human genome. No results. Of course, this 5-minute trial proves very little. TBLASTNing short sequences (RTD1 is only 18 aa long) is a bit sticky. If you are a beginning bioinformatics student looking for a course or rotation project, finding candidate cyclotides in humans (or in other genomes) might be a good idea. There are about 100 known sequences, so quite a bit for a training set to start from. You can build a profile or an HMM, and do some more sensitive searches.
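To give a flavor of what "build a profile" means, here is a minimal Python sketch of a position-specific scoring matrix built from aligned peptides and slid along a target protein. Everything in it is an assumption for illustration: the four training peptides and the target are invented toy strings, not real cyclotide sequences, the background is taken as uniform, and the search runs against a protein rather than a translated genome. A real project would more likely use the known sequences with dedicated profile or HMM software.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_profile(aligned, pseudocount=1.0):
    """Per-position log-odds scores against a uniform background."""
    length = len(aligned[0])
    assert all(len(s) == length for s in aligned), "sequences must be aligned"
    background = 1.0 / len(AMINO_ACIDS)
    total = len(aligned) + pseudocount * len(AMINO_ACIDS)
    profile = []
    for i in range(length):
        counts = Counter(seq[i] for seq in aligned)
        profile.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile

def best_hit(profile, target):
    """Slide the profile along target; return (score, start) of the best window."""
    width = len(profile)
    hits = []
    for start in range(len(target) - width + 1):
        window = target[start:start + width]
        score = sum(col.get(aa, 0.0) for col, aa in zip(profile, window))
        hits.append((score, start))
    return max(hits)

# Invented toy peptides standing in for an aligned training set.
training = ["GCRCLC", "GCKCIC", "ACRCVC", "GCRCIC"]
profile = build_profile(training)

# Invented target protein containing one motif-like stretch.
target = "MKTLLVAGCRCICPEDQW"
score, start = best_hit(profile, target)
print(start, target[start:start + len(profile)], round(score, 2))
```

Unlike a single-sequence BLAST query, the profile rewards the conserved positions (here the cysteines) and tolerates variation at the others, which is what makes this kind of search more sensitive for short peptides.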

But what about Pi?

Sigh.. well, here is an XKCD oldie but goldie nerd litmus test… enjoy…


Trabi, M. (2002). Circular proteins — no end in sight Trends in Biochemical Sciences, 27 (3), 132-138 DOI: 10.1016/S0968-0004(02)02057-1

Pelegrini, P., Quirino, B., & Franco, O. (2007). Plant cyclotides: An unusual class of defense compounds Peptides, 28 (7), 1475-1481 DOI: 10.1016/j.peptides.2007.04.025

Wang, C., Hu, S., Martin, J., Sjogren, T., Hajdu, J., Bohlin, L., Claeson, P., Goransson, U., Rosengren, K., Tang, J., Tan, N., & Craik, D. (2009). Combined X-ray and NMR analysis of the stability of the cyclotide cystine knot fold that underpins its insecticidal activity and potential use as drug scaffold Journal of Biological Chemistry DOI: 10.1074/jbc.M900021200


The Origin of Gender Symbols in Biology

ResearchBlogging.org

A quick post for International Women’s Day: how did the gender symbols originate in biology? What do ♀ and ♂ actually stand for?

The answer starts in antiquity, when planets and gods were almost synonymous. Religious rites (at least in Europe) were also associated with the working of metals. Thus, each heavenly body was associated with a metal and a god, and given its own symbol:

1. Sun (gold) 2. Moon (silver) 3. Saturn (lead) 4. Jupiter (tin) 5. Mars (iron) 6. Mercury (mercury, duh) 7. Venus (copper). After woodcuts by Fritz Kredel, published in Stearn 1962.

 

But how did the symbols of Mars (iron) and Venus (copper) migrate to describe sex in biology? It seems obvious to us that, of all symbols, that of the god of war would be assigned to male, and that of the goddess of love to female (stereotypes notwithstanding), but who was the first to do that?

The answer can be traced to one of the greatest biologists of all time: Carl Linnaeus. He is better known for being the father of modern taxonomy: Linnaeus is the reason that we uniquely identify organisms using genus and species names in Latin grammatical form, a system known as Linnaean binomial nomenclature. From Homo sapiens to Escherichia coli, we all owe our scientific names to Linnaeus.

But Linnaeus was also the one to appropriate the planet symbols to biology. In his notes, he used the Venus symbol as shorthand for female and the Mars symbol as shorthand for male. He also used Saturn to denote woody plants, the Sun for annual plants and Jupiter for perennials. As for gender, the Mercury symbol was used by Linnaeus for hermaphrodite plants. However, that symbol’s meaning has changed over the years, at least in scientific shorthand, and is now used to denote virgin female (e.g. in genetic analysis).  Mars was also used by Linnaeus, somewhat confusingly, for biennial plants.

But how did the symbols actually originate? The accepted thought now is that they were derived by the Romans from the Greek initial letters for the planets / deities. So Phosphoros Φωσφόρος (Greek: “Morning Star” or later the planet Venus) was abbreviated to Φκ and Thouros (Mars) to θρ, which were further contracted over the years by metal workers, astrologers and alchemists to the modern symbols.

Kronos (Saturn); Zeus (Jupiter); Thouros (Mars); Phosphoros (Venus); Stilbon (Mercury). After Stearn 1962

 

William T. Stearn (1962). The Origin of the Male and Female Symbols of Biology Taxon, 11 (4), 109-113


Microbial Art

 

We have some really talented students in our department. And I don’t just mean the science. I am honored to present the colorful and hilarious microbial artwork of Amber Beckett. Created between gel runs at Natosha Finley’s lab:

Cereal Dilutions. Credit: Amber Beckett

 

Pepe the protein. Credit: Amber Beckett

StreptoCOWcus and Bortadella persussis. Credit: Amber Beckett

 

 


Does Open Access benefit small universities?

There has been quite a lot of chatter recently about different scientific publishing models, prompted by Elsevier’s support for the Research Works Act and the resulting proposed academic boycott.

Let there be no mistake: I value the Open Access (OA) model of publication, for both moral and practical reasons that have been elaborated upon in many other places. Briefly: 1) OA is the right thing to do, as the results of publicly funded research should be available to the public and 2) it is the practical thing to do, as the broadest sharing of knowledge possible is fundamental to educational and scientific advances.

This post deals with the difficulty that still exists on the ground. I see the current author-pays model of OA publishing as still somewhat problematic, with the result of driving many of my colleagues away from OA. One supporting argument for OA is that small universities, four-year colleges, and institutes in developing countries can ill afford to subscribe to a large number of closed-access publications. This places researchers and students at a disadvantage. Therefore, the OA model of publishing does them a favor by reducing subscription fees, granting broader access to publications.

On the flip side, it is in those very same institutes that researchers have less “disposable income” to pay for publications. $2500, a typical OA fee, for a lab funded by an R15 (small NIH grants given to less research-intensive institutes) or a small NSF grant is a larger chunk of change than $2500 for a lab holding a couple of R01s (the larger NIH “workhorse” grant). Knowing the limit on these grants, a researcher squeezed for funding would rather budget for an extra month for a graduate student than for OA publication fees. In a way, OA fees are something of a regressive tax: they hurt those with less disposable income more. The OA advocates would say that the money saved by the institute from the reduction of library fees can be rolled into subsidizing publications. Some institutions do that by subscribing to OA journals, thus reducing the publication fees their authors are required to pay. However, many do not, and a 10% discount on $2500 still leaves $2250 to pay.

Yes, PLoS grants “hardship” fee waivers, but many other publishers do not. However, requesting waivers in many cases is something of a dilemma: Prof. SmallU may have the $2500, but using those towards publication would mean running out of lab materials earlier than needed, or letting a graduate student off for the summer. In many cases it is not that the money is not there (after all, Prof. SmallU did manage to fund the research!) but facing this tough choice is problematic, and many people would be reluctant to ask for a subsidy or a waiver. Also, there is hardly any reward in small universities  for publishing in OA. Publishing OA, by itself, figures very little, if at all in promotion & tenure  decisions.

Therefore, when publishing in those journals which have both options, OA and closed access, there is very little incentive to shell out the $2500 (usually more in the “two option” journals) once the paper is accepted. OA-only journals are often shunned altogether.

So what would be the solution? I agree with FakeElsevier that it has to come from the funding agencies. But maybe, instead of OA being a flat fee, and hence regressive, it can be turned, with the help of the granting agencies, into a progressive fee. After all, those same agencies know how much funding a lab has anyhow, as this must be provided with every grant request, award, and progress report. If a lab is able to demonstrate a “publication hardship”, perhaps an extra subsidy can be given once a paper is accepted, provided it is used towards an OA publication. Knowing that this extra money is there may help nudge Prof. SmallU in the direction of publishing in OA. Also, the subsidy can be contingent upon the university subscribing to the OA journal, thus sharing the burden and creating an incentive for departmental library committees to pressure administrators to allocate funds towards OA.

As it stands now, the motivation for low-budget labs, the supposed best beneficiaries of OA publishing, is not to publish OA. Unless stronger incentives are given, those labs will continue to get their reading material via those journals their library subscribes to, and through emailed electronic copies. Incidentally, a practice that the scientific publishing industry is starting to notice and is even attempting to stop.

 


Music Monday: Androgynetics

My man Joel Griggs (guitar, right) playing with Us, Today at MOTR Pub in Cincinnati. Enjoy the groove.


Science Funding: Aging Researchers and Funding Recipients

Here is a video produced by Sally Rockey and her team showing changes in age distribution of NIH Principal Investigators and medical school faculty. Rockey is NIH’s Deputy Director for Extramural Research, serving as the principal scientific leader and advisor to the NIH Director on the NIH extramural research program. The video compares the average age of NIH PIs to the age of faculty in medical schools over the years 1980-2009. Since about 55% of R01* recipients are in medical schools, this provides an interesting comparison of the faculty age, and the age of the funded faculty, and how this distribution has changed over the years. In 1980 the distribution both of funding and of faculty age was skewed toward the younger faculty, with a lower median age. Over time, the distribution becomes less skewed, with a higher median age which is also closer to the mean age.

Two other things that jump out:

In 1980, less than 1% of PIs were over age 65, and now PIs over age 65 constitute nearly 7% of the total. In parallel, in 1980, close to 18% of all PIs were age 36 and under. That number has fallen to about 3% in recent years.

Also:

Another factor that jumps out is the increasing gap between entry into faculty and receipt of the first R01. Although not all AAMC faculty members apply for NIH research grants, the gap is interesting and suggests that institutions and other non-NIH funding sources are increasingly responsible for research start-up costs.

 

You can read more in the full Rock Talk.  Also, Drugmonkey outlines a five-point plan to fix the aging funded faculty problem.

——————-
* An R01 is the main vehicle for funding single labs by the NIH. It is typically a 5 year renewable grant.


Life is short

ResearchBlogging.org

Continuing with rather philosophical musings about life, Ed Trifonov has recently suggested a new approach to defining life: let’s just vote on the definition.

So how does that work? And why should it work in the first place?

Note that I am diving straight into the subject, and not prefacing this post with a review of the various definitions of life. I assume that this blog’s readers have been exposed to some aspects of the debate on how to define life. Wikipedia and the references therein are a decent starting point, in case you want to refresh your memory. But just so we have something, here is the definition from the American Heritage dictionary:

LIFE:  the condition that distinguishes organisms from inorganic objects and dead organisms, being manifested by growth through metabolism, reproduction, and the power of adaptation to environment through changes originating internally.

Trifonov’s rationale for doing what he does is as follows:

The definitions [of life, IF] are more than often in conflict with one another. Undeniably, however, most of them do have a point, one or another or several, and common sense suggests that, probably, one could arrive to a consensus, if only the authors, some two centuries apart from one another, could be brought together. One thing, however, can be done – sort of voting in absentia – asking which terms in the definitions are the most frequent and, thus, perhaps, reflecting the most important points shared by many. Such analysis is offered below, revealing those most frequent terms that may be used for tentative formulation of the consensus.

Where to start? Trifonov decided to take two book chapters which together list 123 non-redundant definitions of life. He then counted the words in those chapters, omitting connecting words, grouped them by meaning, then ordered them by the frequency of the definientia (the words serving to define another word or expression).

By word count, seems like life has mainly to do with living. Well, no surprises there, but somewhat tautological and less-than-informative. However, rejoice O systems biologists, for SYSTEM is the second most frequent keyword grouping. Then we have organic stuff, CHEMICAL, COMPLEXITY, REPRODUCTION with ENERGY and ABILITY trailing. Trifonov continues:

Thus, the consensus of the life definition patched from these nine definientia would be: Life is [System, Matter, Chemical (Metabolism), Complexity (Information), (Self-)Reproduction, Evolution (Change), Environment, Energy, Ability,…] where the square brackets correspond to some compact expression containing the words listed within. For example, one possibility is:

Life is metabolizing material informational system with
ability of self-reproduction with changes (evolution),
which requires energy and suitable environment.    

(I added the underlines.) Hm. Actually, not bad for a definition culled from a simple exercise in word counting. Of course, to put these definientia together one would need some knowledge of life, thus the exercise is not completely automatic and unbiased, nor does it profess to be so.
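The word-counting part of the exercise is easy to sketch in Python. The snippet below is an illustration under loud assumptions, not Trifonov's procedure or data: the three definitions are invented stand-ins for the 123 real ones, and `STOP_WORDS` and the hand-made `GROUPS` table (the grouping by meaning, which is the step that needs human judgment) are made up for the example.

```python
import re
from collections import Counter

# Invented stand-ins for the real definitions of life.
definitions = [
    "Life is a chemical system able to undergo evolution.",
    "Living systems reproduce and evolve in a suitable environment.",
    "Life is a complex system that uses energy to reproduce.",
]

# Connecting words to omit (assumed list).
STOP_WORDS = {"a", "and", "in", "is", "of", "that", "to"}

# Hand-made grouping of words by meaning; ungrouped words stand alone.
GROUPS = {
    "life": "LIFE", "living": "LIFE",
    "system": "SYSTEM", "systems": "SYSTEM",
    "reproduce": "REPRODUCTION",
    "evolution": "EVOLUTION", "evolve": "EVOLUTION",
}

def definientia_counts(texts):
    """Count defining terms across texts, after dropping connecting
    words and merging words that share a meaning group."""
    counts = Counter()
    for text in texts:
        for word in re.findall(r"[a-z]+", text.lower()):
            if word not in STOP_WORDS:
                counts[GROUPS.get(word, word.upper())] += 1
    return counts

counts = definientia_counts(definitions)
for term, n in counts.most_common(4):
    print(term, n)
```

Even on this toy input the tautology shows up: the LIFE group ties for first place. The counting is mechanical; choosing the groups and stitching the top terms into a sentence is where the knowledge of life comes in.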

But Trifonov wants to condense this definition even more. To quote Hemingway: “boil it down; know what to leave out; tell a story in six words”. Is there still some redundancy in the definientia themselves that would let us boil it down and tell the story in only six words? Trifonov argues that metabolism implies the existence of energy and materials, whereas the existence of materials already implies a suitable environment. But self-reproduction subsumes all the above, as it requires metabolism, energy, materials and environment. However, variations and self-reproduction are actually mutually exclusive. Both must be noted. The boiled down, Hemingwayan definition would therefore be:

 Life is self-reproduction with variations.

And, to top it off, six words! Hemingway achievement unlocked.

Of course, this succinct definition raises all sorts of problems. Trifonov admits to that:

One unforeseen property of the minimalistic definition is its generality. It can be considered as applicable not just to “earthly” life but to any forms of life imagination may offer, like extraterrestrial life, alternative chemistry forms, computer models, and abstract forms. It suggests a unique common basis for the variety of lives: all is life that copies itself and changes.

Here is where I think things go a bit too far: is self-reproducing (and mutating) software alive? Is the Weasel program alive? Are viruses alive? All of those examples fulfill, at least technically, the above definition. In a previous post, I talked about going from life to non-life on a scale roughly correlating to size and thus the amount of information and sustaining materials life can carry with it. The difference between life and non-life seems to be not only in self-reproduction with variations, but the ability to do so at some level of autonomy. When adding the caveat of autonomy, viruses are not alive, since they require the transcriptional and translational machinery of their host cells. Neither are organelles such as mitochondria, since most of their proteins are encoded by the nucleus. But requiring autonomy raises another problem, which I find hard to solve: how far does this requirement of autonomy go? After all, all heterotrophs are, to some degree, non-autonomous, as they require basic materials produced only by autotrophs. So the definition becomes fuzzy again: humans are alive, although they cannot self-sustain without plants. Plants are alive, but they cannot fix nitrogen and require bacteria to do so. So are we to say that autotrophic nitrogen-fixing bacteria are the only living species on earth? So maybe the autonomy criterion should be limited to self-reproduction rather than metabolism? But those are hard to separate: without sugar, there is no DNA. Without essential amino acids, which most heterotrophs acquire by consuming other organisms, there are no proteins to effect reproduction.
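For readers who have not met the Weasel program mentioned above: it is Richard Dawkins's toy demonstration of cumulative selection, in which a random string copies itself with occasional errors and the copy closest to a target phrase seeds the next generation. Below is a minimal Python sketch of that idea. The brood size, mutation rate, random seed, and the choice to keep the parent in the selection pool are my assumptions for illustration, not Dawkins's original settings.

```python
import random

TARGET = "METHINKS IT IS LIKE A WEASEL"
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "

def score(candidate):
    """Number of positions where candidate matches the target."""
    return sum(a == b for a, b in zip(candidate, TARGET))

def mutate(parent, rate, rng):
    """Copy parent, replacing each character with probability rate."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else ch
        for ch in parent
    )

def weasel(offspring=100, rate=0.05, max_generations=10_000, seed=0):
    """Self-reproduction with variations, plus selection toward TARGET."""
    rng = random.Random(seed)
    parent = "".join(rng.choice(ALPHABET) for _ in TARGET)
    for generation in range(max_generations):
        if parent == TARGET:
            return parent, generation
        brood = [mutate(parent, rate, rng) for _ in range(offspring)]
        # Keeping the parent in the pool means the score never decreases.
        parent = max(brood + [parent], key=score)
    return parent, max_generations

final, generations = weasel()
print(final, generations)
```

The string reproduces itself and varies, so it technically satisfies the six-word definition. It does so only because the surrounding program copies, scores, and selects it, which is exactly the kind of dependence the autonomy caveat is meant to capture.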

Despite the difficulties, my definition would be (seven words, unfortunately):

 Life is autonomous self-reproduction with variations.

 Fuzzy? You betcha. That’s life.

PS: as you can see from the article’s Pubmed page, it generated a flurry of comments. Those make for a great read too. Enjoy.

 


 

Trifonov EN (2011). Vocabulary of definitions of life suggests a definition. Journal of biomolecular structure & dynamics, 29 (2), 259-66 PMID: 21875147
