<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Byte Size Biology &#187; Bioinformatics</title>
	<atom:link href="http://bytesizebio.net/index.php/category/science/biology/bioinformatics/feed/" rel="self" type="application/rss+xml" />
	<link>http://bytesizebio.net</link>
	<description>The musings and ravings of a computational biologist about science, computers, music and, you know, stuff</description>
	<lastBuildDate>Mon, 06 Feb 2012 13:32:05 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Wikipedia pages on protein function prediction</title>
		<link>http://bytesizebio.net/index.php/2012/02/01/wikipedia-pages-on-protein-function-prediction/</link>
		<comments>http://bytesizebio.net/index.php/2012/02/01/wikipedia-pages-on-protein-function-prediction/#comments</comments>
		<pubDate>Wed, 01 Feb 2012 15:55:20 +0000</pubDate>
		<dc:creator>Iddo</dc:creator>
				<category><![CDATA[Bioinformatics]]></category>
		<category><![CDATA[Free Culture]]></category>
		<category><![CDATA[Writing]]></category>
		<category><![CDATA[function-prediction]]></category>
		<category><![CDATA[protein-function]]></category>
		<category><![CDATA[wikipedia]]></category>

		<guid isPermaLink="false">http://bytesizebio.net/?p=5861</guid>
		<description><![CDATA[I just received an email from Julian Gough , one of last year&#8217;s CAFA participants. He started a Wikipedia initiative on protein function prediction, which are barely stubs at the moment. EDIT: He alerted me to the fact that protein function prediction has virtually no presence on Wikipedia. So all you protein function predictors out there, please contribute. Yes, [...]]]></description>
			<content:encoded><![CDATA[<p>I just received an email from <a href="http://www.cs.bris.ac.uk/~gough/" target="_blank">Julian Gough</a>, one of last year&#8217;s <a href="http://bytesizebio.net/index.php/2011/07/02/cafa-update/" target="_blank">CAFA</a> participants.<span style="color: #000000;"> <del>He started a Wikipedia initiative on protein function prediction, which are barely stubs at the moment</del>.</span> <span><span><strong style="color: #000000; text-decoration: underline;">EDIT</strong><span style="text-decoration: underline;">: He alerted me to the fact that protein function prediction has virtually no presence on Wikipedia</span></span><span style="color: #800000;">.</span></span> So all you protein function predictors out there, please contribute. Yes, you too!</p>
<p>I guess that as a CAFA organizer, I should really contribute to the second page. And I will. But I really don&#8217;t mind if someone else jump-starts it. <img src='http://bytesizebio.net/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
<p><a href="http://en.wikipedia.org/wiki/Protein_function_prediction" target="_blank">http://en.wikipedia.org/wiki/<wbr>Protein_function_prediction</wbr></a></p>
<p><a href="http://en.wikipedia.org/wiki/Critical_Assessment_of_Function_Annotation" target="_blank">http://en.wikipedia.org/wiki/<wbr>Critical_Assessment_of_<wbr>Function_Annotation</wbr></wbr></a></p>
<p>&nbsp;</p>
<p><a href="http://bytesizebio.net/wp-content/uploads/2012/02/Wikipedia-logo.png"><img class="alignnone size-full wp-image-5862" title="Wikipedia-logo" src="http://bytesizebio.net/wp-content/uploads/2012/02/Wikipedia-logo.png" alt="" width="200" height="200" /></a></p>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://bytesizebio.net/index.php/2012/02/01/wikipedia-pages-on-protein-function-prediction/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Circumcision, preventing fraud, and icky toilets. You know you&#8217;re going to read this.</title>
		<link>http://bytesizebio.net/index.php/2011/12/04/circumcision-preventing-fraud-and-icky-toilets-you-know-youre-going-to-read-this/</link>
		<comments>http://bytesizebio.net/index.php/2011/12/04/circumcision-preventing-fraud-and-icky-toilets-you-know-youre-going-to-read-this/#comments</comments>
		<pubDate>Sun, 04 Dec 2011 18:23:02 +0000</pubDate>
		<dc:creator>Iddo</dc:creator>
				<category><![CDATA[Bioinformatics]]></category>
		<category><![CDATA[Free Culture]]></category>
		<category><![CDATA[Metagenomics]]></category>
		<category><![CDATA[Microbiology]]></category>
		<category><![CDATA[Psychology]]></category>
		<category><![CDATA[Science publication]]></category>

		<guid isPermaLink="false">http://bytesizebio.net/?p=5710</guid>
		<description><![CDATA[In no particular order or ranking, recent and not-so-recent articles from PLoS-1. The common thread (if any): I thought they were pretty cool in one way or another. &#160; 1. Men don&#8217;t tell the truth about their penis. No kidding? But this is somewhat more serious. It has been accepted for some time that male [...]]]></description>
			<content:encoded><![CDATA[<p>In no particular order or ranking, recent and not-so-recent articles from PLoS-1. The common thread (if any): I thought they were pretty cool in one way or another.</p>
<hr/>
<p>&nbsp;</p>
<p>1.<strong> Men don&#8217;t tell the truth about their penis.</strong> No kidding? But this is somewhat more serious. It has been accepted for some time that male circumcision dramatically reduces the rate of HIV infection. But recently, some reports have shown that high rates of infection prevail among circumcised men as well. But since circumcision is usually self-reported, could there be a problem there? This study shows that in a cross-sectional (sorry&#8230;) study among recruits to the Lesotho Defense Force, 50% of the men that reported they were circumcised were, in fact, partially (27%) or completely (23%) not circumcised. The researchers conclude that biases in the self-reporting of male circumcision may lead to erroneous reports that show high HIV infection rates among circumcised men.</p>
<p><span style="text-decoration: underline;">Concluding quote:</span></p>
<blockquote><p>&#8230;until further research can document improved methods for obtaining accurate self-reported MC [male circumcision <em>I.F.</em>] data, all assessments of MC and HIV prevalence, as well as projections for VMMC [voluntary male medical circumcision <em>I.F.</em>] interventions, should be informed by physical-exam-based data [as opposed to self-reporting, <em>I.F.</em>].</p></blockquote>
<p><a href="http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0027561">http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0027561</a></p>
<p><span style="float: center; padding: 5px;"><a href="http://www.researchblogging.org"><img style="border: 0;" src="http://www.researchblogging.org/public/citation_icons/rb2_large_gray.png" alt="ResearchBlogging.org" /></a></span></p>
<hr/>
<p>2. <strong>Share your data or GTFO.</strong></p>
<p>Can sharing data help prevent errors and fraud?</p>
<p>From the abstract:</p>
<blockquote><p><strong>Background</strong>: The widespread reluctance to share published research data is often hypothesized to be due to the authors&#8217; fear that reanalysis may expose errors in their work or may produce conclusions that contradict their own. However, these hypotheses have not previously been studied systematically</p></blockquote>
<p>So <a href="http://wicherts.socsci.uva.nl/" target="_blank">Jelte Wicherts</a> and his colleagues from the University of Amsterdam wanted to see whether sharing data was related to the number of statistical analysis errors in a paper. So, to phrase this as a null and alternative hypothesis:</p>
<p><strong>H0: There is no difference in the number of statistical errors between papers whose authors are willing to share data and papers whose authors are unwilling to do so.</strong></p>
<p><strong>H1 (one-sided): Papers whose authors are unwilling to share data contain more statistical errors, and weaker evidence, than papers whose authors are willing to share data.</strong></p>
<p>Wicherts and colleagues contacted authors of 141 papers published in five journals of the American Psychological Association, requesting their data. Trouble is, they could not get enough authors to share data to make their own study significant: in a <a href="http://psycnet.apa.org/journals/amp/61/7/726/" target="_blank">previous study</a>, some 73% of the authors contacted were unwilling to share data. Wow.</p>
<p>However, authors publishing in two of these journals, <em>Journal of Personality and Social Psychology (JPSP)</em> and <em>Journal of Experimental Psychology: Learning, Memory, and Cognition (JEP:LMC),</em> were somewhat more forthcoming. Wicherts and colleagues therefore limited their analysis to a subset of 49 papers published in those journals. (Note that sometimes lack of data sharing is due to legitimate considerations, such as being part of an ongoing study, or third-party proprietary rights. However, those were not considerations in the 49 papers analyzed here.)</p>
<p>Wicherts then checked for specific types of statistical errors in these papers, and compared the number of errors in papers whose authors shared their data with the number in papers whose authors did not. Here are some of the findings:</p>
<div id="attachment_5719" class="wp-caption alignnone" style="width: 624px"><a href="http://bytesizebio.net/wp-content/uploads/2011/12/data-errors.png"><img class="size-large wp-image-5719 " title="data-errors" src="http://bytesizebio.net/wp-content/uploads/2011/12/data-errors-1024x962.png" alt="" width="614" height="577" /></a><p class="wp-caption-text">Distribution of the number of errors in the reporting of p-values for 28 papers from which the data were not shared (left column) and 21 from which the data were shared (right column) for all misreporting errors (upper row), larger misreporting errors at the 2nd decimal (middle row), and misreporting errors that concerned statistical significance (p&lt;.05; bottom row). doi:10.1371/journal.pone.0026828.g001</p></div>
<p>&nbsp;</p>
<p>Pretty clear picture: those papers where the authors were willing to share data were less prone to statistical errors.</p>
<p>Concluding quote:</p>
<blockquote><p>In this sample of psychology papers, the authors&#8217; reluctance to share data was associated with more errors in reporting of statistical results and with relatively weaker evidence (against the null hypothesis). The documented errors are arguably the tip of the iceberg of potential errors and biases in statistical analyses and the reporting of statistical results. It is rather disconcerting that roughly 50% of published papers in psychology contain reporting errors <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3174372" target="_blank">[33]</a> and that the unwillingness to share data was most pronounced when the errors concerned statistical significance.</p></blockquote>
<p>Note, though, that Wicherts is very careful about drawing conclusions:</p>
<blockquote><p>Although our results are consistent with the notion that the reluctance to share data is generated by the author&#8217;s fear that reanalysis will expose errors and lead to opposing views on the results, our results are correlational in nature and so they are open to alternative interpretations. Although the two groups of papers are similar in terms of research fields and designs, it is possible that they differ in other regards. Notably, statistically rigorous researchers may archive their data better and may be more attentive towards statistical power than less statistically rigorous researchers. If so, more statistically rigorous researchers will more promptly share their data, conduct more powerful tests, and so report lower p-values. However, a check of the cell sizes in both categories of papers (see Text S2) did not suggest that statistical power was systematically higher in studies from which data were shared.</p></blockquote>
<p>&nbsp;</p>
<p>In fact, Wicherts also wrote a <a href="http://www.nature.com/news/psychology-must-learn-a-lesson-from-fraud-case-1.9513" target="_blank">piece in <em>Nature</em></a> where he argued that sharing data can help avoid fraud, such as in the recent <a href="http://www.nature.com/news/2011/111101/full/479015a.html" target="_blank">infamous case of Diederik Stapel</a>, a highly regarded psychologist at Tilburg University in the Netherlands.</p>
<p><a href="http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0026828">http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0026828</a></p>
<hr/>
<p>3. <strong>Toilet paper. </strong>A study of surfaces of public restrooms has shown that they are covered with bacteria, mainly the kind that is known to live on and in humans. So now we have a somewhat broader view of the species living in restrooms, including the uncultured ones.</p>
<p>Two interesting quotes from the paper:</p>
<blockquote><p>Although many of the source-tracking results evident from the restroom surfaces sampled here are somewhat obvious, this may not always be the case in other environments or locations.</p></blockquote>
<p>Not sure about this bit: if the sources here are obvious, then is this paper a proof-of-concept?</p>
<p>Also:</p>
<blockquote><p>Unfortunately, previous studies have documented that college students (who are likely the most frequent users of the studied restrooms) are not always the most diligent of hand-washers.</p></blockquote>
<p>No shit! (Pun intended).</p>
<p>Concluding quote:</p>
<blockquote><p>Although the methods used here did not provide the degree of phylogenetic resolution to directly identify likely pathogens, the prevalence of gut and skin-associated bacteria throughout the restrooms we surveyed is concerning since enteropathogens or pathogens commonly found on skin (e.g. <em>Staphylococcus aureus</em>) could readily be transmitted between individuals by the touching of restroom surfaces.</p></blockquote>
<p>Translation:</p>
<p><a href="http://bytesizebio.net/wp-content/uploads/2011/12/washhands.jpg"><img class="alignnone size-full wp-image-5718" title="washhands" src="http://bytesizebio.net/wp-content/uploads/2011/12/washhands.jpg" alt="" width="342" height="477" /></a></p>
<p><a href="http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0028132">http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0028132</a></p>
<hr />
<p><span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.jtitle=PLoS+ONE&amp;rft_id=info%3Adoi%2F10.1371%2Fjournal.pone.0027561&amp;rfr_id=info%3Asid%2Fresearchblogging.org&amp;rft.atitle=Voluntary+Medical+Male+Circumcision%3A+A+Cross-Sectional+Study+Comparing+Circumcision+Self-Report+and+Physical+Examination+Findings+in+Lesotho&amp;rft.issn=1932-6203&amp;rft.date=2011&amp;rft.volume=6&amp;rft.issue=11&amp;rft.spage=0&amp;rft.epage=&amp;rft.artnum=http%3A%2F%2Fdx.plos.org%2F10.1371%2Fjournal.pone.0027561&amp;rft.au=Thomas%2C+A.&amp;rft.au=Tran%2C+B.&amp;rft.au=Cranston%2C+M.&amp;rft.au=Brown%2C+M.&amp;rft.au=Kumar%2C+R.&amp;rft.au=Tlelai%2C+M.&amp;rfe_dat=bpr3.included=1;bpr3.tags=Medicine%2CPsychology%2CHealth%2CEpidemiology%2C+Public+Health%2C+Human+Factors">Thomas, A., Tran, B., Cranston, M., Brown, M., Kumar, R., &amp; Tlelai, M. (2011). Voluntary Medical Male Circumcision: A Cross-Sectional Study Comparing Circumcision Self-Report and Physical Examination Findings in Lesotho <span style="font-style: italic;">PLoS ONE, 6</span> (11) DOI: <a href="http://dx.doi.org/10.1371/journal.pone.0027561" rev="review">10.1371/journal.pone.0027561</a></span></p>
<p><span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.jtitle=PLoS+ONE&amp;rft_id=info%3Adoi%2F10.1371%2Fjournal.pone.0026828&amp;rfr_id=info%3Asid%2Fresearchblogging.org&amp;rft.atitle=Willingness+to+Share+Research+Data+Is+Related+to+the+Strength+of+the+Evidence+and+the+Quality+of+Reporting+of+Statistical+Results&amp;rft.issn=1932-6203&amp;rft.date=2011&amp;rft.volume=6&amp;rft.issue=11&amp;rft.spage=0&amp;rft.epage=&amp;rft.artnum=http%3A%2F%2Fdx.plos.org%2F10.1371%2Fjournal.pone.0026828&amp;rft.au=Wicherts%2C+J.&amp;rft.au=Bakker%2C+M.&amp;rft.au=Molenaar%2C+D.&amp;rfe_dat=bpr3.included=1;bpr3.tags=Mathematics%2CPsychology%2CHuman+Factors%2C+Quantitative+Psychology%2C+Probability+and+Statistics">Wicherts, J., Bakker, M., &amp; Molenaar, D. (2011). Willingness to Share Research Data Is Related to the Strength of the Evidence and the Quality of Reporting of Statistical Results <span style="font-style: italic;">PLoS ONE, 6</span> (11) DOI: <a href="http://dx.doi.org/10.1371/journal.pone.0026828" rev="review">10.1371/journal.pone.0026828</a></span></p>
<p><span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.jtitle=PLoS+ONE&amp;rft_id=info%3Adoi%2F10.1371%2Fjournal.pone.0028132&amp;rfr_id=info%3Asid%2Fresearchblogging.org&amp;rft.atitle=Microbial+Biogeography+of+Public+Restroom+Surfaces&amp;rft.issn=1932-6203&amp;rft.date=2011&amp;rft.volume=6&amp;rft.issue=11&amp;rft.spage=0&amp;rft.epage=&amp;rft.artnum=http%3A%2F%2Fdx.plos.org%2F10.1371%2Fjournal.pone.0028132&amp;rft.au=Flores%2C+G.&amp;rft.au=Bates%2C+S.&amp;rft.au=Knights%2C+D.&amp;rft.au=Lauber%2C+C.&amp;rft.au=Stombaugh%2C+J.&amp;rft.au=Knight%2C+R.&amp;rft.au=Fierer%2C+N.&amp;rfe_dat=bpr3.included=1;bpr3.tags=Biology%2CMedicine%2CHealth%2CMicrobiology+%2C+Epidemiology%2C+Bioinformatics%2C+Metagenomics">Flores, G., Bates, S., Knights, D., Lauber, C., Stombaugh, J., Knight, R., &amp; Fierer, N. (2011). Microbial Biogeography of Public Restroom Surfaces <span style="font-style: italic;">PLoS ONE, 6</span> (11) DOI: <a href="http://dx.doi.org/10.1371/journal.pone.0028132" rev="review">10.1371/journal.pone.0028132</a></span></p>
]]></content:encoded>
			<wfw:commentRss>http://bytesizebio.net/index.php/2011/12/04/circumcision-preventing-fraud-and-icky-toilets-you-know-youre-going-to-read-this/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Short bioinformatics hacks: reading mate-pairs from a fastq file</title>
		<link>http://bytesizebio.net/index.php/2011/11/10/short-bioinformatics-hacks-reading-mate-pairs-from-a-fastq-file/</link>
		<comments>http://bytesizebio.net/index.php/2011/11/10/short-bioinformatics-hacks-reading-mate-pairs-from-a-fastq-file/#comments</comments>
		<pubDate>Thu, 10 Nov 2011 15:55:15 +0000</pubDate>
		<dc:creator>Iddo</dc:creator>
				<category><![CDATA[Bioinformatics]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[Biopython]]></category>

		<guid isPermaLink="false">http://bytesizebio.net/?p=5618</guid>
		<description><![CDATA[If you have a merged file of paired-end reads, here is a quick way to read them using Biopython: from Bio import SeqIO from itertools import izip_longest # Loop over pairs of reads readiter = SeqIO.parse(open(inpath), "fastq") for rec1, rec2 in izip_longest(readiter, readiter): print rec1.id # do something with rec1 print rec2.id # do something [...]]]></description>
			<content:encoded><![CDATA[<p>If you have a merged file of paired-end reads, here is a quick way to read them using <a href="http://biopython.org">Biopython</a>:</p>
<pre class="brush:python">from Bio import SeqIO
from itertools import izip_longest
# Loop over pairs of reads
readiter = SeqIO.parse(open(inpath), "fastq")
for rec1, rec2 in izip_longest(readiter, readiter):
    print rec1.id  # do something with rec1
    print rec2.id  # do something with rec2
    # ... further processing of the pair goes here
</pre>
<p>izip_longest is fed the same iterator, readiter, twice. To build each pair, it calls readiter.next(), which advances the iterator, once for its first argument and once for its second. Since both calls go to the same iterator, each pair holds two successive records.</p>
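<p>For readers on Python 3, here is a minimal sketch of the same idiom. There, izip_longest is named zip_longest and readiter.next() is spelled next(readiter). The helper name pairwise_records and the made-up read names are only for illustration, and plain strings stand in for the records that SeqIO.parse would yield:</p>
<pre class="brush:python">from itertools import zip_longest

def pairwise_records(records):
    """Yield (first, second) tuples of successive items from one iterable."""
    it = iter(records)
    # Both arguments are the SAME iterator object, so building one tuple
    # consumes two successive items.
    return zip_longest(it, it)

# Hypothetical read names; an odd count shows what happens to a lone read
reads = ["read1/1", "read1/2", "read2/1", "read2/2", "read3/1"]
pairs = list(pairwise_records(reads))
# pairs is now [("read1/1", "read1/2"), ("read2/1", "read2/2"), ("read3/1", None)]
</pre>
<p>With an odd number of items the last one has no mate, so the final tuple is padded with None. Real code should check for that before touching the second record.</p>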
<p>By &#8220;merged file&#8221; I mean a fastq file where the mate-pairs are one after the other, as in:</p>
<pre><strong>@HWUSI-EAS687_112864999:8:1:1980:1055#CGAGAA/1</strong>
GTTTGTTTTAATTTCAGTGATTCATCAATTTTAAAAAAAGATGAGAATAATAACTATTATAAAAAGATAAATAAATGTGAAATTTATATTTCAAATTCAA
+
@:DGBGDDD@GGGDGDGDDGD@GGGGE@GGG?EBGGGADDDDGEG4?3BA*::7:GEGGGG&gt;EDDDDAG@G&gt;&lt;ADDGBGGGGEGGGGDGGGFEGGGEFDE
<strong>@HWUSI-EAS687_112864999:8:1:1980:1055#CGAGAA/2</strong>
AATGAATTGAATAAATATAAGAAGGATGATTAATAATAATTCTTGAATTTGAAATATAAATTTCACATTTATTTATCTTTTTATAATAGTTATTATTCTC
+
D?DB:@8EBDB&gt;GG:=&lt;DED79&gt;&gt;A8CEC8DGDGG8CEC&lt;BGGG+BAAEA@D&lt;2D71;:8AG&lt;ABBEEEEBEDC?C&gt;AACDDDCD&gt;AD&lt;@EFFDDDECBB
<strong>@HWUSI-EAS687_112864999:8:1:2274:1058#CGAGAA/1</strong>
CCTCAGTTAGCTTCTATTGGTATTAACATGGGTGAATTTACTAAACAATTTAATGACCAAACTAAAGATAAAAATGGTGAAGTTATACCTTGTATAATTA
+
GFGGGHHGHHHHHHGHHHHHGHHHHHHHFBGDBGEHHHHFHHEHHHHDFHCGFFFHHHHHHHGHHGGEBHEEFFCEE@E&gt;A&gt;&gt;8A@EBE@BBB&gt;BGEEDB
<strong>@HWUSI-EAS687_112864999:8:1:2274:1058#CGAGAA/2</strong>
AACTGGAGTTGTTTTAATTTCAAAAGTAAAAGATTTATCTTTAAATGCTGTAATTATACAAGGTATAACTTCACCATTTTTATCTTTAGTTTGGTCATTA
+
IIIIIIIIIIGIIIDHHIIIIDIHD8CGGGGDADEIIIIIIIHIIGBGD&gt;DGDGGDGIGIIIIBGDG@GFHIIII&lt;C&lt;CCGHHHIHIBGDEEB3BEDEE@
</pre>
<p>The solution is derived from <a href="http://stackoverflow.com/questions/1657299/how-do-i-read-two-lines-from-a-file-at-a-time-using-python">this Stack Overflow entry</a>.</p>
<p>Of course, if the mate-pair files are not merged then you can use this script to merge them. It also illustrates using iterators from two different files in one <font type="Monospace12"><strong>for</strong></font> loop:</p>
<pre class="brush:python">
#!/usr/bin/env python
from Bio import SeqIO
import itertools
import sys
import os
def merge_fastq(fastq_path1, fastq_path2, outpath):
    # Interleave the records of two FASTQ files: mate 1, mate 2, mate 1, ...
    outfile = open(outpath,"w")
    fastq_iter1 = SeqIO.parse(open(fastq_path1),"fastq")
    fastq_iter2 = SeqIO.parse(open(fastq_path2),"fastq")
    # izip pairs the records lazily and stops at the end of the shorter file
    for rec1, rec2 in itertools.izip(fastq_iter1, fastq_iter2):
        SeqIO.write([rec1,rec2], outfile, "fastq")
    outfile.close()

if __name__ == '__main__':
    # The output is named after the first input file
    outpath = "%s.merged.fastq" % os.path.splitext(sys.argv[1])[0]
    merge_fastq(sys.argv[1],sys.argv[2],outpath)
</pre>
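<p>One caveat: izip stops quietly at the end of the shorter input, so two files that are out of sync would be merged without any warning. Below is a minimal sketch of a sanity check, written for Python 3. It assumes mates are named with a trailing /1 and /2 as in the example above (other naming schemes would need a different rule), and it works on plain ID strings rather than Biopython records. The function name interleave_mates is made up for this illustration:</p>
<pre class="brush:python">from itertools import zip_longest

def interleave_mates(ids1, ids2):
    """Interleave two sequences of read IDs, checking that mates line up."""
    merged = []
    for id1, id2 in zip_longest(ids1, ids2):
        # zip_longest pads the shorter input with None
        if id1 is None or id2 is None:
            raise ValueError("inputs have different numbers of reads")
        # Assumption: mates share everything before the trailing /1 or /2
        if id1.rsplit("/", 1)[0] != id2.rsplit("/", 1)[0]:
            raise ValueError("mate mismatch: %s vs %s" % (id1, id2))
        merged.extend([id1, id2])
    return merged
</pre>
<p>In the merge script, the same two checks could be applied to rec1.id and rec2.id inside the loop before writing each pair.</p>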
]]></content:encoded>
			<wfw:commentRss>http://bytesizebio.net/index.php/2011/11/10/short-bioinformatics-hacks-reading-mate-pairs-from-a-fastq-file/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Friedberg Lab is Recruiting Graduate Students</title>
		<link>http://bytesizebio.net/index.php/2011/10/18/the-friedberg-lab-is-recruiting-graduate-students/</link>
		<comments>http://bytesizebio.net/index.php/2011/10/18/the-friedberg-lab-is-recruiting-graduate-students/#comments</comments>
		<pubDate>Tue, 18 Oct 2011 15:03:43 +0000</pubDate>
		<dc:creator>Iddo</dc:creator>
				<category><![CDATA[Bioinformatics]]></category>
		<category><![CDATA[Evolution]]></category>
		<category><![CDATA[Metagenomics]]></category>
		<category><![CDATA[Microbiology]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[Biopython]]></category>
		<category><![CDATA[graduate school]]></category>
		<category><![CDATA[jobs]]></category>
		<category><![CDATA[lab recruitment]]></category>
		<category><![CDATA[web tool]]></category>

		<guid isPermaLink="false">http://bytesizebio.net/?p=5549</guid>
		<description><![CDATA[&#160; The Friedberg Lab is recruiting graduate students, for both Master&#8217;s and Ph.D. WE ARE:  A dynamic young lab  interested in gene, gene cluster and genome evolution, understanding microbial communities and microbe-host interactions by metagenomic analyses, developing algorithms for understanding gene cluster evolution, and prediction of protein function from protein sequence and structure. YOU ARE: [...]]]></description>
			<content:encoded><![CDATA[<p>&nbsp;</p>
<p>The Friedberg Lab is recruiting graduate students, for both Master&#8217;s and Ph.D.</p>
<p><strong>WE ARE</strong>:  A dynamic young lab  interested in gene, gene cluster and genome evolution, understanding microbial communities and microbe-host interactions by metagenomic analyses, developing algorithms for understanding gene cluster evolution, and prediction of protein function from protein sequence and structure.</p>
<p><strong>YOU ARE</strong>: an independent, hard-working, problem-solving, energetic and motivated scientist-to-be. You have graduated or are about to graduate in computer science and/or biology or related fields. The Friedberg Lab is a &#8220;dry&#8221; lab, so some programming skills are required (Python preferred).</p>
<p>Existing and planned projects include:</p>
<p>1. Computational protein function prediction and assessment of function prediction algorithms. The Friedberg Lab is among the leaders of the <a href="http://bytesizebio.net">Critical Assessment of Function Annotations</a> (CAFA), an international effort of dozens of research groups to assess and improve function prediction algorithms. We are looking for students who are excited about prediction of protein function from sequence and structure. Also, how well can we assess how well our algorithms are doing? The next CAFA meeting will take place in Berlin, July 2013 and the Friedberg Lab will play a central role in answering these questions.</p>
<p>2. <a href="http://en.wikipedia.org/wiki/Metagenomics" target="_blank">Metagenomics</a>: we are studying the interaction between the microbiome and the host using metagenomic and metatranscriptomic data. We are looking at how the human microbiome affects gene expression in the host. Together with Robb Chapkin&#8217;s lab at Texas A&amp;M we are analyzing microbial genomes and their effect on transcription in the human gut. We are also developing algorithms for context-based function prediction in metagenomic data. Simply put: how well can we predict the function of a gene from its neighbors? Since many of the genes in metagenomic data have no known homologs, we are developing creative ways to computationally discover their function.</p>
<p>3. <span style="text-decoration: underline;">Microbial Evolution</span>: we are researching the evolution of Mycoplasma, a bacterial genus which serves us as a model clade for understanding genome evolution. Mycoplasma have the smallest genomes of any organism and, being parasitic, evolve quickly. Together with the Balish Lab we expect to sequence several new species and strains in the next year, and we are developing computational methods and a central community database for analyzing the Mycoplasma tree of life. Besides the biological aspect, <strong>this project is also a great opportunity to get into web programming, database design, and learn how to design and code community-based scientific software. </strong></p>
<p>4. <a href="http://biopython.org/" target="_blank">Biopython</a>: Biopython is a set of freely available tools for biological computation written in <a title="http://www.python.org" href="http://www.python.org/" rel="nofollow">Python</a> by an international team of developers. It is a distributed collaborative effort to develop Python libraries and applications which address the needs of current and future work in bioinformatics. If you would like to become a Biopython developer, part of an international community of open-source scientific software developers, the Friedberg Lab is the place for you. This option is especially attractive for Master&#8217;s students seeking to enter bioinformatics in industry.</p>
<p>5. Insert your brilliant idea here! I love new projects!</p>
<p>The lab is equipped with its own 10-node cluster computer, several workstations, and has access to <a href="http://www.units.muohio.edu/uit/research/high-performance-computing/redhawk-cluster">Miami University&#8217;s Supercomputing Center</a>, and the <a href="http://www.osc.edu/" target="_blank">Ohio Supercomputer Center</a> at Ohio State University.  Students have an excellent research environment, and many opportunities to collaborate with labs on and off campus.</p>
<p>Students can apply to the Friedberg Lab via the following graduate programs at Miami University:</p>
<p>1. <a href="http://microbiology.muohio.edu/grad/" target="_blank">Microbiology</a> (Master&#8217;s and PhD).</p>
<p>2. <a href="http://www.cas.muohio.edu/cmsb" target="_blank">Cell, Molecular and Structural Biology</a> (PhD only).</p>
<p>3. <a href="http://www.eas.muohio.edu/departments/cse/cse/" target="_blank">Computer Science</a> (Master&#8217;s only).</p>
<p>You are welcome and encouraged  to inquire further. I love talking with prospective students. If you would like to set up a phone/Skype chat please send your CV to:</p>
<p>friedberg.lab.jobs &#8216;at&#8217; gmail &#8216;dot&#8217; com</p>
<p>Looking forward to hearing from you.</p>
<p>&nbsp;</p>
<p><a href="http://iddo-friedberg.net" target="_blank">Iddo Friedberg</a>, PhD</p>
<p>Assistant Professor, Microbiology and Computer Science (affiliate)</p>
<p>Miami University</p>
<p>Oxford, OH, USA</p>
]]></content:encoded>
			<wfw:commentRss>http://bytesizebio.net/index.php/2011/10/18/the-friedberg-lab-is-recruiting-graduate-students/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Friday fun story: extreme bug hunting on MIRA</title>
		<link>http://bytesizebio.net/index.php/2011/09/02/5389/</link>
		<comments>http://bytesizebio.net/index.php/2011/09/02/5389/#comments</comments>
		<pubDate>Fri, 02 Sep 2011 20:16:31 +0000</pubDate>
		<dc:creator>Iddo</dc:creator>
				<category><![CDATA[Bioinformatics]]></category>
		<category><![CDATA[Funny]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[assembly]]></category>
		<category><![CDATA[geek]]></category>
		<category><![CDATA[MIRA]]></category>
		<category><![CDATA[short read sequencing]]></category>

		<guid isPermaLink="false">http://bytesizebio.net/?p=5389</guid>
		<description><![CDATA[MIRA is a really cool sequence assembly software, developed and maintained by Bastien Chevreux. MIRA has a large and active community, led by the funny and gracious Bastien, for whom no problem is too small, or too large. Recently MIRA seemed to have developed a stochastic bug, one of those which are a serious headache [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://chevreux.org/projects_mira.html" target="_blank">MIRA</a> is a really cool sequence assembly software, developed and maintained by <a href="http://chevreux.org/" target="_blank">Bastien Chevreux</a>. MIRA has a large and active community, led by the funny and gracious Bastien, for whom no problem is too small, or too large.</p>
<p>Recently MIRA seemed to have developed a stochastic bug, one of those which are a serious headache to track down. Bastien called upon the MIRA community to help him. A couple of weeks ago, the &#8220;bug&#8221; was resolved to everyone&#8217;s relief. It was not a bug at all, but &#8230; well, I&#8217;ll let you read Bastien&#8217;s letter. Probably the funniest and geekiest error report I have seen since, well, ever. Reproduced here from the <a href="http://www.freelists.org/archive/mira_talk" target="_blank">mira_talk</a> email list with Bastien&#8217;s permission. <b>WARNING:</b> fairly geeky and fairly long. Not for everyone. But if you, like me, enjoy a good story of the travails of extreme bug hunting, I guarantee you will not be disappointed. (Because we have all been there, although personally I don&#8217;t recall encountering a problem <i>that</i> frustrating). Teaser: it was not a bug.</p>
<p><font face="Courier"><br />
Dear all,</p>
<p>my warmest thanks to the numerous people who all donated time and computing power to hunt down a &#8220;bug&#8221; (see http://www.freelists.org/post/mira_talk/Call-for-help-bughunting) which, in the end, turned out to be a RAM defect on my development machine.</p>
<p>This is the story on how the problem got nailed. It involves lots of hot electrons, a lot less electrons without spin which keel over, the end of a hunt for invisibugs of the imaginary sort, 454, mutants (but no zombies), Illumina, some spider monkeys, PacBio, a chat with Sherlock and, of course, an anthropomorphed star.</p>
<p>In short: don&#8217;t read if you&#8217;ve got more interesting things to do on a Friday morning or afternoon.</p>
<p>&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;-<br />
Life&#8217;s a rollercoaster and there are days &#8211; or weeks &#8211; where morale is on a pretty hefty ride: ups and downs in fast succession &#8230; and the occasional looping here and there.</p>
<p>Today was a day where I had &#8211; the first time ever &#8211; ups and downs occurring absolutely simultaneously. Something which is physically impossible, I know, but don&#8217;t tell any physicist or astronomist about that or else they&#8217;ll embark you in a lengthy discussion on how isochronicity is a myth by telling you stories on lightning, thunder and two poor sobs at the ends of a 300,000 km long train. But I digress &#8230;</p>
<p>So, my lowest low and highest high today were at 09:17 this morning when I prepared leaving for work (hey, it&#8217;s vacation time, almost everyone else is out and I can go a bit later than usual, right?). A few minutes earlier I had just told MIRA to run on the very same PacBio test set she had successfully worked on the night before to see how stable assemblies with this kind of data are (quite well so far, thank you for asking).</p>
<p>Reaching out to switch off the monitor and leave, MIRA suddenly came back with a warm and cosy little error message which she&#8217;s taken the habit lately to have a mischievous pleasure to present. This time, she claimed there had been an illegal base in the FASTQ file.</p>
<p>&#8220;Hey, MIRA, wait a minute!&#8221; I thought. &#8220;Yesterday and tonight you ran on the very same data file with the very same parameters for two times three hours and even gave me back some nice assembly results. And now you claim that the INPUT data has errors?! Come on, you&#8217;re not serious, are you?&#8221;</p>
<p>As a side note: she then just gave me back &#8220;that look&#8221;, you know, the one with those big open eyes behind long, dark lashes and slightly flushed cheeks accompanied by pointed lips &#8230; as if she wanted to say &#8220;I *am* innocent and *I* did nothing wrong you disbeliever!&#8221; (http://24.media.tumblr.com/tumblr_lj3efmmDL01qasfhmo1_400.png). This usually announces a major pouting round of hers, something which I&#8217;m not looking forward to, I can tell you.</p>
<p>Two restarts later with the same negative result (MIRA can be quite stubborn at times) I had to give in and decided to sit down again and investigate the problem.</p>
<p>&#8220;So &#8230; read number 317301 at base position 246, eh? Let&#8217;s have a look.&#8221;</p>
<p>*clickedyclick*</p>
<p>&#8220;Read 317299, 317300 &#8230; 317301 &#8230; there we are.&#8221;</p>
<p>*hackedyhack*</p>
<p>&#8220;Base position 239, 240 &#8230; now: C G G G T C F A A &#8230; wait! What? &#8216;F&#8217; &#8230; &#8216;F&#8217;?!? It&#8217;s not even an IUPAC code. What&#8217;s a frakking &#8216;F&#8217; doing in the FASTQ input file?!&#8221; (CSFW: http://www.youtube.com/watch?v=r7KcpgQKo2I )</p>
<p>Indeed, it is not. Even more mysterious to me was the fact that just the night before it apparently had not been there. Or had it? I now was pretty unsure where this path would lead me, as if I had unlocked a door with the key of imagination. Beyond it: another dimension &#8211; a dimension of sound, a dimension of sight, a dimension of mind. I was moving into a land of both shadow and substance, of things and ideas. I just crossed over into &#8230; the Twilight Zone (&#8220;G#-A-G#-E-G#-A-G#-E&#8221; at 128 bpm, for more info see http://www.youtube.com/watch?v=zi6wNGwd84g).</p>
<p>Where was I? Ah, yes, the &#8216;F&#8217;.</p>
<p>So, how did that &#8216;F&#8217; appear in the FASTQ, and where had it been the night before? Out to town, ashamed of not being a nucleotide and getting a hangover without telling anyone up-front? Or did it surreptitiously sneak in from the outside, murdering an innocent base and taking its place in hope no one would note? I didn&#8217;t have the slightest clue, but I was determined to find that out.</p>
<p>First thing to check: the log files of the successful runs the previous night. MIRA&#8217;s very chatty at times and tidying up after her has always been a chore, but now was one of those occasions where not gagging her paid out as poking around the files she left behind proved to be interesting. Read 317301 showed the following at the position in doubt: &#8220;C G G G T C ___G___ A A&#8221; Without question: a &#8216;G&#8217;, and no &#8216;F&#8217; in sight!</p>
<p>So MIRA had been right and the &#8216;G&#8217; in the sequence of the file mysteriously mutated into an &#8216;F&#8217; overnight. I must admit that I had grown suspicious of her in the past few weeks as she had seemed to become uncooperative at times. In particular she had been screaming at me a couple of times during rehearsal of combined 454 and Illumina assemblies for the premiere of her new 3.4.0 show. She claimed that some uninvited spider monkeys (http://dict.leo.org/ende?search=Klammeraffe) had frightened her so much she refused to continue to work and simply scribbled the &#8216;@&#8217; sign all over her error messages. I had not been able to find out how those critters entered MIRA&#8217;s data and had even enrolled a few volunteers to rehearse different assemblies with MIRA &#8230; to no avail as she&#8217;d performed without flaws there.</p>
<p>While reconsidering all these things, something suddenly made *click*.</p>
<p>The character &#8216;G&#8217; has the hexadecimal ASCII table code 0x47 (or in 8-bit binary: 01000111). &#8216;F&#8217;, as preceding character of &#8216;G&#8217; and the table having some logic behind it, has the hex code 0x46, which is 01000110 in 8-bit binary.</p>
<p>The ATINSEQ-bug (@-in-seq) I had been desperately hunting in the past few weeks (and which had held up the release of MIRA 3.4.0) was due to the &#8220;@&#8221; character sometimes mysteriously appearing in sequences during the assembly of MIRA. The &#8216;@&#8217; sign in the ASCII table has the hex code 0x40 (binary: 01000000). In the ASCII table, there is one important character for DNA assembly which is very near to the &#8216;@&#8217; character &#8230;, so near that it is the successor of it: the &#8216;A&#8217; character. Hexadecimal 0x41, binary 01000001.</p>
<p>I had always thought that a bug in MIRA somehow corrupted the sequence, but what if &#8230; what if MIRA was actually really innocent?! I had never taken this possibility into account as any other attempt at an explanation would have seemed too far-fetched.</p>
<p>But now I had a similar effect *outside* of MIRA, in the Linux filesystem!</p>
<pre>Filesystem     MIRA
G 01000111     A 01000001
F 01000110     @ 01000000</pre>
<p>The difference between the characters is in both cases exactly 1 bit which changes, and it&#8217;s even at the same position (last one in a byte) and changing in the same direction (from &#8216;1&#8217; to &#8216;0&#8217;).</p>
<p>I was now sure I was on to something: bit decay (http://en.wikipedia.org/wiki/Bit_rot)</p>
<p>But how could I prove it? Well, elementary my dear Watson: When you have eliminated the impossible, whatever remains, however improbable, must be the truth.</p>
<p>Suspects:<br />
- the problem is caused either by MIRA or one of the components of the<br />
computer: CPU, disk, disk/dma controller, RAM.</p>
<p>Facts:<br />
- an artefact was very sporadically observed during MIRA runs where sequences<br />
(containing lots of &#8216;A&#8217;) suddenly contained at least one &#8216;@&#8217;. This occurred<br />
after several passes, i.e., not on loading.<br />
- an artefact was observed in the Linux filesystem where a &#8216;G&#8217; mutated<br />
suddenly and overnight to a &#8216;F&#8217;.<br />
- both artefacts are based on one bit flipping, perhaps even to the same<br />
direction all the time.<br />
- when loading data, MIRA does not use mmap() to mirror data from disk, but<br />
physically creates a copy of that data.<br />
- MIRA loaded the data twice flawlessly before the artefact in the filesystem<br />
occurred.</p>
<p>Deduction 1:<br />
- MIRA is innocent. The artefact in the filesystem happened outside of the<br />
address space of MIRA and therefore outside her control. MIRA cannot be<br />
responsible as the Linux kernel would have prevented her from writing to<br />
some memory she was not allowed to.</p>
<p>Further facts:<br />
- the system MIRA ran on had 24 GiB RAM<br />
- even with a KDE desktop, KMail, Firefox, Emacs and a bunch of terminals<br />
open, there is still a lot of free RAM (some 22 to 23 GiB).<br />
- Linux uses free RAM to cache files</p>
<p>Deduction 2:<br />
- when loading the small FASTQ input file in the morning, Linux put it into<br />
the file cache in RAM. As MIRA almost immediately stopped without taking<br />
much memory, the file stayed in cache.</p>
<p>Further facts:<br />
- the drive with the FASTQ file is run in udma6 mode. That is, when loading<br />
data the controller moves the data directly from disk to RAM without going<br />
via the processor<br />
- subsequent &#8220;loading&#8221; of the same FASTQ into MIRA or text viewer like &#8216;less&#8217;<br />
showed the &#8216;F&#8217; character always appearing at the same place.</p>
<p>Deduction 3:<br />
- the CPU is innocent! It did not touch the data while it was transferred from<br />
disk to RAM and it afterwards shows always the same data.<br />
- the disk and UDMA controllers are innocent! Some of the glitches observed in<br />
previous weeks occurred during runs of MIRA, inside the MIRA address space,<br />
long after initial loading, when UDMA had already finished their job.</p>
<p>From deductions 1, 2 &#038; 3 follows:<br />
- it&#8217;s not MIRA, not the CPU, nor the disk &#038; UDMA controller</p>
<p>Suspects left:<br />
- RAM<br />
- Disk</p>
<p>Well, that can be easily tested: shut down the computer, restart it and subsequently look at the file again. No file cache in RAM can survive that procedure. Yes, I know, there are some magic incantations one can chant to force Linux to flush all buffers and clear all caches, but in that situation I was somehow feeling conservative.</p>
<p>Lo and behold, after the above procedure the FASTQ file showed an all regular, good old nucleic acid &#8216;G&#8217; in the file again. No &#8216;F&#8217; to be seen anywhere.</p>
<p>Deduction 4:<br />
- the disk is innocent.</p>
<p>Deduction 5:<br />
- as all other components have been ruled out, the RAM is faulty.</p>
<p>&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;-</p>
<p>As I wrote: life&#8217;s a rollercoaster.</p>
<p>Up: MIRA is innocent! There, she&#8217;s giving me &#8220;that look&#8221; again and one would<br />
have to be blind to overlook the &#8220;told you so&#8221; she&#8217;s sending over with<br />
it.<br />
Down: My RAM&#8217;s broken and I need to replace it. Bought it only last May,<br />
should still be under guarantee, but still &#8230; time and effort.<br />
Up: I did not sell my old RAMs, so I can continue to work<br />
Down: 12 GiB feels soooooo tight after having had 24.<br />
Up: I can wrap up 3.4.0 end of this week with good conscience!<br />
Down: How the hell am I gonna tie all loose bits and pieces in the<br />
documentation in the next 24 to 48 hours?<br />
Looping: today MIRA again helped me at work to locate a mutation important for<br />
one of our Biotech groups. Boy, do I love sequencing and MIRA.</p>
<p>Have a nice Friday and a good week-end,<br />
Bastien</p>
<p>PS: while celebrating with MIRA tonight, I expressed my fear that some people<br />
might find it strange that I anthropomorphise her. They could think I went<br />
totally nuts or that I needed an extended vacation (which I do btw). She<br />
reassured me that no one would dare thinking I were insane &#8230; and if so,<br />
she would come over to their place and give them &#8220;that look.&#8221;</p>
<p>How utterly reassuring.</p>
<p></font></p>
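<p>For the curious, the single-bit observation in Bastien&#8217;s letter is easy to check. The following is a minimal Python sketch, not anything taken from MIRA itself: the function names are made up for illustration, and the alphabet is assumed to be the four plain bases only (no IUPAC ambiguity codes). It lists the bit positions in which two ASCII characters differ, and for an illegal character in a read it suggests which legal bases are exactly one flipped bit away.</p>

```python
VALID_BASES = "ACGT"  # assumption: plain bases only, no IUPAC ambiguity codes


def differing_bits(a, b):
    """Return the bit positions (0 = least significant) where two ASCII characters differ."""
    diff = ord(a) ^ ord(b)
    return [i for i in range(8) if (diff >> i) % 2 == 1]


def single_bit_neighbours(char, alphabet=VALID_BASES):
    """Return the legal bases that differ from `char` by exactly one bit."""
    return [base for base in alphabet if len(differing_bits(char, base)) == 1]


def suspicious_positions(sequence, alphabet=VALID_BASES):
    """Yield (1-based position, character, candidate original bases) for each illegal character."""
    for pos, char in enumerate(sequence, start=1):
        if char not in alphabet:
            yield pos, char, single_bit_neighbours(char, alphabet)


# The two character pairs from the letter.
for good, bad in [("G", "F"), ("A", "@")]:
    print(good, format(ord(good), "08b"), "to", bad, format(ord(bad), "08b"),
          "differing bits:", differing_bits(good, bad))

# The stretch of read 317301 quoted in the letter.
print(list(suspicious_positions("CGGGTCFAA")))
```

<p>A check like this cannot prove that the RAM is faulty, of course. It only shows that a corrupted character is consistent with a single flipped bit, which is the hint that pointed away from the software in the first place.</p>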
]]></content:encoded>
			<wfw:commentRss>http://bytesizebio.net/index.php/2011/09/02/5389/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Postdoc positions available at Rutgers University</title>
		<link>http://bytesizebio.net/index.php/2011/08/31/postdoc-positions-available-at-rutgers-university/</link>
		<comments>http://bytesizebio.net/index.php/2011/08/31/postdoc-positions-available-at-rutgers-university/#comments</comments>
		<pubDate>Wed, 31 Aug 2011 15:01:14 +0000</pubDate>
		<dc:creator>Iddo</dc:creator>
				<category><![CDATA[Bioinformatics]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[jobs]]></category>

		<guid isPermaLink="false">http://bytesizebio.net/?p=5369</guid>
		<description><![CDATA[Postdoctoral Research Scientist Rutgers University Joint Appointment: Institute of Marine and Coastal Sciences, BioMaPS and Dept. of Biochemistry and Microbiology Two 2-3 year Postdoctoral Research Scientist positions are available. We are looking for young scholars with experience in the areas of computational biology. In the scope of this project, we will uncover how the metal-containing [...]]]></description>
			<content:encoded><![CDATA[<h2>Postdoctoral Research Scientist</h2>
<h2>Rutgers University</h2>
<h6>Joint Appointment: Institute of Marine and Coastal Sciences, BioMaPS and<br />
Dept. of Biochemistry and Microbiology</h6>
<p>Two 2-3 year Postdoctoral Research Scientist positions are available.<br />
We are looking for young scholars with experience in the areas of<br />
computational biology. In the scope of this project, we will uncover how<br />
the metal-containing enzymes responsible for the critical electron<br />
transfer reactions that turn basic elements such as H, O, C, S, and N<br />
into biologically active molecules have evolved. The position will<br />
involve developing new sequence and/or structure based bioinformatic<br />
approaches to (1) mine available databases for proteins responsible for<br />
bio-catalyzed electron transfer reactions, (2) establish evolutionary<br />
relationships between extracted sequences and structures and (3)<br />
generate hypotheses for how the electron transfer circuitry arose and<br />
now functions. Candidates should have a PhD in Computational Biology or<br />
Bioinformatics. Candidates with degrees in related fields (e.g. biology,<br />
computer science) and possessing the necessary skill-sets are welcome to<br />
apply. We strongly encourage applications from recent PhD graduates.<br />
Strong programming skills (at least one of: Perl, Python, or Java) are<br />
essential for these positions, as well as some familiarity with the<br />
major bioinformatics tools and databases. Experience in machine<br />
learning algorithms is desired, but not required. Candidates should be<br />
fluent in spoken and written English and should be able to communicate<br />
ideas and results to colleagues from all the diversity of life sciences.<br />
The ability to integrate into a team is as essential as that to complete<br />
a project without constant supervision.</p>
<p>Interested persons should e-mail a cover letter and C.V. to:</p>
<p>Dr. Yana Bromberg,<br />
Dept. of Biochemistry and Microbiology,<br />
Rutgers University<br />
e-mail: yanab &#8216;at&#8217; rci &#8216;dot&#8217; rutgers &#8216;dot&#8217; edu</p>
]]></content:encoded>
			<wfw:commentRss>http://bytesizebio.net/index.php/2011/08/31/postdoc-positions-available-at-rutgers-university/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Of Mice and Men or: Revisiting the Ortholog Conjecture</title>
		<link>http://bytesizebio.net/index.php/2011/08/26/of-mice-and-men-or-revisiting-the-ortholog-conjecture/</link>
		<comments>http://bytesizebio.net/index.php/2011/08/26/of-mice-and-men-or-revisiting-the-ortholog-conjecture/#comments</comments>
		<pubDate>Fri, 26 Aug 2011 17:18:36 +0000</pubDate>
		<dc:creator>Iddo</dc:creator>
				<category><![CDATA[Bioinformatics]]></category>
		<category><![CDATA[Evolution]]></category>
		<category><![CDATA[homology]]></category>
		<category><![CDATA[mouse]]></category>
		<category><![CDATA[ortholog study]]></category>
		<category><![CDATA[orthology]]></category>
		<category><![CDATA[paralogy]]></category>

		<guid isPermaLink="false">http://bytesizebio.net/?p=5050</guid>
		<description><![CDATA[I  have posted quite a few times before about the acquisition of new functions by genes. In many cases a gene is duplicated, and one of the duplicates acquires a new function. This is one basic evolutionary mechanism of acquiring new functions. Sometimes, gene duplication occurs within a species: part of the chromosome may be [...]]]></description>
			<content:encoded><![CDATA[<p>I  have posted quite a few times before about the <a href="http://bytesizebio.net/index.php/2009/02/03/enzyme-promiscuity/">acquisition</a> of <a href="http://bytesizebio.net/index.php/2009/06/03/glowing-like-a-horse/">new</a> <a href="http://bytesizebio.net/index.php/2009/10/01/it-aint-necessarily-so/" target="_blank">functions</a> by genes. In many cases a gene is duplicated, and one of the duplicates acquires a new function. This is one basic evolutionary mechanism of acquiring new functions.</p>
<p>Sometimes, gene duplication occurs within a species: part of the chromosome may be duplicated, causing one, a few, or many genes to have more copies of themselves within the species. The descendants of the duplicates and the original are <em>homologous</em>, as they are descended from a common ancestor. This type of homology is called <em>paralogy</em>: a homology due to a duplication event (para == in parallel).</p>
<p>In another case, the genes can be homologous due to speciation: a new species (A1) diverges from the original (A0), carrying highly similar genetic loads. The gene for, say, brown eyes in A1 and the gene for brown eyes in A0 are also homologous: both are derived from the same gene in the ancestral A0 population. This time, the homology is called <em>orthology</em>: it is not due to in-species duplication, but due to speciation itself (ortho == exact). The definitions of orthologs and paralogs were given by Walter Fitch in <a href="http://sysbio.oxfordjournals.org/content/19/2/99.short" target="_blank">a seminal paper published in 1970</a>.</p>
<p>One of the first protein structures to be solved was that of hemoglobin, the oxygen-carrying protein complex in our blood. Scientists noticed that hemoglobin in jawed mammals has three different protein chains: alpha, beta and gamma. Their amino acid sequences were very similar, implying that the genes encoding them are highly similar, which in turn suggested homology. Since all jawed mammals have hemoglobin, and they all had alpha, beta and gamma chains, the conclusion was that the duplication of the original genes happened in the common ancestor of jawed mammals, before they split up into different species. Hence, the alpha, beta and gamma chains in hemoglobin are <em>paralogous</em>: homologous due to duplication preceding speciation. However, gamma-hemoglobin was shown to have a different function than beta or alpha (more on that in a bit). The conclusion from this observation was the <em>Ortholog Conjecture</em>, which can be stated as follows: paralogs (reminder: homologs due to duplication) diverge in function more than orthologs (homologs due to speciation). A model was proposed for this observation: when genes duplicate within a species&#8217; genome, there is less selective pressure on one copy to perform the same function. Thus, it can accumulate mutations and eventually adopt a different function. In short, the ortholog conjecture states that paralogs mostly differ in function, whereas orthologs mostly do not. It is a very powerful statement because, if we have two proteins known to be orthologs, we can infer that they have the same function, whereas paralogs may not (if they had enough time to diverge). The ortholog conjecture is therefore a fundamental tenet in molecular phylogenetics, and is also a tool used to predict the function of proteins: if two homologous proteins are found to be orthologs, it is assumed that they have the same (or highly similar) function.</p>
<p>A crack in the ortholog conjecture appeared in late 2009, in a paper by Romain A. Studer and Marc Robinson-Rechavi. I <a href="http://bytesizebio.net/index.php/2009/10/01/it-aint-necessarily-so/" target="_blank">blogged</a> then about <a href="http://dx.doi.org/10.1016/j.tig.2009.03.004" target="_blank">their study</a>:</p>
<blockquote><p>Romain A. Studer and Marc Robinson-Rechavi challenge common wisdom by publishing a <a href="http://dx.doi.org/10.1016/j.tig.2009.03.004" target="_blank">study that says</a>: “it ain’t necessarily so”. They look at three alternative models of molecular function evolution: (i) subfunctionalization after duplication; (ii) neofunctionalization after duplication; and (iii) the ‘alternative model’ of equal change after duplication or speciation. <em>Subfunctionalization</em> holds that after duplication, each of the two copies of the gene performs only a subset of the functions of the ancestral single copy. <em>Neofunctionalization</em> holds that one of the two genes possesses a new, selectively beneficial function that was absent in the population before the duplication. The ‘alternative model’ states that the gain of new function is not preferential to paralogs and that orthologs may gain new functions at the same rate that paralogs do.</p>
<p>Studer and Robinson-Rechavi claim that few studies have been made to study the scope of any of these proposed models. They then lay out study designs for doing so, challenging other evolutionary biologists (and themselves?) to conduct these studies and examine whether the common wisdom, that orthologs maintain function while paralogs gain function, actually holds. What I like about this paper is that it not only makes a strong case for challenging conventional wisdom, it also lays out a series of possible routes of study to be taken up by others.</p></blockquote>
<p>Now two studies have widened this crack to a rather large crevasse. The first is a study by scientists at Indiana University. In a way, this <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1002073" target="_blank">new publication</a> is a response to Studer &amp; Robinson-Rechavi&#8217;s call to arms on points (i) and (ii). The IU scientists (the Radivojac lab and the Hahn lab at the School of Informatics at Indiana University, Bloomington, IN) examined hundreds of pairs of orthologous and paralogous genes from the mouse and human genomes. They then examined whether it was the paralogs or the orthologs that had the higher functional similarity. What they found certainly defied the ortholog conjecture:</p>
<p>&nbsp;</p>
<div id="attachment_5062" class="wp-caption alignnone" style="width: 727px"><a href="http://bytesizebio.net/wp-content/uploads/2011/06/journal.pcbi_.1002073.g001.png"><img class="size-large wp-image-5062 " title="journal.pcbi.1002073.g001" src="http://bytesizebio.net/wp-content/uploads/2011/06/journal.pcbi_.1002073.g001-1024x289.png" alt="" width="717" height="202" /></a><p class="wp-caption-text">The relationship between functional similarity and sequence identity for human-mouse orthologs (red) and all paralogs (blue). (A) Biological pathway (B) molecular function. From PLoS Comput Biol 7(6): e1002073 under CC licence.</p></div>
<p>&nbsp;</p>
<p>But before we explain the results, a word about function. The <a href="http://bytesizebio.net/index.php/2010/06/12/protein-function-promiscuity-moonlighting-and-philosophy/">function of a protein has several aspects which are context-dependent</a>; two important ones are the molecular function of the protein, and the biological process in which it participates. For example, the molecular function of all hemoglobins is noted as oxygen binding and oxygen transport. However, they are different in the processes, or pathways, in which they participate: gamma-hemoglobin participates in the transport of oxygen in the fetus. The complex which contains gamma-hemoglobin has a higher affinity to oxygen, and is thus able to extract oxygen in the placenta from the maternal <a href="http://en.wikipedia.org/wiki/Fetal_hemoglobin#Overview" target="_blank">oxygenated hemoglobin and transport it to the fetus</a>.</p>
<p>Now we can explain the figure above. Graph <strong>(A)</strong> above shows the functional similarity for the biological pathway aspect and how it is affected by the sequence identities of the hundreds of orthologs (red) and paralogs (blue) examined between human and mouse. Graph <strong>(B)</strong> shows the functional similarity of the molecular function aspect.</p>
<p>The X-axis is the sequence identity percentage between any pair of sequences: the higher the percent identity, the less divergent the sequences are, and the more inclined we should be to think that the pair of proteins performs the same function. The Y-axis shows the fraction of functional similarity. Looking at graph <strong>(B)</strong> above, we see that paralogs which are 100% identical have (almost always) the same function. But sequences of orthologous proteins between human and mouse have only about 65% functional similarity, on average. What does that mean? In the database they looked at, each gene has a set of words associated with it, describing what it does. The IU scientists found that only about 65% of the keywords in orthologous sequence pairs overlapped, on average, whereas for paralogs 100% overlapped. And those are for sequences which are identical! This means that even if we find identical protein sequences in human and in mouse, it does not mean that they have the same molecular function. Paralogs, on the other hand, will generally have more similar functions. So the ortholog conjecture has been stood on its head here: paralogs are the ones that would generally have the same function, whereas orthologs diverge more in function. This holds true down to about 50% sequence identity, where the picture seems to reverse itself. Graph <strong>(A)</strong> depicts the differences in the biological pathway aspect. Here, the differences are even more striking. The paralogs which are 90-100% identical between human and mouse participate in almost exactly the same pathways in both organisms. But for orthologous proteins which are 90-100% identical, the functional similarity is much lower: only about 65%.<br />
<span style="float: left; padding: 5px;"><a href="http://www.researchblogging.org"><img style="border: 0;" src="http://www.researchblogging.org/public/citation_icons/rb2_large_gray.png" alt="ResearchBlogging.org" /></a></span></p>
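<p>To make the two axes a little more concrete, here is a minimal Python sketch. It is an illustration only: the column-wise percent identity and the Jaccard-style keyword overlap below are simple stand-ins chosen for clarity, and are not necessarily the measures used in the paper. The function names, the toy sequences and the annotation terms are all made up for the example.</p>

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over columns where neither sequence has a gap.

    Assumes the two sequences have already been aligned to the same length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def keyword_overlap(terms_a, terms_b):
    """Jaccard overlap: shared annotation terms divided by all terms used by either gene."""
    a, b = set(terms_a), set(terms_b)
    union = a.union(b)
    return len(a.intersection(b)) / len(union) if union else 0.0


# Hypothetical annotations for one human gene and its mouse ortholog.
human_terms = {"oxygen binding", "oxygen transport", "heme binding"}
mouse_terms = {"oxygen binding", "oxygen transport", "iron ion binding"}

print(percent_identity("MVLSPADKTN", "MVLSGEDKSN"))
print(keyword_overlap(human_terms, mouse_terms))
```

<p>Each gene pair then becomes one point: sequence identity on the X-axis, annotation overlap on the Y-axis. Note that two genes with identical sequences can still score well below 1.0 on the second measure if they were annotated differently in the two species.</p>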
<p>So what does this all mean?</p>
<p>First, it means that, at least between human and mouse, paralogs are better predictors of function than orthologs. And why would that be? To answer this question, let&#8217;s look closer at the graphs above. Note that while for paralogs the functional similarity decreases rapidly as sequence identity drops, for orthologs the functional similarity remains roughly the same no matter how similar or different the orthologs are to each other, and even when they are 100% identical their functions vary to some extent! The reason: the experimental study of function in human and in mouse takes place in different contexts. The species-specific context is what causes the differences in annotation, and in the overall function. Also, all the orthologs in the study are of the same age, dating back to the human-mouse lineage split 75 million years ago. The paralogs predate that split, and may be of different ages: the duplication may predate the human / mouse split by 10 million years, 100 million years, or 1 billion years. Thus orthologs, regardless of their actual sequence similarity, have the same age, and paralogs do not. But why should proteins of the same age share the same level of (not so high) functional similarity? The authors of the study reply:</p>
<blockquote><p>While there is no direct role for “time” in evolution that is not tied to mutation, we suggest that what time represents here is the evolution of the cellular context: the sum of the evolutionary changes over all of the directly and indirectly interacting molecules. If this context evolves at a steady rate (i.e. the average amount of functional change among all of the interacting molecules remains relatively constant), then protein function will appear to evolve at a steady rate, a rate largely disconnected from the level of an individual protein&#8217;s sequence divergence. &#8212; <em>PLoS Comput Biol, Vol. 7, No. 6.</em></p></blockquote>
<p>The strongest evidence they find for this hypothesis is that even proteins with 100% sequence identity are annotated differently. To wit:</p>
<blockquote><p>For example, Liao and Zhang [50] found that &gt;20% of genes that are essential for viability in humans are not essential in mouse. It is unlikely that changes to the proteins themselves have made them essential or not, but rather that their context in cellular and organismal networks has evolved. &#8211;<em>ibid.</em></p></blockquote>
<p>The proteins may not have changed substantially, but their environment changed, giving them a different role. Think about changing jobs after moving to a new place where there is no employer providing your exact old job you were used to. You may have been an embedded systems programmer, but now you are a website programmer. So context goes a long way to explain changes in ortholog function.</p>
<p>Interestingly, about a month after the IU paper was published, <a href="http://bib.oxfordjournals.org/content/early/2011/06/16/bib.bbr031.full" target="_blank">another paper</a> from the Robinson-Rechavi lab was published, which also talks about homologs between human and mouse. In this study Gharib and Robinson-Rechavi reviewed previous literature listing several types of functional divergence of orthologs between human and mouse. They had some additional findings. For example, about 11% of the orthologous genes were alternatively spliced, meaning that the end products, proteins, were different between human and mouse.  They also listed specific phenotypic effects: genes which are linked to diseases in humans, but mutations in their mouse orthologs have no effects on mice. They cite studies that found that over 20% of genes which are essential in human are non-essential in mice (an <em>essential gene</em> is just that: if the organism does not have it, or it is mutated, the effects are fatal, and the organism does not develop past very early stages).  Their literature review concluded that 10-20% of ortholog pairs between human and mouse cannot be used for functional transfer. The IU study implies a higher percentage. Both studies conclude that a common practice in molecular evolution studies, the use of orthologs to infer function, should be seriously looked at.</p>
<p>(Full disclosure: Dr. Radivojac &amp; I are collaborators, although our collaboration is unrelated to this study).</p>
<hr />
<p><span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.jtitle=PLoS+Computational+Biology&amp;rft_id=info%3Adoi%2F10.1371%2Fjournal.pcbi.1002073&amp;rfr_id=info%3Asid%2Fresearchblogging.org&amp;rft.atitle=Testing+the+Ortholog+Conjecture+with+Comparative+Functional+Genomic+Data+from+Mammals&amp;rft.issn=1553-7358&amp;rft.date=2011&amp;rft.volume=7&amp;rft.issue=6&amp;rft.spage=0&amp;rft.epage=&amp;rft.artnum=http%3A%2F%2Fdx.plos.org%2F10.1371%2Fjournal.pcbi.1002073&amp;rft.au=Nehrt%2C+N.&amp;rft.au=Clark%2C+W.&amp;rft.au=Radivojac%2C+P.&amp;rft.au=Hahn%2C+M.&amp;rfe_dat=bpr3.included=1;bpr3.tags=Biology%2CBioinformatics%2C+%2C+Computational+Biology%2C+Evolutionary+Biology">Nehrt, N., Clark, W., Radivojac, P., &amp; Hahn, M. (2011). Testing the Ortholog Conjecture with Comparative Functional Genomic Data from Mammals <span style="font-style: italic;">PLoS Computational Biology, 7</span> (6) DOI: <a href="http://dx.doi.org/10.1371/journal.pcbi.1002073" rev="review">10.1371/journal.pcbi.1002073</a></span></p>
<p><span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.jtitle=Briefings+in+Bioinformatics&amp;rft_id=info%3Adoi%2F10.1093%2Fbib%2Fbbr031&amp;rfr_id=info%3Asid%2Fresearchblogging.org&amp;rft.atitle=When+orthologs+diverge+between+human+and+mouse&amp;rft.issn=1467-5463&amp;rft.date=2011&amp;rft.volume=&amp;rft.issue=&amp;rft.spage=&amp;rft.epage=&amp;rft.artnum=http%3A%2F%2Fbib.oxfordjournals.org%2Fcgi%2Fdoi%2F10.1093%2Fbib%2Fbbr031&amp;rft.au=Gharib%2C+W.&amp;rft.au=Robinson-Rechavi%2C+M.&amp;rfe_dat=bpr3.included=1;bpr3.tags=Biology%2CMedicine%2CHealth%2CEvolutionary+Biology%2C+Zoology%2C+Model+Organisms">Gharib, W., &amp; Robinson-Rechavi, M. (2011). When orthologs diverge between human and mouse <span style="font-style: italic;">Briefings in Bioinformatics</span> DOI: <a href="http://dx.doi.org/10.1093/bib/bbr031" rev="review">10.1093/bib/bbr031</a></span></p>
<p><span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.jtitle=Systematic+Zoology&amp;rft_id=info%3Adoi%2F10.2307%2F2412448&amp;rfr_id=info%3Asid%2Fresearchblogging.org&amp;rft.atitle=Distinguishing+Homologous+from+Analogous+Proteins&amp;rft.issn=00397989&amp;rft.date=1970&amp;rft.volume=19&amp;rft.issue=2&amp;rft.spage=99&amp;rft.epage=&amp;rft.artnum=http%3A%2F%2Fwww.jstor.org%2Fstable%2F2412448%3Forigin%3Dcrossref&amp;rft.au=Fitch%2C+W.&amp;rfe_dat=bpr3.included=1;bpr3.tags=Biology%2CSystematics%2C+%2C+Evolutionary+Biology">Fitch, W. (1970). Distinguishing Homologous from Analogous Proteins <span style="font-style: italic;">Systematic Zoology, 19</span> (2) DOI: <a href="http://dx.doi.org/10.2307/2412448" rev="review">10.2307/2412448</a></span></p>
]]></content:encoded>
			<wfw:commentRss>http://bytesizebio.net/index.php/2011/08/26/of-mice-and-men-or-revisiting-the-ortholog-conjecture/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Short bioinformatics hacks: merging fastq files</title>
		<link>http://bytesizebio.net/index.php/2011/08/25/short-bioinformatics-hacks-merging-fastq-files/</link>
		<comments>http://bytesizebio.net/index.php/2011/08/25/short-bioinformatics-hacks-merging-fastq-files/#comments</comments>
		<pubDate>Thu, 25 Aug 2011 15:52:19 +0000</pubDate>
		<dc:creator>Iddo</dc:creator>
				<category><![CDATA[Bioinformatics]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[second generation sequencing]]></category>
		<category><![CDATA[sequencing]]></category>
		<category><![CDATA[short read sequencing]]></category>

		<guid isPermaLink="false">http://bytesizebio.net/?p=5316</guid>
		<description><![CDATA[So you received your mate-paired reads in two different files, and you need to merge them for your assembler. Here is a quick Python script to do that. You will need Biopython installed. &#160; #!/usr/bin/env python from Bio import SeqIO import itertools import sys import os # Copyright(C) 2011 Iddo Friedberg # Released under Biopython [...]]]></description>
			<content:encoded><![CDATA[<p>So you received your mate-paired reads in two different files, and you need to merge them for your assembler. Here is a quick Python script to do that. You will need <a href="http://biopython.org" target="_blank">Biopython</a> installed.</p>
<p>&nbsp;</p>
<pre class="brush:python">#!/usr/bin/env python
from Bio import SeqIO
import sys
import os
# Copyright(C) 2011 Iddo Friedberg
# Released under Biopython license. http://www.biopython.org/DIST/LICENSE
# Do not remove this comment
try:
    # Python 2: itertools.izip pairs the records lazily
    from itertools import izip as zip
except ImportError:
    # Python 3: the built-in zip is already lazy
    pass

def merge_fastq(fastq_path1, fastq_path2, outpath):
    # Interleave the files: each read from file 1 is followed by its mate from file 2
    with open(fastq_path1) as in1, open(fastq_path2) as in2, open(outpath, "w") as outfile:
        fastq_iter1 = SeqIO.parse(in1, "fastq")
        fastq_iter2 = SeqIO.parse(in2, "fastq")
        for rec1, rec2 in zip(fastq_iter1, fastq_iter2):
            SeqIO.write([rec1, rec2], outfile, "fastq")

if __name__ == '__main__':
    # The output name is derived from the first input file
    outpath = "%s.merged.fastq" % os.path.splitext(sys.argv[1])[0]
    merge_fastq(sys.argv[1], sys.argv[2], outpath)</pre>
<p>The neat trick is in the <font face="courier"><b>for</b></font> loop: zipping the two iterators (with <font face="courier"><b>itertools.izip</b></font> on Python 2, or the built-in <font face="courier"><b>zip</b></font> on Python 3) loops over both files in parallel, two fastq records at a time.</p>
<p>How to use this script: save it to a file called <font face="courier"><b>merge_fastq</b></font> (or whatever you like). Then make it executable:</p>
<pre class="brush:bash">
$ chmod +x merge_fastq
</pre>
<p>And you are ready to go.</p>
<pre class="brush:bash">
$ ./merge_fastq myseq_1_.fastq myseq_2_.fastq
</pre>
<p>The merged file will be called <font face="courier">myseq_1_.merged.fastq</font>.</p>
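<p>One caveat with zipping: the loop stops quietly at the end of the shorter file, so a truncated mate file would go unnoticed. The sketch below shows one way to catch that. It is not part of the script above: it assumes Python 3, uses only the standard library instead of Biopython, and assumes plain four-line fastq records (no wrapped sequence lines). The names <font face="courier"><b>read_fastq</b></font> and <font face="courier"><b>interleave_fastq</b></font> are made up for this illustration.</p>
<pre class="brush:python">import itertools

def read_fastq(handle):
    # Yield one record at a time as a list of four newline-terminated lines.
    # Assumption: every record is exactly four lines, with no line wrapping.
    while True:
        record = [line.rstrip("\n") + "\n" for line in itertools.islice(handle, 4)]
        if not record:
            return
        if len(record) != 4:
            raise ValueError("incomplete fastq record at end of file")
        yield record

def interleave_fastq(handle1, handle2, outhandle):
    # Write records alternately from handle1 and handle2; return the number of pairs.
    # zip_longest pads the shorter input with None, which exposes a length mismatch.
    pairs = 0
    for rec1, rec2 in itertools.zip_longest(read_fastq(handle1), read_fastq(handle2)):
        if rec1 is None or rec2 is None:
            raise ValueError("the two files hold different numbers of reads")
        outhandle.writelines(rec1 + rec2)
        pairs += 1
    return pairs</pre>
<p>Raising an error on a mismatch is a design choice; note that the pairs read before the mismatch have already been written by then, so the partial output should be discarded.</p>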
]]></content:encoded>
			<wfw:commentRss>http://bytesizebio.net/index.php/2011/08/25/short-bioinformatics-hacks-merging-fastq-files/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Tweets from AFP/CAFA 2011</title>
		<link>http://bytesizebio.net/index.php/2011/07/23/tweets-from-afpcafa-2011/</link>
		<comments>http://bytesizebio.net/index.php/2011/07/23/tweets-from-afpcafa-2011/#comments</comments>
		<pubDate>Sat, 23 Jul 2011 16:03:03 +0000</pubDate>
		<dc:creator>Iddo</dc:creator>
				<category><![CDATA[Bioinformatics]]></category>
		<category><![CDATA[blogging]]></category>
		<category><![CDATA[Social media]]></category>
		<category><![CDATA[meeting]]></category>

		<guid isPermaLink="false">http://bytesizebio.net/?p=5263</guid>
		<description><![CDATA[The AFP/CAFA 2011 meeting was held on July 15 and July 16. Yes, it was a huge success, and I&#8217;m not just saying that because I am one of the organizers.  I will write up something more comprehensive soon; in the meantime, here are my tweets from the meeting. I am learning a lot about [...]]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://bytesizebio.net/index.php/2011/07/02/cafa-update/" target="_blank">AFP/CAFA 2011</a> meeting was held on July 15 and July 16. Yes, it was a huge success, and I&#8217;m not just saying that because I am one of the organizers.  I will write up something more comprehensive soon; in the meantime, here are my tweets from the meeting.</p>
<p>I am learning a lot about scavenging tweets.  Apparently, I cannot go back beyond a few days using the <strong><a href="https://dev.twitter.com/docs/api/1/get/search" target="_blank">api.search()</a></strong> function. Hence, if I search for the #AFPCAFA11 hashtag I get nothing from the meeting&#8217;s dates. But if I look for a user&#8217;s tweets using <strong>api.user_timeline()</strong> I can go back for months on that user&#8217;s timeline, and then keep only the tweets with the relevant hashtags.  Since it seems I was the principal twiterrer in that meeting, I&#8217;m putting up my tweets here. Apologies to the others who recorded the meeting using Twitter: if you want your tweets included, drop me a line with your Twitter user name.</p>
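<p>The filtering step could look something like the sketch below. It is an illustration only, not the script used for this post: it assumes the timeline has already been fetched (for instance with <strong>api.user_timeline()</strong>) and reduced to (timestamp, text) pairs, and <strong>filter_by_hashtag</strong> is a made-up name.</p>
<pre class="brush:python">from datetime import datetime

def filter_by_hashtag(tweets, hashtag):
    # tweets: iterable of (datetime, text) pairs, already fetched from a timeline.
    # Returns one formatted line per matching tweet, oldest first. A tweet matches
    # when the hashtag appears as a whole word, ignoring case and trailing punctuation.
    wanted = hashtag.lower()
    lines = []
    for created_at, text in sorted(tweets, key=lambda pair: pair[0]):
        words = [word.lower().rstrip(".,;:!?") for word in text.split()]
        if wanted in words:
            stamp = created_at.strftime("%a %b %d %H:%M:%S %Y")
            lines.append("%s %s" % (stamp, text))
    return lines</pre>
<p>Matching on whole words keeps a search for #ISMB from also picking up #ISMB11.</p>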
<p>&nbsp;</p>
<p>Thu Jul 14 16:04:22 2011 At Vienna. #ISMB #AFPCAFA11 #ISMB11<br />
Thu Jul 14 16:07:34 2011 At Vienna #ISMB #AFPCAFA11 Curious who is the best protein function predictor? Join us. http://bit.ly/htv3J7<br />
Fri Jul 15 09:10:24 2011 Jesse Gillis from UBC on a function prediciton post-mortem #AFPCAFA11 #ISMB<br />
Fri Jul 15 09:12:11 2011 This is going to be fun. Jesse Gillis UBC. Postmortem on MouseFunc #ISMB #AFPCAFA11 Precision / Recall of 0.06&#8230; argh.<br />
Fri Jul 15 09:14:14 2011 http://bit.ly/obSISi MouseFunc experiment #AFPCAFA11 #ISMB<br />
Fri Jul 15 09:28:58 2011 Multifunctionality affect prediction profoundly. Take-home message from Jesse Gillis&#8217; talk #AFPCAFA11 #ISMB<br />
Fri Jul 15 09:31:22 2011 Next up: Meghana Chitale from @kiharalab #ISMB #AFPCAFA11<br />
Fri Jul 15 09:36:44 2011 Co-occurrence association scores. CAS Lookiong for associations across GOs: between a BPO term and a CCO term, for example. #AFPCAFA11<br />
Fri Jul 15 09:41:17 2011 Missing enzyme predictions. My fav. Chtiale at #ISMB #AFPCAFA11<br />
Fri Jul 15 09:56:08 2011 Yanay Ofran from Bar ilan U about multifunctionality. How to assess the number of false positives? #ISMB #AFPCAFA11<br />
Fri Jul 15 09:56:44 2011 Prediciton of photosynthesis in an elephant genome is a good sign of false positives. Yanay #ISMB #AFPCAFA11<br />
Fri Jul 15 10:02:56 2011 Precision of short motifs is surprisingly high. Yanay, #ISMB #AFPCAFA11<br />
Fri Jul 15 10:05:42 2011 short motifs identify functional motifs. Whereas homology identifies evolutionary relatedness. #AFPCAFA11 #ISMB<br />
Fri Jul 15 10:09:14 2011 Next up: me. #AFPCAFA11 #ISMB<br />
Fri Jul 15 11:44:45 2011 David Jones from UCL is talking about his #AFPCAFA11 predictions. Many different features. #ISMB<br />
Fri Jul 15 11:47:18 2011 profile-profile fold recognition works well in Function prediction as well. #AFPCAFA11 #ISMB<br />
Fri Jul 15 11:53:02 2011 Jones talking about why it is unhealthy to exercise in the morning. #AFPCAFA11 #ISMB generation of free radicals. #excusesaregreat<br />
Fri Jul 15 11:56:43 2011 49,000 features in an SVM. #ISMB #AFPCAFA11<br />
Fri Jul 15 11:57:47 2011 Hard to believe no redundancy in 49K features&#8230;. #ISMB #AFPCAFA11<br />
Fri Jul 15 11:59:17 2011 &#8220;I like this plot and I would make a t-shirt out of it, but in terms of scientific value its worth is zero&#8221;. Jones, #AFPCAFA11 #ISMB<br />
Fri Jul 15 12:01:09 2011 New term heard for a 2nd time at #ISMB #AFPCAFA11 &#8220;postdiction&#8221; as opposed to &#8220;prediction&#8221;. #notsurewhatitmeans<br />
Fri Jul 15 12:16:49 2011 Lightning talks at #AFPCAFA11 #ISMB starting now<br />
Fri Jul 15 12:35:51 2011 Mary Jo Ondrechen on SALSA at #AFPCAFA11 #ISMB structure &#8211;&gt; function.<br />
Fri Jul 15 12:36:45 2011 CDEHKRY make up 75% of all catalytic sites. Mary Jo #ISMB #AFPCAFA11<br />
Fri Jul 15 12:50:50 2011 Jeffrey Yunes from UC Berkeley on SIFTER from Steven Brenner&#8217;s lab. #AFPCAFA11 #ISMB<br />
Fri Jul 15 14:28:20 2011 Patrik Koskinen on a function prediction method called PANNZER. This will roll over well. #ISMB #AFPCAFA11<br />
Fri Jul 15 14:30:30 2011 More information on PANNZER and the other methods at #AFPCAFA11 here: http://bit.ly/l9ayW9 #ISMB11<br />
Fri Jul 15 14:33:16 2011 Koskinen mentioning Biothesaurus http://bit.ly/rnHdzp which removes errors due to synonyms #AFPCAFA11 #ISMB<br />
Fri Jul 15 14:43:09 2011 Question: &#8220;How do you pronounce the name of that volcano that erupted in Iceland?&#8221; Answer: &#8220;I don&#8217;t&#8221;. Koskinen #AFPCAFA11 #ISMB<br />
Fri Jul 15 14:50:27 2011 Olivier Lichtarge on using Evolutionary Trace Annotation (ETA) for function prediction. #AFPCAFA11 #ISMB<br />
Fri Jul 15 14:51:12 2011 A network of protein structure networks. Memories of fragnostic. #AFPCAFA11 #ISMB<br />
Fri Jul 15 14:52:49 2011 Using network diffusion to annotate protein structures. Lichtargee. #AFPCAFA11 #ISMB<br />
Fri Jul 15 14:57:19 2011 compressing a clique to a star graph by adding a pseudo-node. Reduces problem from O(n^2) to O(n). Lichtarge #ISMB #AFPCAFA11<br />
Fri Jul 15 15:09:17 2011 Amos Bairoch: of prosite, swissprot and expasy fame #AFPCAFA11 #ISMB<br />
Fri Jul 15 15:12:51 2011 Due to alt-splicing and PTM 20,000 human genes &#8211;&gt; 5M different molecules! Bairoch #AFPCAFA11 #ISMB<br />
Fri Jul 15 15:15:50 2011 Status codes of human protein function annotations: Maybe, potentially, putative, expected and hopefullly. Bairoch #ISMB #AFPCAFA11<br />
Fri Jul 15 15:17:10 2011 &gt;100 GPCRs for which we do not know the ligand. Bairoch #ISMB #AFPCAFA11<br />
Fri Jul 15 15:19:13 2011 Bairoch now talking about CALIPHO. 1)experimental verification of human protein function; 2)enable bioinformatics for same. #ISMB #AFPCAFA11<br />
Fri Jul 15 15:20:44 2011 &#8220;How many ppl in this room have never used swissprot&#8221;. 0. #AFPCAFA11 #ISMB<br />
Fri Jul 15 15:23:24 2011 Bairoch looks at small &lt;100aa intracellular protiens in experimental assays. #AFPCAFA11 #ISMB<br />
Fri Jul 15 15:26:26 2011 Interesting proteins are expressed in olfactory pits of zebrafish. I didn&#8217;t know fish smell. #AFPCAFA11 #ISMB<br />
Fri Jul 15 15:33:09 2011 If you get a new function, you cannot predict it, because of no ontology (yet). #AFPCAFA11 #ISMB<br />
Sat Jul 16 06:58:43 2011 Predrag Radivojac explaining the vagaries of GO annotated databases #AFPCAFA11 #ISMB<br />
Sat Jul 16 06:59:48 2011 Assessment of protein function prediction methods going on now in Hall L #ISMB #AFPCAFA11<br />
Sat Jul 16 07:04:08 2011 Radivojac: &#8220;It&#8217;s possible to achieve a precision of 1, it just won&#8217;t happen&#8221;. #AFPCAFA11 #ISMB<br />
Sat Jul 16 07:29:50 2011 Assessment of protein function prediction methods going on now in Hall L #ISMB #AFPCAFA11 http://bit.ly/htv3J7 Sean Mooney is up.<br />
Sat Jul 16 08:05:49 2011 Christine Orengo on her team&#8217;s work at #AFPCAFA11<br />
Sat Jul 16 08:06:28 2011 Orengo says that they are not really function predictors, but evolutionary classifiers #AFPCAFA11 #ISMB<br />
Sat Jul 16 09:18:03 2011 Shaneka Simmon from Jackson State on predicting functions associated with biofeuls &#8211; universal stress protein domains. #AFPCAFA11 #ISMB<br />
Sat Jul 16 09:21:30 2011 Simmons: look for diversity of universal stress response genes. in Rhodopseudomonas palustris. #AFPCAFA11 #ISMB11<br />
Sat Jul 16 11:33:52 2011 http://yfrog.com/kevdmbqj #AFPCAFA11 #ismb discussion panel now<br />
Sat Jul 16 14:46:07 2011 Wyatt shows that paralogs actually give better annotation transfer than orthologs. http://bit.ly/nWShuB #AFPCAFA11 #ISMB<br />
Sat Jul 16 14:47:17 2011 Wyatt&#8217;s claim runs contrary to common wisdom. Which is good <img src='http://bytesizebio.net/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  http://bit.ly/nWShuB #AFPCAFA11 #ISMB<br />
Thu Jul 21 22:42:34 2011 After all the talk about standards at #AFPCAFA11 #ISMB this is very timely: http://xkcd.com/927/</p>
]]></content:encoded>
			<wfw:commentRss>http://bytesizebio.net/index.php/2011/07/23/tweets-from-afpcafa-2011/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>ISMB 2011 tweets</title>
		<link>http://bytesizebio.net/index.php/2011/07/22/ismb-2011-tweets/</link>
		<comments>http://bytesizebio.net/index.php/2011/07/22/ismb-2011-tweets/#comments</comments>
		<pubDate>Fri, 22 Jul 2011 21:59:51 +0000</pubDate>
		<dc:creator>Iddo</dc:creator>
				<category><![CDATA[Bioinformatics]]></category>
		<category><![CDATA[Social media]]></category>
		<category><![CDATA[ISMB]]></category>
		<category><![CDATA[twitter]]></category>

		<guid isPermaLink="false">http://bytesizebio.net/?p=5249</guid>
		<description><![CDATA[ISMB this year had quite a few twiterrers. Hashtag: #ISMB. I tried to collect all the #ISMB tweets, so I wrote my own twitter scavenger script, but it seems to go only 3 days back.  I am not sure if this is a Twitter feature, or something with the library I am using (tweepy) or [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.iscb.org/ismbeccb2011" target="_blank">ISMB</a> this year had quite a few twiterrers. Hashtag: #ISMB. I tried to collect all the #ISMB tweets, so I wrote my own Twitter scavenger script, but it seems to go only 3 days back.  I am not sure whether this is a Twitter feature, something in the library I am using (tweepy), or my own Twitter API programming nOObness. In any case, here are the tweets, July 19-22.  Unfortunately, I could not get those from July 17-19. These are mostly from Michael Ashburner&#8217;s closing keynote.</p>
<p>I will put the code for the application up later.  (Temporary name: Vulture.)</p>
<p>Fri Jul 22 21:00:20 2011 pjacock #BOSC2011 (pre #ISMB SIG) notes by @chapmanb &#8211; Day Two pm: #Semantic Web session, and misc #opensource project session http://t.co/TDjmx6B<br />
Fri Jul 22 20:58:53 2011 pjacock #BOSC2011 (pre #ISMB SIG) notes by @chapmanb &#8211; Day Two am: Amazon&#8217;s Matt Wood @mza keynote, #cloud computing session http://t.co/BaiOz5W<br />
Fri Jul 22 20:55:41 2011 pjacock #BOSC2011 (pre #ISMB SIG) notes by @chapmanb &#8211; Day One pm: Visualization &amp; #NGS sessions http://t.co/JuvLrex<br />
Fri Jul 22 20:54:45 2011 pjacock #BOSC2011 (pre #ISMB SIG) notes by @chapmanb &#8211; Day One am: Larry Hunter&#8217;s keynote, and Genome Content Management session http://t.co/gfdJLv7<br />
Fri Jul 22 20:48:57 2011 pjacock MT @GeneWikiPulse bio-ontologies 2011 &#8211; ontologies need lexicons: I recently returned from my first #ISMB in Vienna, &#8230; http://t.co/kMCNaxh<br />
Fri Jul 22 20:09:38 2011 suganthibala RT @keesvanbochove: Closing remark of Michael Ashburner, a godfather of #bioinformatics, at #ismb: don&#8217;t be proprietary about your research, be open.<br />
Fri Jul 22 17:59:59 2011 YiliangDing RT @GenomeBiology: Ana Conesa: Seq depth affects the relative proportions of RNA classes in RNA-seq datasets #ISMB #genomics<br />
Fri Jul 22 14:20:50 2011 BioCatalogue RT @fiedawn: #ismb #biosharing Carole Goble &#8211; we have a lot of descriptions in the BioCatalogue &#8211; how can we mark up/export to bioDBCore? underway<br />
Fri Jul 22 13:25:30 2011 razoralign RT @GenomeBiology: Read Genome Biology&#8217;s take on ISMB 2011 on the BMC blog http://t.co/dmuhQ7J #genomics #ISMB<br />
Fri Jul 22 11:52:01 2011 druvus RT @keesvanbochove: Good shows how they linked SNPedia with Gene Wiki articles using Semantic MediaWiki. Almost no overlap of gene-disease rels in wikis!! #ismb<br />
Fri Jul 22 11:50:41 2011 druvus RT @_lrr_: Simple yet very handy tool: fastapl (http://bit.ly/rhkKCl), had a poster at #ismb<br />
Fri Jul 22 11:43:00 2011 coding_doc RT @GenomeBiology: Read Genome Biology&#8217;s take on ISMB 2011 on the BMC blog http://t.co/dmuhQ7J #genomics #ISMB<br />
Fri Jul 22 09:07:58 2011 jackwillstone RT @GenomeBiology: Read Genome Biology&#8217;s take on ISMB 2011 on the BMC blog http://t.co/dmuhQ7J #genomics #ISMB<br />
Fri Jul 22 09:00:55 2011 raunakms RT @GenomeBiology: Read Genome Biology&#8217;s take on ISMB 2011 on the BMC blog http://t.co/dmuhQ7J #genomics #ISMB<br />
Fri Jul 22 09:00:46 2011 GenomeMedicine RT @GenomeBiology: Read Genome Biology&#8217;s take on ISMB 2011 on the BMC blog http://t.co/dmuhQ7J #genomics #ISMB<br />
Fri Jul 22 09:00:31 2011 BioMedCentral RT @GenomeBiology: Read Genome Biology&#8217;s take on ISMB 2011 on the BMC blog http://t.co/dmuhQ7J #genomics #ISMB<br />
Fri Jul 22 07:48:42 2011 sharmanedit RT @GenomeBiology Read Genome Biology&#8217;s take on ISMB 2011 on the BMC blog http://t.co/dmuhQ7J #genomics #ISMB<br />
Fri Jul 22 01:08:29 2011 SemanticMW RT @keesvanbochove: Good shows how they linked SNPedia with Gene Wiki articles using Semantic MediaWiki. Almost no overlap of gene-disease rels in wikis!! #ismb<br />
Thu Jul 21 22:42:34 2011 iddux After all the talk about standards at #AFPCAFA11 #ISMB this is very timely: http://xkcd.com/927/<br />
Thu Jul 21 19:30:33 2011 druvus RT @GenomeBiology: Read Genome Biology&#8217;s take on ISMB 2011 on the BMC blog http://t.co/dmuhQ7J #genomics #ISMB<br />
Thu Jul 21 17:10:44 2011 GenomeBiology Read Genome Biology&#8217;s take on ISMB 2011 on the BMC blog http://t.co/dmuhQ7J #genomics #ISMB<br />
Thu Jul 21 15:01:18 2011 bffo on way home 2 YYZ, leaving FRA on @aircanada, ~2 weeks on road. VIE for #ISMB and Kyoto (via KIX) 4 #ICGC. on plane feels like we r home<br />
Thu Jul 21 13:54:11 2011 pjacock To all those who thanked me for #bosc2011 / #ISMB / #eccb2011 (live) tweets, no problem &#8211; it was an interesting experiment in note taking.<br />
Thu Jul 21 12:36:39 2011 _lrr_ Simple yet very handy tool: fastapl (http://bit.ly/rhkKCl), had a poster at #ismb<br />
Thu Jul 21 10:55:34 2011 druvus RT @dullhunk: Are videos of the #ISMB keynotes by Ashburner, Berger, Thornton, Serrano, Troyanskaya and Valencia online anywhere? #bioinformatics<br />
Thu Jul 21 10:54:17 2011 druvus RT @GigaScience: Another applied example of #usegalaxy from Marc Bras, with a pipeline for SNP detection in the grapevine: MAPHiTS http://t.co/yVQ3POt #ISMB<br />
Thu Jul 21 10:44:54 2011 bffo MT @GigaScience: #ISMB write-up in GigaBlog: Final impressions of Vienna: time 2 go w/t (work)flow… http://bit.ly/r6JaN7 #bioinformatics<br />
Thu Jul 21 08:49:27 2011 fredebibs RT @druvus: ISMB/ECCB 2011 proceedings papers are online http://t.co/CkERlvb #ismb<br />
Thu Jul 21 08:01:27 2011 iainh_z RT @GigaScience: Another #ISMB write-up in GigaBlog: Final impressions of Vienna: time to go with the (work)flow… http://t.co/ZVluBMa<br />
Thu Jul 21 05:49:15 2011 heikkil RT @GigaScience: Another applied example of #usegalaxy from Marc Bras, with a pipeline for SNP detection in the grapevine: MAPHiTS http://t.co/yVQ3POt #ISMB<br />
Thu Jul 21 03:47:52 2011 galaxyproject RT @GigaScience: Another applied example of #usegalaxy from Marc Bras, with a pipeline for SNP detection in the grapevine: MAPHiTS http://t.co/yVQ3POt #ISMB<br />
Thu Jul 21 03:45:10 2011 galaxyproject RT @GigaScience: Another #ISMB write-up in GigaBlog: Final impressions of Vienna: time to go with the (work)flow… http://t.co/ZVluBMa<br />
Thu Jul 21 02:43:42 2011 terryogara From chaos, beauty. RT @iGenomics: RW: Rather to fight noise, use noise to make a robust system. Add an oscillator to the system. #ismb<br />
Wed Jul 20 22:03:00 2011 chlalanne RT @druvus: ISMB/ECCB 2011 proceedings papers are online http://t.co/CkERlvb #ismb<br />
Wed Jul 20 21:30:05 2011 jxtx RT @GigaScience: Another #ISMB write-up in GigaBlog: Final impressions of Vienna: time to go with the (work)flow… http://t.co/ZVluBMa<br />
Wed Jul 20 21:25:48 2011 GigaScience Another #ISMB write-up in GigaBlog: Final impressions of Vienna: time to go with the (work)flow… http://t.co/ZVluBMa<br />
Wed Jul 20 20:47:27 2011 aadhyadi RT @ianholmes: &quot;Don&#8217;t be afraid to work with people smarter than you. And work with people you like.&quot; Michael Ashburner, closing keynote, #ismb 2011<br />
Wed Jul 20 20:40:34 2011 druvus ISMB/ECCB 2011 proceedings papers are online http://t.co/CkERlvb #ismb<br />
Wed Jul 20 20:23:34 2011 olgabot @nerdgirls #ismb #ECCB11 was fantastic! Gained deep knowledge in my field, learned about new developments, and made some great friends<br />
Wed Jul 20 20:23:19 2011 _lrr_ RT @RishaNarayan: Really enjoyed the ISMB/ECCB 2011 conference. Many thanks to the organisers. #ismb<br />
Wed Jul 20 20:15:09 2011 lopantano RT @ianholmes: In closing #ismb keynote, Mike Ashburner quoted English philosopher(?) &quot;Science is public knowledge. If it is not open, it is not science.&quot;<br />
Wed Jul 20 19:59:57 2011 _lrr_ Had a great party at flex.at! RT @SaggiSardar: ISMB over for another year. Great conference. Now it&#8217;s time to enjoy Vienna! #ISMB<br />
Wed Jul 20 18:39:15 2011 NatureRevGenet Congrats due to all the #ismb speakers for invariably clear and engaging talks<br />
Wed Jul 20 16:46:36 2011 yeastgenome RT @bffo:Cherry,Ashburner and Blake; Sc, Dm &amp; Mm, at the beginning of GO #ISMB http://yfrog.com/kl70850759j indeed a great closing lecture<br />
Wed Jul 20 16:37:08 2011 dullhunk Are videos of the #ISMB keynotes by Ashburner, Berger, Thornton, Serrano, Troyanskaya and Valencia online anywhere? #bioinformatics<br />
Wed Jul 20 16:00:19 2011 bffo Cherry, Ashburner and Blake; Sc, Dm &amp; Mm, at the beginning of GO #ISMB http://yfrog.com/kl70850759j indeed a great closing lecture<br />
Wed Jul 20 14:12:39 2011 tsucheta RT @pjacock: Michael Ashburner #ISMB tells young scientists to be open in their work. If knowledge is not open, it is not science! #openaccess #opendata<br />
Wed Jul 20 14:00:21 2011 tsucheta RT @ianholmes: &quot;Don&#8217;t be afraid to work with people smarter than you. And work with people you like.&quot; Michael Ashburner, closing keynote, #ismb 2011<br />
Wed Jul 20 13:49:25 2011 avilella RT @ianholmes: Mike Ashburner claims &quot;I never had to look for a job, and I&#8217;ve never been short of funding&quot;. Apparently 1970&#8242;s UK = science paradise #ismb<br />
Wed Jul 20 12:52:39 2011 cbjones1943 RT @ianholmes: Once again I recommend Ashburner&#8217;s book &quot;Won for All&quot; for more such dirt-dishing details of millenial genomics http://t.co/otaIgvZ #ismb<br />
Wed Jul 20 12:46:58 2011 genomeresearch RT @ianholmes: Once again I recommend Ashburner&#8217;s book &quot;Won for All&quot; for more such dirt-dishing details of millenial genomics http://t.co/otaIgvZ #ismb<br />
Wed Jul 20 12:45:48 2011 cshperspectives Ashburner: Gerry Rubin &quot;to his credit&quot; recognized the merit of Venter&#8217;s approach but &quot;Venter nearly screwed us&quot; #ismb via @ianholmes<br />
Wed Jul 20 12:21:24 2011 coding_doc RT @ianholmes: In closing #ismb keynote, Mike Ashburner quoted English philosopher(?) &quot;Science is public knowledge. If it is not open, it is not science.&quot;<br />
Wed Jul 20 12:17:39 2011 gthorisson Cannot believe my luck: flight from #ismb Vienna landed at Heathrhrow 25min early, so able to catch early bus to Oxford to meet colleagues<br />
Wed Jul 20 11:47:34 2011 ClaireAinsworth RT @dgmacarthur: Check out @ianholmes for highlights of what sounds like a brilliant closing #ISMB keynote by Michael Ashburner.<br />
Wed Jul 20 11:44:55 2011 sharmanedit RT @ianholmes: Once again I recommend Ashburner&#8217;s book &quot;Won for All&quot; for more such dirt-dishing details of millenial genomics http://t.co/otaIgvZ #ismb<br />
Wed Jul 20 11:43:29 2011 sharmanedit Follow @ianholmes now for great live tweeting of Michael Ashburner&#8217;s fascinating closing keynote at #ismb 2011<br />
Wed Jul 20 11:30:14 2011 ianholmes Once again I recommend Ashburner&#8217;s book &quot;Won for All&quot; for more such dirt-dishing details of millenial genomics http://t.co/otaIgvZ #ismb<br />
Wed Jul 20 11:28:56 2011 ianholmes Ashburner gave credit to Suzi Lewis for the idea of 1999 Drosophila annotation jamboree &amp; to Celera for funding it: &quot;cheap consulting&quot; #ismb<br />
Wed Jul 20 11:27:59 2011 17thescorpion RT @iGenomics: Michael Ashburner: Try to work with people who are smarter than you are. #ismb<br />
Wed Jul 20 11:27:20 2011 KatherineMejia RT @ianholmes: &quot;Don&#8217;t be afraid to work with people smarter than you. And work with people you like.&quot; Michael Ashburner, closing keynote, #ismb 2011<br />
Wed Jul 20 11:27:18 2011 ianholmes Ashburner&#8217;s early sequencing stories: getting Nature paper for ~5kb of fly DNA. Being scooped after talking about a gene to competitor #ismb<br />
Wed Jul 20 11:24:58 2011 ianholmes Ashburner never intended to go into compbio. Even genetics was an accident. Recounted early adventures w/Phoenix, Vax, ARPAnet, gopher #ismb<br />
Wed Jul 20 11:19:50 2011 dgmacarthur Check out @ianholmes for highlights of what sounds like a brilliant closing #ISMB keynote by Michael Ashburner.<br />
Wed Jul 20 11:19:06 2011 gilbertjacka RT @ianholmes: Mike Ashburner claims &quot;I never had to look for a job, and I&#8217;ve never been short of funding&quot;. Apparently 1970&#8242;s UK = science paradise #ismb<br />
Wed Jul 20 11:18:10 2011 ianholmes Mike Ashburner claims &quot;I never had to look for a job, and I&#8217;ve never been short of funding&quot;. Apparently 1970&#8242;s UK = science paradise #ismb<br />
Wed Jul 20 11:17:08 2011 gawbul RT @ianholmes: Mike Ashburner noted that his own hacker, Aubrey de Grey, achieved &quot;later notoriety as a gerontologist&quot; #ismb<br />
Wed Jul 20 11:16:10 2011 systemsbiology RT @ianholmes: Mike Ashburner turned down by Cambridge..for undergrad deg; took pleasure in turning down Zoology Dept chair 20yrs on #ismb<br />
Wed Jul 20 11:15:23 2011 gawbul RT @ianholmes: Ashburner said Drosophila was his second love affair #ismb<br />
Wed Jul 20 11:15:14 2011 gawbul RT @ianholmes: Ashburner accepted to do Genetics degree at Cambridge. Got married, learned genetics in a Naples lab that summer. Came back did degree #ismb<br />
Wed Jul 20 11:14:05 2011 gawbul RT @ianholmes: Mike Ashburner was turned down by Cambridge Zoology for undergrad degree; took pleasure in turning down Zoology Dept chair 20yrs on #ismb<br />
Wed Jul 20 11:12:58 2011 gawbul RT @ianholmes: In talking about Gene Ontology, Ashburner praised co-founders, as well as Barry Smith: &quot;initially annoying&quot; but &quot;a true philosopher&quot; #ismb<br />
Wed Jul 20 11:12:54 2011 gawbul RT @ianholmes: &quot;Don&#8217;t be afraid to work with people smarter than you. And work with people you like.&quot; Michael Ashburner, closing keynote, #ismb 2011<br />
Wed Jul 20 11:12:51 2011 ianholmes Mike Ashburner noted that his own hacker, Aubrey de Grey, achieved &quot;later notoriety as a gerontologist&quot; #ismb<br />
Wed Jul 20 11:12:27 2011 gawbul RT @ianholmes: In closing #ismb keynote, Mike Ashburner quoted English philosopher(?) &quot;Science is public knowledge. If it is not open, it is not science.&quot;<br />
Wed Jul 20 11:11:51 2011 ianholmes Ashburner&#8217;s #ismb pioneers: Roger Staden (Fred Sanger&#8217;s hacker), Doug Brutlag (Stanford compbio archive), Amos Bairoch (1st bio webserver)<br />
Wed Jul 20 11:07:57 2011 ianholmes &quot;Barry Smith &#8211; you either love him or you hate him&quot; &#8211; Michael Ashburner, closing keynote, #ismb 2011<br />
Wed Jul 20 11:06:55 2011 ianholmes &quot;Don&#8217;t be afraid to work with people smarter than you. And work with people you like.&quot; Michael Ashburner, closing keynote, #ismb 2011<br />
Wed Jul 20 11:05:44 2011 ianholmes In talking about Gene Ontology, Ashburner praised co-founders, as well as Barry Smith: &quot;initially annoying&quot; but &quot;a true philosopher&quot; #ismb<br />
Wed Jul 20 11:04:44 2011 ilpuccio Reordering ideas after #BOSC2011 and #ISMB<br />
Wed Jul 20 11:03:30 2011 ianholmes Ashburner clearly enjoyed his time at CalTech (think this was postdoc?) and mentioned Feynman &#8211; wonder if they met #ismb<br />
Wed Jul 20 11:01:51 2011 ianholmes Ashburner said Drosophila was his second love affair #ismb<br />
Wed Jul 20 11:01:27 2011 ianholmes Ashburner accepted to do Genetics degree at Cambridge. Got married, learned genetics in a Naples lab that summer. Came back did degree #ismb<br />
Wed Jul 20 10:59:57 2011 ianholmes When Ashburner first invited to CalTech &quot;I thought in my ignorance, why go to an Institute of Tech? Didn&#8217;t realize home of Drosophila&quot; #ismb<br />
Wed Jul 20 10:58:42 2011 ianholmes Gerry Rubin &quot;to his great credit&quot; recognized merit of Craig Venter&#8217;s approach to sequencing. But &quot;Venter nearly screwed us&quot; -Ashburner #ismb<br />
Wed Jul 20 10:56:57 2011 ianholmes Mike Ashburner was turned down by Cambridge Zoology for undergrad degree; took pleasure in turning down Zoology Dept chair 20yrs on #ismb<br />
Wed Jul 20 10:56:18 2011 wimufi RT @ianholmes: In closing #ismb keynote, Mike Ashburner quoted English philosopher(?) &quot;Science is public knowledge. If it is not open, it is not science.&quot;<br />
Wed Jul 20 10:53:13 2011 ianholmes Mike Ashburner&#8217;s #ismb keynote concluded with &quot;lessons&quot;. It&#8217;s all about luck. But must exploit that luck. Get a guru/mentor/parent. Be open.<br />
Wed Jul 20 10:51:18 2011 ianholmes In fine (blissfully un-PC) form, Mike Ashburner also relished stories of competition with &quot;the Germans&quot;, &quot;the Spanish&quot; etc. #ismb<br />
Wed Jul 20 10:50:23 2011 ianholmes Ashburner also noted (and NB I&#8217;m not endorsing every quote! but his #ismb keynote WAS juicy) that he&#8217;d never be an American citizen.<br />
Wed Jul 20 10:49:05 2011 ianholmes Mike Ashburner was offered a UK biowar job out of college, but refused. Wonders now if he would&#8217;ve got security clearance. Hopes not. #ismb<br />
Wed Jul 20 10:47:53 2011 ianholmes In closing #ismb keynote, Mike Ashburner quoted English philosopher(?) &quot;Science is public knowledge. If it is not open, it is not science.&quot;<br />
Wed Jul 20 10:45:13 2011 ianholmes To say Gene Ontology received coolly at #ismb Halkidiki in &#8217;97 an understatement; Ashburner &quot;almost laughed out of room&quot; by Sander, Durbin<br />
Wed Jul 20 10:43:23 2011 ianholmes I highly recommend Ashburner&#8217;s book &quot;Won for All&quot;. Candid autobiography of the millenial corporate genome wars http://t.co/otaIgvZ #ismb #fb<br />
Wed Jul 20 10:43:10 2011 druvus I have enjoyed following #ismb tweets from my sunny garden<br />
Wed Jul 20 10:40:34 2011 ianholmes I *really* enjoyed Michael Ashburner&#8217;s closing #ISMB keynote. I was unable to tweet at the time but will record some highlights.<br />
Wed Jul 20 10:40:11 2011 ianholmes Nice &#8211;&gt; @nutrigenomics @iGenomics Ron Weiss: Don&#8217;t fight noise; use it to make gene circuits robust. Add an oscillator to the system. #ismb<br />
Wed Jul 20 10:33:54 2011 ianholmes RT @iGenomics: Michael Ashburner: Try to work with people who are smarter than you are. #ismb<br />
Wed Jul 20 10:33:51 2011 ianholmes RT @iGenomics: Michael Ashburner: Science is public knowledge. If it is not open, it is not science. #ismb<br />
Wed Jul 20 08:59:25 2011 samuellampa RT @iGenomics: A Bioinformatics training resource: http://www.biotnet.org/ #ismb<br />
Wed Jul 20 08:37:30 2011 biomol_info RT @iGenomics: A Bioinformatics training resource: http://www.biotnet.org/ #ismb<br />
Wed Jul 20 08:25:30 2011 gthorisson On my way home from #ISMB. Vienna airport is crowded, stuffy and not very pleasant. But, it has free wireless Internet so it ROX.<br />
Wed Jul 20 06:56:48 2011 bffo block dates for #ISMB next year (and surrounding dates for SIGs): July 15-17 2012 , this year was great, c u all in LA next year!<br />
Wed Jul 20 06:28:06 2011 iliasSkalk RT @iGenomics: Michael Ashburner: Try to work with people who are smarter than you are. #ismb<br />
Wed Jul 20 06:27:46 2011 iliasSkalk RT @iGenomics: Michael Ashburner: Science is public knowledge. If it is not open, it is not science. #ismb<br />
Wed Jul 20 01:33:31 2011 yeastgenome RT @gthorisson: Fantastic #ISMB keynote from Michael Ashburner. What an inspiration. Key msgs: &#8216;collaborate with people brighter than you&#8217; and &#8216;openness&#8217;<br />
Wed Jul 20 01:32:27 2011 yeastgenome RT @pjacock: Michael Ashburner #ISMB tells young scientists to be open in their work. If knowledge is not open, it is not science! #openaccess #opendata<br />
Wed Jul 20 00:00:26 2011 robymuhamad RT @timjph: @iGenomics here&#8217;s report I mentioned: 3.8b$ on human genome project; 310k jobs created &amp; ~800b$ economic impact http://j.mp/rbXSbk #ismb<br />
Tue Jul 19 23:53:57 2011 robaganrab RT @iGenomics: Michael Ashburner: Chance happens in life. You have to exploit it #ismb<br />
Tue Jul 19 23:38:41 2011 scroeser RT @_lrr_: Michael Ashburner: &quot;If knowledge is not public it is not science. Period.&quot; #ismb #eccb11 #openscience<br />
Tue Jul 19 22:53:00 2011 ematsen RT @iddux: HumanN, a new metagenomic analysis package from Huttenhower http://j.mp/pHkWhk #ISMB<br />
Tue Jul 19 22:36:32 2011 SaggiSardar ISMB over for another year. Great conference. Now it&#8217;s time to enjoy Vienna! #ISMB<br />
Tue Jul 19 22:30:06 2011 aaronquinlan RT @iGenomics: Michael Ashburner: Follow Syndey Brenner, he will not use slides because a good phrase is worth thousand of Powerpoint slides. #ismb<br />
Tue Jul 19 21:09:21 2011 aadhyadi RT @iGenomics: Michael Ashburner: Try to work with people who are smarter than you are. #ismb<br />
Tue Jul 19 21:09:00 2011 chofski RT @SUPERFAMILY: Congratulations to Adam Sarder on winning the outstanding poster award at #ISMB (it&#8217;s poster F22 if you want to see it!)<br />
Tue Jul 19 20:42:57 2011 wspoonr RT @pjacock: Ashburner #ISMB tells young scientists to be open in their work. If knowledge is not open, it is not science! #openscience<br />
Tue Jul 19 19:36:10 2011 simon_andrews After all of the #ISMB delegates have left, our hotel wifi mysteriously starts working again.<br />
Tue Jul 19 19:14:33 2011 KamounLab RT @pjacock: Michael Ashburner #ISMB tells young scientists to be open in their work. If knowledge is not open, it is not science! #openaccess #opendata<br />
Tue Jul 19 19:14:24 2011 KamounLab RT @iGenomics: Michael Ashburner: Try to work with people who are smarter than you are. #ismb<br />
Tue Jul 19 18:57:14 2011 julientap RT @iddux: HumanN, a new metagenomic analysis package from Huttenhower http://j.mp/pHkWhk #ISMB<br />
Tue Jul 19 18:49:04 2011 alalejandro RT @iGenomics: Michael Ashburner: Science is public knowledge. If it is not open, it is not science. #ismb<br />
Tue Jul 19 18:48:56 2011 alalejandro RT @gthorisson: Fantastic #ISMB keynote from Michael Ashburner. What an inspiration. Key msgs: &#8216;collaborate with people brighter than you&#8217; and &#8216;openness&#8217;<br />
Tue Jul 19 18:48:07 2011 alalejandro If only people would understand how essential this is! MT @keesvanbochove Ashburner #ismb: don&#8217;t be proprietary abt your research, be open.<br />
Tue Jul 19 18:46:17 2011 RishaNarayan Really enjoyed the ISMB/ECCB 2011 conference. Many thanks to the organisers. #ismb<br />
Tue Jul 19 18:42:56 2011 caseybergman RT @gthorisson: Fantastic #ISMB keynote from Michael Ashburner. What an inspiration. Key msgs: &#8216;collaborate with people brighter than you&#8217; and &#8216;openness&#8217;<br />
Tue Jul 19 18:40:34 2011 bergmanlab RT @olgabot: #ismb &quot;If knowledge is not public it is not science&quot; &#8211; Michael Ashburner<br />
Tue Jul 19 18:40:33 2011 caseybergman RT @olgabot: #ismb &quot;If knowledge is not public it is not science&quot; &#8211; Michael Ashburner<br />
Tue Jul 19 18:31:02 2011 iddux HumanN, a new metagenomic analysis package from Huttenhower http://j.mp/pHkWhk #ISMB<br />
Tue Jul 19 18:19:10 2011 iddux @_lrr_: @fionabrinkman learned at #ismb that what we call &quot;analogs&quot; today used to be called &quot;orthologs&quot; by 1s… (cont) http://deck.ly/~yFYyV<br />
Tue Jul 19 18:18:07 2011 iddux @_lrr_: @fionabrinkman learned at #ismb that what we call &quot;orthologs&quot; today used to be called &quot;analogs&quot; by 1st evolutionary scientists.<br />
Tue Jul 19 17:27:04 2011 4a6a5a RT @simon_andrews: 2 people have now confused me with Simon Anders (DeSeq etc) I should learn more about RNASeq to be able to bluff better. #ismb<br />
Tue Jul 19 17:24:43 2011 Accelrys RT @Fabio11211 #ISMB conf draws to an end! Strong presence &amp; interest at Accelrys booth for NextGenSeq &amp; Protein Modelling solutions<br />
Tue Jul 19 17:19:48 2011 Chris_Evelo MT @pjacock: Gary Bader suggested http://bit.ly/mPS9LS for shared #bioinformatics tutorials, eg has introduction Cytoscape #ISMB Workshop<br />
Tue Jul 19 17:17:55 2011 suknamgoong RT @iGenomics: Michael Ashburner: Try to work with people who are smarter than you are. #ismb<br />
Tue Jul 19 17:16:43 2011 cytoscape RT @pjacock: Gary Bader suggested http://t.co/jNFVbjr as one project to shared #bioinformatics tutorials, eg has introduction Cytoscape #ISMB Workshop<br />
Tue Jul 19 17:02:34 2011 BetaScience @bgood Wish I was at #ismb with you. I have some great memories from that conference and #Vienna. #pre-kids<br />
Tue Jul 19 16:55:50 2011 betsyrolland RT @pjacock: Michael Ashburner #ISMB tells young scientists to be open in their work. If knowledge is not open, it is not science! #openaccess #opendata<br />
Tue Jul 19 16:50:41 2011 jpatanooga RT @keesvanbochove: Closing remark of Michael Ashburner, a godfather of #bioinformatics, at #ismb: don&#8217;t be proprietary about your research, be open.<br />
Tue Jul 19 16:22:44 2011 gailfdavies RT @olgabot: #ismb &quot;If knowledge is not public it is not science&quot; &#8211; Michael Ashburner<br />
Tue Jul 19 16:20:55 2011 iGenomics @westr Yes, indeed. Michael Ashburner borrowed it to tell 3 life lessons. #ismb<br />
Tue Jul 19 16:18:38 2011 pierrepo RT @gthorisson: Fantastic #ISMB keynote from Michael Ashburner. What an inspiration. Key msgs: &#8216;collaborate with people brighter than you&#8217; and &#8216;openness&#8217;<br />
Tue Jul 19 16:12:37 2011 FigShare RT @olgabot: #ismb &quot;If knowledge is not public it is not science&quot; &#8211; Michael Ashburner<br />
Tue Jul 19 16:12:19 2011 tweetruth RT @pjacock: Michael Ashburner #ISMB tells young scientists to be open in their work. If knowledge is not open, it is not science! #openaccess #opendata<br />
Tue Jul 19 16:11:49 2011 FigShare RT @keesvanbochove: Closing remark of Michael Ashburner, a godfather of #bioinformatics, at #ismb: don&#8217;t be proprietary about your research, be open.<br />
Tue Jul 19 16:11:22 2011 kshameer RT @antidemagogue: MA: Knowledge which is not public is not science, period. #ismb #bioinformatics #genomics<br />
Tue Jul 19 16:11:08 2011 iscbsc for those student in Vienna, at 8 at stephanplatz #scs2011 #ISMB<br />
Tue Jul 19 16:11:04 2011 kshameer RT @keesvanbochove: Closing remark of Michael Ashburner, a godfather of #bioinformatics, at #ismb: don&#8217;t be proprietary about your research, be open.<br />
Tue Jul 19 16:04:14 2011 c_rdzl RT @iGenomics: Michael Ashburner: Science is public knowledge. If it is not open, it is not science. #ismb<br />
Tue Jul 19 16:03:00 2011 ffalconi RT @iGenomics: Michael Ashburner: Chance happens in life. You have to exploit it #ismb<br />
Tue Jul 19 16:01:34 2011 science3point0 RT @pjacock: Michael Ashburner #ISMB tells young scientists to be open in their work. If knowledge is not open, it is not science! #openaccess #opendata<br />
Tue Jul 19 16:01:01 2011 GigaScience RT @pjacock: Michael Ashburner #ISMB tells young scientists to be open in their work. If knowledge is not open, it is not science! #openaccess #opendata<br />
Tue Jul 19 15:59:27 2011 pjacock For 2012, #ISMB in Long Beach, USA, and #ECCB in Basel, Switzerland<br />
Tue Jul 19 15:59:21 2011 keesvanbochove RT @timjph: @iGenomics here&#8217;s report I mentioned: 3.8b$ on human genome project; 310k jobs created &amp; ~800b$ economic impact http://j.mp/rbXSbk #ismb<br />
Tue Jul 19 15:59:20 2011 fredebibs RT @gthorisson: Fantastic #ISMB keynote from Michael Ashburner. What an inspiration. Key msgs: &#8216;collaborate with people brighter than you&#8217; and &#8216;openness&#8217;<br />
Tue Jul 19 15:55:20 2011 pjacock RT @gthorisson: Fantastic #ISMB keynote from Michael Ashburner. What an inspiration. Key msgs: &#8216;collaborate with people brighter than you&#8217; and &#8216;openness&#8217;<br />
Tue Jul 19 15:54:42 2011 keesvanbochove Closing remark of Michael Ashburner, a godfather of #bioinformatics, at #ismb: don&#8217;t be proprietary about your research, be open.<br />
Tue Jul 19 15:54:34 2011 pjacock Embarrassingly many of the #ISMB / #eccb2011 prize winners haven&#8217;t stayed for the prize announcements in the closing session<br />
Tue Jul 19 15:52:34 2011 _lrr_ Michael Ashburner: &quot;If knowledge is not public it is not science. Period.&quot; #ismb #eccb11 #openscience<br />
Tue Jul 19 15:52:31 2011 gthorisson RT @antidemagogue: MA: Knowledge which is not public is not science, period. #ismb<br />
Tue Jul 19 15:52:26 2011 ajordens RT @iGenomics: Michael Ashburner: Try to work with people who are smarter than you are. #ismb<br />
Tue Jul 19 15:51:54 2011 gthorisson Fantastic #ISMB keynote from Michael Ashburner. What an inspiration. Key msgs: &#8216;collaborate with people brighter than you&#8217; and &#8216;openness&#8217;<br />
Tue Jul 19 15:49:33 2011 pjacock RT @iGenomics: Michael Ashburner: Chance happens in life. You have to exploit it #ismb<br />
Tue Jul 19 15:49:30 2011 pjacock RT @iGenomics: Michael Ashburner: Try to work with people who are smarter than you are. #ismb<br />
Tue Jul 19 15:48:53 2011 pjacock #ISMB #killerapp award goes to Syed Asad Rahman for EC-BLAST which searches for similar enzymes based on chemical knowledge<br />
Tue Jul 19 15:48:08 2011 antidemagogue MA: Knowledge which is not public is not science, period. #ismb<br />
Tue Jul 19 15:48:02 2011 iGenomics Michael Ashburner: Chance happens in life. You have to exploit it #ismb<br />
Tue Jul 19 15:46:51 2011 pjacock RT @iGenomics: Michael Ashburner: Science is public knowledge. If it is not open, it is not science. #ismb<br />
Tue Jul 19 15:46:35 2011 druvus RT @pjacock: Michael Ashburner #ISMB tells young scientists to be open in their work. If knowledge is not open, it is not science! #openaccess #opendata<br />
Tue Jul 19 15:46:34 2011 antidemagogue MA: Exploit your luck. Collaborate. Communicate research openly. #ismb<br />
Tue Jul 19 15:46:21 2011 iGenomics Michael Ashburner: Try to work with people who are smarter than you are. #ismb<br />
Tue Jul 19 15:45:45 2011 pjacock Michael Ashburner #ISMB tells young scientists to be open in their work. If knowledge is not open, it is not science! #openaccess #opendata<br />
Tue Jul 19 15:45:37 2011 olgabot #ismb &quot;If knowledge is not public it is not science&quot; &#8211; Michael Ashburner<br />
Tue Jul 19 15:45:17 2011 iGenomics Michael Ashburner: Science is public knowledge. If it is not open, it is not science. #ismb<br />
Tue Jul 19 15:44:55 2011 biobrichter #ISMB ashburner talk. informative, brings back the memories of the seq 90&#8217;s, but very ashburner-centric. True:collab with brighter people!<br />
Tue Jul 19 15:44:26 2011 pjacock Michael Ashburner wrapping up his #ISMB keynote, describes career as a random walk. Recommends collaborating with people brighter than you!<br />
&gt;&gt;&gt;</p>
]]></content:encoded>
			<wfw:commentRss>http://bytesizebio.net/index.php/2011/07/22/ismb-2011-tweets/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

