Sequencing the frog that can save lives

TL;DR: The genome sequence of the North American Wood Frog will tell us a lot about the genetic control of freezing and reanimating whole organisms. My friend and colleague, Dr. Andor Kiss, is crowdfunding this project. If you would like to help, please go to experiment.com. You will get acknowledged by name in the paper. To […]

New Links between Bacteria and Cancer

Microbiology and Cancer: Cancer and microbiology have been closely linked for over 100 years. Cancer patients are usually immunosuppressed due to chemotherapy, requiring special treatment and conditions to prevent bacterial infection. Bladder cancer is typically treated with inactivated tuberculosis bacteria to induce an inflammatory response which turns against remaining cancer cells, with remarkably effective results. Also, viruses are […]

Dirty Genomics

Short bioinformatics hacks: merging fastq files

So you received your mate-paired reads in two different files, and you need to merge them for your assembler. Here is a quick Python script to do that. You will need Biopython installed.   #!/usr/bin/env python from Bio import SeqIO import itertools import sys import os # Copyright(C) 2011 Iddo Friedberg # Released under Biopython […]
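Since the excerpt cuts the script off, here is a minimal sketch of the same idea: interleaving two paired FASTQ files so each forward read is followed by its mate. This is not the original Biopython script. It uses only the standard library, and the function names `read_fastq` and `interleave_fastq` are made up for illustration. It assumes plain four-line FASTQ records with no blank lines, and that both files list the reads in the same order.

```python
import itertools


def read_fastq(handle):
    """Yield (header, sequence, separator, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        record = [handle.readline().rstrip("\n") for _ in range(4)]
        if not record[0]:  # empty header line: end of input (assumes no blank lines)
            return
        yield tuple(record)


def interleave_fastq(forward, reverse, out):
    """Write forward/reverse records alternately to out; return the number of pairs."""
    pairs = 0
    both = itertools.zip_longest(read_fastq(forward), read_fastq(reverse))
    for fwd, rev in both:
        if fwd is None or rev is None:
            raise ValueError("input files hold different numbers of reads")
        for record in (fwd, rev):
            out.write("\n".join(record) + "\n")
        pairs += 1
    return pairs
```

To use it, open the two input files and an output file with `open(...)` and pass the three handles to `interleave_fastq`. The sketch does not check that read names match between the files; it only checks that the read counts agree.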

John Smith’s genome sequenced

Springville University’s Genome Center, in collaboration with Prof. I. M. A. Bigschotte from IvyLeague University, has announced that the genome of Mr. John Smith from Centertown, USA has been sequenced and is now available online. Dr. James Williams, director of the Center, said: “We were running out of things to sequence, but I still had […]

Closing gaps

Geek alert: this post is for coders. So you sequenced your genome, reached an optimally small number of contigs, they look sane, and now you would like to see what you need for the finishing stage. Namely, how many gaps you have and what their sizes are. UPDATE: “might just be worth clarifying this is for […]
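The excerpt stops before the code, so here is a rough sketch of how gap counting could look. It is not the post's own script. It assumes the sequences are in FASTA format and that gaps are marked as runs of N (upper or lower case), which the truncated excerpt does not confirm. The names `read_fasta` and `find_gaps` are hypothetical.

```python
import re

GAP = re.compile(r"[Nn]+")  # assumption: a gap is any run of N characters


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA stream."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def find_gaps(sequence):
    """Return a (start, length) tuple for each gap, with 0-based start positions."""
    return [(m.start(), m.end() - m.start()) for m in GAP.finditer(sequence)]
```

Looping over `read_fasta(open(...))` and calling `find_gaps` on each sequence gives the number of gaps per sequence (the list length) and their sizes (the second element of each tuple).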

Bioinformatics Blog Carnival #1

Yes! Why should the evolution people have all the fun with their blog carnival? (After all, it is only a theory.) It’s time for bioinformaticians to show what we are made of, and to have a carnival of our own. Bio::blogs had a good run some time ago. I decided to reconnect what is hopefully […]

Blogosphere catches: Marco Island, finding Ada and blog carnivals

Some interesting events cropped up recently. The Marco Island Advances in Genome Biology and Technology meeting was heavily tweeted and blogged about. Pacific Biosciences unveiled their third generation sequencer. Ostensibly, it can sequence reads of length 20,000, but the fraction of actual long reads in a run and their quality are still a bit hazy. […]

Videos on sequencing

A few cool vids on sequencing. Company infomercials, but still entertaining and informative. Thanks to my student, David Ream, for finding these. Videos: Pyrosequencing; Helicos; SOLiD; BASE™ nanopore sequencing.

Challenges with Data Quality, Sharing, and Versioning in Next-Generation Sequencing

A fine talk by David Dooling highlighting some of the false impressions about second generation sequencing. A partial list: why sequencing quality trumps base pair output; why genomes are really probabilities rather than strings; why centralized repositories break down when it comes to second generation sequencing data. Collaborative software development and versioning has been moving […]

Post-apres-next generation sequencing

<RANT> OK, I don’t like the term “next generation sequencing”. It is a relative term, points at a changing target, and is therefore inexact. What is the “this-generation sequencing”? Sanger? 454 used to be “next generation”, but now 454 sequencing went from 100bp to 600bp per read, making it qualitatively different... so “post-next generation sequencing”? If […]