Displaying posts tagged with “second generation sequencing”

Short bioinformatics hacks: merging fastq files

So you received your mate-paired reads in two different files, and you need to merge them for your assembler. Here is a quick Python script to do that. You will need Biopython installed.

    #!/usr/bin/env python
    from Bio import SeqIO
    import itertools
    import sys
    import os
    # Copyright(C) 2011 Iddo Friedberg
    # Released under Biopython […]
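The script in this excerpt is cut off, so what follows is not the author's code. It is a minimal sketch of the same idea (interleaving two mate files into one) using only the standard library rather than Biopython. The function names `read_fastq` and `interleave_fastq` are made up for illustration, and the sketch assumes plain four-line FASTQ records with no blank lines, and that both files list the mates in the same order.

```python
import itertools


def read_fastq(handle):
    """Yield (header, sequence, plus, quality) tuples from a FASTQ handle.

    Assumes strict 4-line records; stops at the first empty header line.
    """
    while True:
        record = [handle.readline().rstrip("\n") for _ in range(4)]
        if not record[0]:
            return
        yield tuple(record)


def interleave_fastq(forward_path, reverse_path, out_path):
    """Write forward and reverse mates alternately; return the number of pairs."""
    pairs = 0
    with open(forward_path) as fwd, open(reverse_path) as rev, \
            open(out_path, "w") as out:
        both = itertools.zip_longest(read_fastq(fwd), read_fastq(rev))
        for f_rec, r_rec in both:
            # zip_longest pads the shorter input with None, which lets us
            # detect files that do not contain the same number of reads.
            if f_rec is None or r_rec is None:
                raise ValueError("input files contain different numbers of reads")
            out.write("\n".join(f_rec) + "\n")
            out.write("\n".join(r_rec) + "\n")
            pairs += 1
    return pairs
```

Calling `interleave_fastq("reads_1.fastq", "reads_2.fastq", "merged.fastq")` (hypothetical file names) would write read 1 of each pair followed by its mate. Check what layout your assembler expects before relying on this ordering.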

Why it’s hard to assemble repetitive DNA regions

So here are EssOh and OhOne assembling a rather frustrating puzzle containing cows. The same 5-6 cow “characters” are repeated, which is a perfect way to illustrate low-complexity DNA sequences, and why they are hard to assemble, especially when the pieces are small, like those you get from some second generation sequencers.
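To put a rough number on the puzzle analogy, here is a toy sketch (not from the original post) that cuts a sequence into every possible read of a given length and reports what fraction of those reads could have come from more than one position. The function name `ambiguous_read_fraction` and the toy sequence are invented for illustration, and real reads also carry errors and uneven coverage, which this ignores.

```python
from collections import Counter


def ambiguous_read_fraction(genome, read_length):
    """Fraction of error-free reads of this length that occur at more than one position."""
    reads = [genome[i:i + read_length]
             for i in range(len(genome) - read_length + 1)]
    counts = Counter(reads)
    ambiguous = sum(1 for read in reads if counts[read] > 1)
    return ambiguous / len(reads)


# Toy sequence: a short "CA" repeat sitting between two flanking stretches.
toy = "GATTC" + "CACACACA" + "TGGAT"

for length in (3, 10):
    print(length, ambiguous_read_fraction(toy, length))
```

Short reads that fall entirely inside the repeat look identical, like the cow pieces. Reads long enough to reach into the flanking sequence can be placed without ambiguity.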

Closing gaps

Geek alert: this post is for coders. So you sequenced your genome, reached an optimally small number of contigs, they look sane, and now you would like to see what you need for the finishing stage. Namely, how many gaps you have and what their sizes are. UPDATE: “might just be worth clarifying this is for […]
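The excerpt stops before the code, so this is only a guess at the general approach and not the post's own script. The sketch assumes gaps are written as runs of `N` in the assembled sequence, which the excerpt does not confirm, and the function name `gap_sizes` is hypothetical.

```python
import re


def gap_sizes(sequence):
    """Return a (start, length) tuple for each run of N (either case).

    Assumption: a gap is any uninterrupted stretch of N characters.
    """
    return [(match.start(), match.end() - match.start())
            for match in re.finditer(r"[Nn]+", sequence)]


gaps = gap_sizes("ACGTNNNNACGTNNAC")
print(len(gaps), "gaps:", gaps)
```

Start positions here are 0-based. To run this on a real assembly you would first read each sequence from the FASTA file, for example with Biopython's `SeqIO.parse`, and pass `str(record.seq)` to the function.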