I have a set of paired reads sequenced on a SOLiD 4 (50 bp per mate). I discovered that the reads are contaminated with E. coli. My strategy is to align the reads against the reference genome and against the E. coli genomes, then keep the reads that align to the reference and the reads that do not align to E. coli, respectively.
My question is: what is the better way to extract the paired reads, from the SAM file after alignment or during the alignment itself? I use SHRiMP, which provides the parameters --al (aligned reads) and --un (unaligned reads).
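
To make the SAM-file option concrete, here is a minimal Python sketch of how pairs could be separated using the FLAG column. It relies only on the standard SAM flag bits (0x1 paired, 0x4 read unmapped, 0x8 mate unmapped). The function names and the demo records are hypothetical, and it assumes the aligner sets the mate-unmapped bit correctly, which should be checked on real SHRiMP output.

```python
import io

# Standard SAM FLAG bits (from the SAM specification).
PAIRED = 0x1
UNMAPPED = 0x4
MATE_UNMAPPED = 0x8


def classify_pair(flag):
    """Classify one paired record by the alignment state of both mates."""
    read_unmapped = bool(flag & UNMAPPED)
    mate_unmapped = bool(flag & MATE_UNMAPPED)
    if read_unmapped and mate_unmapped:
        return "both_unaligned"
    if not read_unmapped and not mate_unmapped:
        return "both_aligned"
    return "one_aligned"


def split_sam(handle):
    """Group read-pair names from a SAM stream into three sets."""
    groups = {"both_aligned": set(), "both_unaligned": set(), "one_aligned": set()}
    for line in handle:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        flag = int(fields[1])
        if not flag & PAIRED:  # ignore unpaired reads
            continue
        groups[classify_pair(flag)].add(fields[0])
    return groups


# Hypothetical demo records (not real data): three pairs, one per category.
demo = io.StringIO(
    "@HD\tVN:1.0\n"
    "p1\t99\tchr1\t100\t60\t50M\t=\t300\t250\t*\t*\n"
    "p1\t147\tchr1\t300\t60\t50M\t=\t100\t-250\t*\t*\n"
    "p2\t77\t*\t0\t0\t*\t*\t0\t0\t*\t*\n"
    "p2\t141\t*\t0\t0\t*\t*\t0\t0\t*\t*\n"
    "p3\t73\tchr1\t500\t60\t50M\t=\t500\t0\t*\t*\n"
    "p3\t133\tchr1\t500\t0\t*\t=\t500\t0\t*\t*\n"
)
result = split_sam(demo)
```

With this approach, pairs in "both_aligned" from the reference alignment and pairs in "both_unaligned" from the E. coli alignment would be the ones to keep. How to treat pairs where only one mate aligns is a separate decision.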