4.2 years ago
Vasiliy Krestov
I have a .sam file with reads mapped to a plasmid and a bacterial chromosome. I used bowtie to map the reads and allowed multi-mapping.
How can I exclude reads that aligned to both the plasmid and the chromosome from the .sam file? I want to do this to avoid biases caused by differing plasmid copy numbers.
This would likely need a custom program. You will need to name-sort your BAM file and then walk through it to find reads that aligned to both the chromosome and the plasmid (I assume each was a separate entry in the reference) and drop those reads/lines.
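As a rough illustration of the "walk through a name-sorted file" idea, here is a minimal Python sketch that works on SAM text. It is not a tested tool: the function name filter_cross_reference_reads is hypothetical, and it assumes single-end reads, input in which all records of a read are adjacent (e.g. after samtools sort -n), and a reference in which the plasmid and the chromosome are separate entries, so a read is dropped whenever its alignments name more than one reference in column 3. Multi-mappers confined to one reference are kept.

```python
import itertools


def filter_cross_reference_reads(sam_lines):
    """Yield SAM lines, dropping reads aligned to more than one reference.

    Assumption: all records of a read are adjacent (name-sorted input) and
    plasmid/chromosome are distinct RNAME values (column 3).
    """
    def read_name(line):
        # Header lines start with "@"; group them under the key None.
        return None if line.startswith("@") else line.split("\t", 1)[0]

    for name, group in itertools.groupby(sam_lines, key=read_name):
        records = list(group)
        if name is None:
            yield from records  # pass header lines through unchanged
            continue
        # Collect the distinct references this read aligned to ("*" = unmapped).
        refs = {rec.split("\t")[2] for rec in records}
        refs.discard("*")
        if len(refs) <= 1:
            yield from records  # single-reference multi-mappers are kept


# Possible use as a filter (hypothetical invocation):
#   import sys
#   sys.stdout.writelines(filter_cross_reference_reads(sys.stdin))
```

With paired-end data the two mates share a read name, so a pair split across the plasmid and the chromosome would also be dropped; whether that is desirable depends on the analysis.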
A cruder way would be to isolate columns 1 and 3 and then sort | uniq them. Then keep read entries that occur only once.
For bowtie it is pretty simple, as you can tune the alignment parameters to not allow multimappers in the SAM file; see http://bowtie-bio.sourceforge.net/manual.shtml#bowtie-options-m
Alternatively, multimappers have low MAPQ scores, so filtering on MAPQ (say, 10 and above) should remove multimappers as well, e.g.
samtools view -q 10 -o out.sam in.sam
I think I can't use it, because I also want to retain multi-mappers that occur only within the chromosome or only within the plasmid. I don't want to exclude all multi-mappers :)