Hi, I have a BAM file that contains all alignments generated for an assembly, including those not used in variant calling (non-PF, unmapped, and duplicate reads). How can I filter out these unwanted alignments? I know that Picard MarkDuplicates can be used to remove duplicates. Thank you.
You can use samtools to do this. For example, to remove reads that did not align (-F excludes reads that have the given flag set), you can do:
samtools view -F 0x04 -b in.bam > out.aligned.bam
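To keep only paired reads, use -f instead (it keeps reads that have the given flag set) with the paired-read flag 0x1, or 0x2 for properly paired reads.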
Check the other bitwise flags on this page.
But you might not want to exclude those as they could be used for finding structural variations.
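For the three categories named in the question, the flags can be combined into a single mask. This is only a sketch: the bit values (0x4 unmapped, 0x200 failed vendor QC, 0x400 duplicate) come from the SAM flag definitions, the output name out.filtered.bam is made up for illustration, and the duplicate bit is only useful if duplicates were already marked (e.g. by Picard MarkDuplicates).

```shell
# SAM flag bits to exclude
UNMAPPED=0x4      # read did not align
QCFAIL=0x200      # read failed vendor quality checks (non-PF)
DUPLICATE=0x400   # read was marked as a PCR/optical duplicate

# OR the bits together into one mask (0x604)
MASK=$(( $UNMAPPED | $QCFAIL | $DUPLICATE ))
echo "exclude mask: $MASK"

# Run the filter only if samtools and the input file are present.
# out.filtered.bam is a hypothetical output name.
if command -v samtools >/dev/null 2>&1 && [ -f in.bam ]; then
    samtools view -b -F "$MASK" in.bam > out.filtered.bam
fi
```

A read is dropped if any one of the masked bits is set, so this is equivalent to chaining three separate -F filters.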
The bamtools package offers a wide range of filters, including user-definable filters defined in JSON notation. It includes filters for reads failing vendor QC, unmapped reads and pre-marked duplicates.
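As a rough sketch of what such a JSON filter might look like, assuming the bamtools filter property names isMapped, isDuplicate and isFailedQC and the -script option (check these against your installed version; the file names filter.json and out.filtered.bam are made up for illustration):

```shell
# Write a filter script that keeps mapped, non-duplicate reads that passed QC.
# Property names are assumed from the bamtools filter documentation.
cat > filter.json <<'EOF'
{
  "isMapped"    : "true",
  "isDuplicate" : "false",
  "isFailedQC"  : "false"
}
EOF

# Apply the filter only if bamtools and the input file are present.
# out.filtered.bam is a hypothetical output name.
if command -v bamtools >/dev/null 2>&1 && [ -f in.bam ]; then
    bamtools filter -in in.bam -out out.filtered.bam -script filter.json
fi
```

As with samtools, filtering on isDuplicate only works if duplicates were marked beforehand.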