I have a paired-end BAM file that has gone through numerous filtering steps (MAPQ, MarkDuplicates, Mitochondrial, Unmapped). I would like to extract all properly paired mapped reads into a BEDPE file, and all singly mapped reads (reads whose mate is missing or unmapped) into a BED file. I am sure the numerous filtering steps have created lots of orphans. What would be the correct commands to achieve this? I was thinking the following may work:
samtools view -bf 0x2 reads.bam | bedtools bamtobed -bedpe -i stdin > reads.paired.bedpe
samtools view -bf 8 -F 260 reads.bam | bedtools bamtobed -i stdin > reads.single.bed
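To check my understanding of the flag filters in those commands, here is a small sketch of the bit arithmetic as I understand it: `-f` requires every listed bit to be set and `-F` requires every listed bit to be unset. The function name `keeps` is just something I made up for illustration; it is not part of samtools.

```shell
#!/usr/bin/env bash
# keeps FLAG REQUIRED EXCLUDED  (hypothetical helper, not a samtools command)
# Prints "keep" if a read with SAM flag FLAG would pass
# `samtools view -f REQUIRED -F EXCLUDED`, otherwise "drop".
keeps() {
    local flag=$1 required=$2 excluded=$3
    # -f: all REQUIRED bits must be set; -F: no EXCLUDED bit may be set.
    if (( (flag & required) == required && (flag & excluded) == 0 )); then
        echo keep
    else
        echo drop
    fi
}

keeps 99 0x2 0    # 99 = paired + proper pair + mate reverse + first in pair
keeps 73 8 260    # 73 = paired + mate unmapped + first in pair
keeps 99 8 260    # proper pair, so the mate-unmapped bit (8) is not set
```

If this is right, the second command only catches reads whose mate was unmapped by the aligner. A read whose mate was removed by a later filter would presumably still carry 0x2 and not 0x8, so it would end up in the first command's output instead.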
When I run the first command on the query-name-sorted BAM file, bedtools warns me that it could not find reads next to each other with the same name. I'm assuming the filtering steps have removed one read of some pairs?
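One way I could test that assumption is to count how often each read name occurs. The sketch below is only my guess at a reasonable check: `orphan_names` is a hypothetical helper, and it assumes secondary and supplementary alignments have already been filtered out, since those would add extra records per name.

```shell
#!/usr/bin/env bash
# orphan_names (hypothetical helper): read SAM text on stdin and print every
# read name (column 1) that occurs exactly once, i.e. a read with no mate left.
orphan_names() {
    # Skip header lines (they start with "@"), count records per read name,
    # then report the names seen only once.
    awk -F'\t' '!/^@/ { n[$1]++ } END { for (q in n) if (n[q] == 1) print q }' | sort
}

# Toy input: readA still has both mates, readB has lost its mate.
printf 'readA\t99\tchr1\t100\nreadA\t147\tchr1\t300\nreadB\t99\tchr1\t500\n' | orphan_names
```

On the real data this would be run as `samtools view reads.bam | orphan_names`; any names it prints are reads that bedtools cannot pair up.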
I am confused: what exactly does the fixmate command do? What I'm trying to do is extend the reads based on where they have aligned within the genome. I thought I could extract the properly paired reads and the singles into BED files and manually extend them. I actually have the unfiltered BAM file produced by Bowtie2, but I am unsure whether I would be able to re-apply the filters after extension.
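To make "manually extend" concrete, this is roughly what I had in mind for the single reads: a strand-aware extension of each 6-column BED record to a fixed fragment length. The helper name `extend_bed` and the length of 200 are placeholders of mine, and the sketch does not clip at chromosome ends.

```shell
#!/usr/bin/env bash
# extend_bed FRAG_LEN (hypothetical helper): read 6-column BED on stdin and
# extend each record from its 5' end to FRAG_LEN bases.
extend_bed() {
    local frag_len=$1
    awk -v L="$frag_len" 'BEGIN { FS = OFS = "\t" }
    # Plus strand: keep the start, move the end downstream.
    $6 == "+" { $3 = $2 + L }
    # Minus strand: keep the end, move the start upstream, never below 0.
    $6 == "-" { $2 = $3 - L; if ($2 < 0) $2 = 0 }
    { print }'
}

# Toy input: one plus-strand and one minus-strand read.
printf 'chr1\t100\t150\tread1\t30\t+\nchr1\t50\t100\tread2\t30\t-\n' | extend_bed 200
```

On the real data this would be something like `extend_bed 200 < reads.single.bed`, with the length replaced by the estimated fragment size for the library.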