Hi all,
I have a BAM file of an alignment, and I need to remove the reads that are fully contained within certain intervals, provided as a GFF file (could be converted to BED).
I know bedtools intersect has the -v option and supports BAM files; however, running the command like this produces the result that is clearly wrong (I just looked at it in the IGV):
bedtools intersect -v -f 1.0 -a old.bam -b int.bed > new.bam
Can someone explain how does -f option work here, and what would be a good way to achieve what I'm trying to do?
I can, of course, just do a direct overlap with -wao style, BED-formatted output, then get the list of read names which overlap the intervals by exactly a read length, and then remove the reads from the BAM file by their name list. But this does seem cumbersome.
Thank you, as always.
On paper, the command you used looks correct to me.
-v -f 1means that you select againstold.bamreads fully mapped onint.bedintervals.What exactly is wrong in IGV ?
Basically wrong reads are removed. You can see which ones overlapped the intervals, and that's not what was removed at all.
Does not answer your question but
samtools viewhas--output-unselectedoption that might be of use http://www.htslib.org/doc/samtools-view.htmlMaybe you can try pybedtools.