Dear Members,
Is there a way I can remove reads associated with a region (chr, start, end) from a .bam file (RNA-Seq data) prior to running HTSeq?
I will greatly appreciate your feedback,
Noushin
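bedtools intersect -abam file.bam -b filter.bed -v > filtered.bam
The -v option keeps only the reads that do not overlap any region in filter.bed.
filter.bed should contain one tab-separated line per region to remove (BED starts are 0-based):
chr start end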
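In case it helps, here is a minimal sketch of building filter.bed from a single region and then running the command above. The region (chr1, 1001 to 2000) is made up for illustration, and file.bam is assumed to be your input. It assumes your coordinates are 1-based and inclusive, so the start is shifted by one to match BED.

```shell
# Hypothetical region to remove, given as 1-based inclusive coordinates.
chrom="chr1"
start=1001
end=2000

# BED is tab-separated with a 0-based start and an exclusive end,
# so only the start is shifted down by one.
printf '%s\t%d\t%d\n' "$chrom" "$((start - 1))" "$end" > filter.bed
cat filter.bed

# Run the filter only if bedtools and the (assumed) input file.bam are present.
if command -v bedtools >/dev/null 2>&1 && [ -f file.bam ]; then
    # -v keeps only the reads that do NOT overlap any region in filter.bed.
    bedtools intersect -abam file.bam -b filter.bed -v > filtered.bam
fi
```

If your coordinates are already in BED convention, skip the subtraction and write them as they are.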
You'll want to use NGSUtils' bamutils tool, specifically with -excludebed.
But I'd recommend you don't :P
The BAM format is meant for storing highly compressed alignment data. You should treat BAM files like raw, untouched data, without normalization/filtering tweaks here and there to get them into shape.
All that kind of intersection stuff should be done on processed signal data (wigs, bedGraphs, etc.), where it's much easier to keep multiple versions of things and to just dump it all and start afresh from the .bam if you have to.
Having said that, it's your data, do what you like with it :)
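Just found that there is a -U option in samtools view. It writes the reads that are not selected by the filters to a separate file, so the reads outside your regions end up in the -U output. Use it like this:
samtools view -b -h -L Regions.bed -o output_inRegions.bam -U output_outRegions.bam input.bam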
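A rough sketch of wrapping that call with a sanity check, in case it is useful. The function name split_by_regions and the output names in_regions.bam and out_regions.bam are made up for illustration. It relies on the two outputs together holding every record of the input, which is how I understand -U to behave, so treat the final check as an assumption to verify on your own data.

```shell
# Hypothetical helper: split a BAM into reads inside and outside BED regions.
split_by_regions() {
    bam="$1"
    bed="$2"

    # Refuse to run without both input files.
    if [ ! -f "$bam" ] || [ ! -f "$bed" ]; then
        echo "usage: split_by_regions in.bam regions.bed" >&2
        return 2
    fi
    if ! command -v samtools >/dev/null 2>&1; then
        echo "samtools not found on PATH" >&2
        return 3
    fi

    # Reads overlapping the regions go to -o, all remaining reads go to -U.
    samtools view -b -h -L "$bed" -o in_regions.bam -U out_regions.bam "$bam" || return 1

    # Count records in each file; the two outputs should add up to the input.
    total=$(samtools view -c "$bam")
    inside=$(samtools view -c in_regions.bam)
    outside=$(samtools view -c out_regions.bam)
    echo "total=$total inside=$inside outside=$outside"
    [ "$total" -eq "$((inside + outside))" ]
}
```

Called as split_by_regions input.bam Regions.bed, the file to pass on to HTSeq would be out_regions.bam.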
You could also use RSeQC, a QC package for RNA-seq: http://rseqc.sourceforge.net
Its split_bam.py script would do the splitting of BAM files.