Question: Remove read counts overlapping with blacklist regions in BAM files
armandorp wrote (3.0 years ago):

Hi all,

I am preprocessing some BAM files for differential binding analysis. I have read that some people recommend removing blacklist regions (human) before peak calling. Do you have any suggestions on how to directly extract the read counts in these regions, maybe using samtools?

Thanks, Armando

samtools chip-seq • 3.6k views

Use intersectBed with -v param

http://bedtools.readthedocs.org/en/latest/content/tools/intersect.html

Reply written 3.0 years ago by Sukhdeep Singh

Devon Ryan (Freiburg, Germany) wrote (3.0 years ago):

Normally you don't need to modify or do anything with the alignments. After calling peaks, exclude regions that overlap a blacklisted region (use bedtools).
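
For anyone wondering what that post-peak-calling exclusion amounts to, here is a minimal Python sketch of the same idea as `bedtools intersect -v`: keep a peak only if it shares no base with a blacklisted interval. The function names and toy coordinates are made up for illustration, it assumes BED-style half-open intervals, and on real data bedtools is the better choice.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals [start, end) share at least one base."""
    return a_start < b_end and b_start < a_end


def drop_blacklisted(peaks, blacklist):
    """Return the peaks that overlap no blacklisted interval.

    peaks and blacklist are lists of tuples whose first three fields are
    (chrom, start, end), as in a BED file. Extra peak fields are kept.
    """
    # Group blacklisted intervals by chromosome so each peak is only
    # compared against intervals on its own chromosome.
    by_chrom = {}
    for chrom, start, end in blacklist:
        by_chrom.setdefault(chrom, []).append((start, end))

    kept = []
    for peak in peaks:
        chrom, start, end = peak[:3]
        hits = by_chrom.get(chrom, [])
        if not any(overlaps(start, end, s, e) for s, e in hits):
            kept.append(peak)
    return kept


# Hypothetical toy data: the second peak runs into the blacklisted interval.
peaks = [("chr1", 100, 200), ("chr1", 950, 1100), ("chr2", 50, 80)]
blacklist = [("chr1", 1000, 2000)]
kept = drop_blacklisted(peaks, blacklist)
```

Here `kept` holds the first and third peaks only. A peak that merely touches the blacklisted interval (ending exactly where it starts) is kept, because half-open intervals that are book-ended share no base.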

Relatedly, deepTools has an option to ignore reads/signal in blacklisted regions as of version 2.2 I think. This won't do what you need in this particular case, but for other ChIP-seq related things this is useful.


Thanks Devon. Do you mean to use deepTools before calling peaks to get a kind of marked BAM file for these regions? So you don't recommend removing blacklisted regions from the alignments, even though the reads/high signal associated with blacklisted regions can affect peak calling with MACS2?

Reply written 3.0 years ago by armandorp

I mean what I wrote. Don't bother excluding reads before peak calling (yes, you can do this with bamutils or bedtools if you want to wait a while), just exclude the peaks afterward. You might get poorer power in the direct vicinity of blacklisted regions, but that'll be a minor effect given the time savings.

Reply written 3.0 years ago by Devon Ryan

Devon, you are not answering the question: can the reads/high signal associated with blacklisted regions affect peak calling with MACS2?

Also, I tried removing the blacklisted regions using bedops, so I had to convert the BAM files to BED. After filtering, I am unable to convert the files back to BAM (I am using bedtools for the file conversions). Can anyone please suggest a way to filter directly from the BAM files? bedops does not take BAM files, and the blacklisted regions are in BED format.

Reply written 2.5 years ago by DataFanatic

Blacklisted regions don't normally do much except cause false-positive peaks (inside the blacklisted regions). This isn't always the case, but if you have nice large peaks versus input then it will be. If you have really weak enrichment (for whatever reason) then you might gain something from stripping out reads in blacklisted regions (maybe I should write a little tool for that so it can be multithreaded).

Reply written 2.5 years ago (modified 2.5 years ago) by Devon Ryan

Use bedtools:

bedtools intersect -v -abam FILE.BAM -b BLACKLIST.BED > FILTERED.BAM

Reply written 2.5 years ago by harold.smith.tarheel

paul.e.gradie (Australia/Melbourne/University of Melbourne) wrote (2.6 years ago):

Good suggestions.

Keep in mind - when you perform a cross-strand correlation analysis to assess peak quality in your data, the analysis generally uses the aligned reads or BED-converted intervals from those reads. If you do not exclude blacklisted reads beforehand, these regions are included in the analysis, which may skew the results one way or the other.
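
To make the possible skew concrete, here is a toy Python sketch of a strand cross-correlation with and without blacklisted reads. It is not the method of any particular tool: the function names, the single-chromosome simplification, and all coordinates are assumptions for illustration, and reads are reduced to their 5' start positions.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def cross_correlation(fwd_starts, rev_starts, chrom_len, shifts, blacklist=()):
    """Correlate forward-strand counts at x with reverse-strand counts at x + d.

    blacklist is a sequence of half-open (start, end) intervals on this
    chromosome; read starts falling inside any of them are skipped.
    Positions must lie in [0, chrom_len) and shifts in [0, chrom_len).
    """
    def blacklisted(pos):
        return any(start <= pos < end for start, end in blacklist)

    fwd = [0] * chrom_len
    rev = [0] * chrom_len
    for pos in fwd_starts:
        if not blacklisted(pos):
            fwd[pos] += 1
    for pos in rev_starts:
        if not blacklisted(pos):
            rev[pos] += 1
    return {d: pearson(fwd[:chrom_len - d], rev[d:]) for d in shifts}


# Hypothetical data: four real binding sites with a 50 bp strand offset,
# plus a pile-up of 50 reads per strand at one position in a blacklisted region.
fwd_starts = [100, 300, 500, 700] + [850] * 50
rev_starts = [150, 350, 550, 750] + [850] * 50
BLACKLIST = [(800, 900)]
shifts = range(0, 101, 10)

cc_raw = cross_correlation(fwd_starts, rev_starts, 1000, shifts)
cc_filtered = cross_correlation(fwd_starts, rev_starts, 1000, shifts, BLACKLIST)
best_raw = max(cc_raw, key=cc_raw.get)
best_filtered = max(cc_filtered, key=cc_filtered.get)
```

In this deliberately extreme example the pile-up dominates the unfiltered profile, so `best_raw` is 0, while `best_filtered` recovers the 50 bp offset of the real sites. How strong the effect is on real data depends on how much signal sits in the blacklisted regions.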

Devon is right - the time savings will be nice to call peaks without this step and peak calling shouldn't really be affected much by these regions, depending on which peak caller you decide to use.

Just have a look at a few of the blacklisted regions after your alignment to see what kind of signal you're getting there, and then choose whether to use filtered alignment files or not for whichever downstream analyses (following Sukhdeep Singh's suggestion).
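
As a rough way to put numbers on that inspection, the following Python sketch counts how many reads overlap each blacklisted region and what fraction of all reads is affected. It assumes the reads have already been converted to BED-like (chrom, start, end) tuples; the function name and the toy coordinates are hypothetical, and the nested loop is only sensible for small inputs.

```python
def blacklist_read_summary(reads, blacklist):
    """Count reads per blacklisted region and the fraction of reads hit.

    reads and blacklist are lists of (chrom, start, end) tuples with
    half-open coordinates, as in BED.
    """
    per_region = {region: 0 for region in blacklist}
    flagged = 0
    for chrom, start, end in reads:
        hit = False
        for region in blacklist:
            b_chrom, b_start, b_end = region
            if chrom == b_chrom and start < b_end and b_start < end:
                per_region[region] += 1
                hit = True
        # Count each read once, even if it overlaps several regions.
        flagged += hit
    fraction = flagged / len(reads) if reads else 0.0
    return per_region, fraction


# Hypothetical toy data.
reads = [
    ("chr1", 10, 60),
    ("chr1", 1010, 1060),
    ("chr1", 1990, 2040),
    ("chr2", 1500, 1550),
]
blacklist = [("chr1", 1000, 2000), ("chr2", 0, 100)]
per_region, fraction = blacklist_read_summary(reads, blacklist)
```

If the fraction is tiny and the peaks are strong, filtering the alignments probably changes little; a large fraction is a hint that filtered files may be worth the extra time.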

Cheers Paul


Good point regarding cross-correlation with blacklisted regions included. Someone in our group mentioned maybe adding a cross-correlation tool to deepTools. That would properly ignore blacklisted regions (though we'd have to think about edge effects). Maybe I'll write something like that.

Reply written 2.6 years ago by Devon Ryan