Question: Remove Blacklist regions from ChIP-seq Data
3.8 years ago by
Ulduz20
Istanbul
Ulduz20 wrote:

Hi,

To get rid of false-positive peaks, I want to filter out blacklisted regions after alignment with Bowtie. I've searched, and there are three different BED files of blacklist regions. Which of them is the best choice for TF ChIP-seq data?

http://www.broadinstitute.org/~anshul/projects/encode/rawdata/blacklists/hg19-blacklist-README.pdf

 

And do you filter them out with intersect from Bedtools?

chip-seq • 3.9k views
modified 3.8 years ago by Alex Reynolds25k • written 3.8 years ago by Ulduz20

Were you able to solve this problem? I need to do the same and would appreciate your feedback.

written 2.2 years ago by DataFanatic100

Were you able to filter the blacklisted regions with intersect from Bedtools? To use bedops, you need to convert the BAM file to BED, filter it, and then convert it back to BAM. I was able to do the filtering, but I could not convert the BED file back to BAM with the bedtools bedtobam option. Any feedback would be greatly appreciated. Thanks!

modified 2.0 years ago • written 2.0 years ago by DataFanatic100
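
On the BAM round trip raised in the comment above: a minimal sketch, assuming bedtools is installed, of filtering alignments directly so that no BED conversion is needed. With BAM input, bedtools intersect writes BAM output unless -bed is given, and -v keeps only records with no overlap in the -b file. The function name filter_bam_blacklist and its argument order are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: drop alignments that overlap any blacklisted region.
# Usage: filter_bam_blacklist <in.bam> <blacklist.bed> <out.bam>
filter_bam_blacklist() {
    if [ "$#" -ne 3 ]; then
        echo "usage: filter_bam_blacklist <in.bam> <blacklist.bed> <out.bam>" >&2
        return 2
    fi
    if ! command -v bedtools >/dev/null 2>&1; then
        echo "bedtools not found on PATH" >&2
        return 127
    fi
    # -v reports only entries of -a that have no overlap in -b.
    # BAM in, BAM out (older bedtools releases need -abam instead of -a).
    bedtools intersect -v -a "$1" -b "$2" > "$3"
}
```

It would be called as, for example, filter_bam_blacklist aligned.bam blacklist_regions.bed aligned.filtered.bam, where the file names are placeholders.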
3.8 years ago by
Alex Reynolds25k
Seattle, WA USA
Alex Reynolds25k wrote:

Once you have picked your blacklists, one way to filter is with the bedops -n (--not-element-of) operation from BEDOPS:

$ bedops -n -1 all_peaks.bed blacklist_regions.bed > filtered_peaks.bed

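The -1 option sets the overlap threshold to one base, so a peak is filtered out if it overlaps a blacklisted region at all. Note that bedops expects both input files to be sorted, for example with the sort-bed tool that ships with BEDOPS.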
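
To make the "any overlap" rule concrete, here is a toy sketch that applies the same exclusion rule in plain awk. It is a stand-in for illustration, not bedops itself and not how bedops works internally: it scans every blacklist entry for every peak, which is fine for a handful of lines but slow on real data. The file names follow the command above; the coordinates and peak names are made up.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Toy peaks in BED format (0-based start, end-exclusive), tab-separated.
printf 'chr1\t100\t200\tpeak1\nchr1\t300\t400\tpeak2\nchr2\t50\t150\tpeak3\n' \
    > "$workdir/all_peaks.bed"
# Toy blacklist: the first entry shares one base with peak1,
# the second only abuts peak3.
printf 'chr1\t199\t250\nchr2\t150\t300\n' > "$workdir/blacklist_regions.bed"

# First file (blacklist) is loaded into arrays; each peak from the second
# file is printed only if it overlaps no blacklist entry by >= 1 base.
awk 'BEGIN { FS = OFS = "\t"; n = 0 }
     NR == FNR { chrom[n] = $1; bstart[n] = $2 + 0; bend[n] = $3 + 0; n++; next }
     {
         hit = 0
         for (i = 0; i < n; i++)
             if ($1 == chrom[i] && $2 + 0 < bend[i] && bstart[i] < $3 + 0) {
                 hit = 1
                 break
             }
         if (!hit) print
     }' "$workdir/blacklist_regions.bed" "$workdir/all_peaks.bed" \
    > "$workdir/filtered_peaks.bed"

cat "$workdir/filtered_peaks.bed"
```

Here peak1 is dropped because it shares a single base (position 199) with a blacklisted interval, while peak3 is kept because it only touches the neighbouring interval without overlapping it.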

written 3.8 years ago by Alex Reynolds25k

I have an input sample with high Kmer content, and no peaks are found in my IP samples (3 replicates) when I use this input. On these same IP samples I do find peaks when using a baseline treatment input, and also when I call peaks without any input at all. My input sample has the following Kmer Content:

Sequence  Count  PValue        Obs/Exp Max  Max Obs/Exp Position
AGGGGGG   13010  0.0           15.102037    1
TCGCGTA   345    1.1974935E-6  11.862445    37
TGGGGGG   25385  0.0           11.1068325   1
CGCCTGA   4925   0.0           9.191745     21
TATGCCG   2045   0.0           8.6882925    42
CGCGTAT   450    2.8213963E-4  8.33926      38
TCTCCCG   7310   0.0           7.170219     16
CGTATGC   2500   0.0           6.691238     40
CTCGTAT   2440   0.0           6.431545     38
ACACGTC   4290   0.0           6.4261694    9
TGCCGTC   3070   0.0           6.3503175    44
CGTCTGA   4360   0.0           6.3229723    12
TCCCGCC   9155   0.0           5.985545     18
CATCGCG   970    2.6135147E-5  5.974066     35
CTCCCGC   9090   0.0           5.728751     17
TCTCGTA   2575   0.0           5.695125     37
GCCGTCT   3315   0.0           5.2648993    45
CCGCCTG   10365  0.0           5.1227307    20
ATGCCGT   3405   0.0           5.120038     43
ACGTCTG   6060   0.0           5.0546627    11

modified 2.1 years ago • written 2.2 years ago by DataFanatic100