Filling the Gaps in the Chip-seq data
2
0
Entering edit mode
7.8 years ago

Hello all,

I have a chip-seq data and I realized that there are some gaps. It looks like its going with span 25 but in some parts there is no score available. Why is that?

If I want to fill those gaps with -let's say- 0, how can I do that? ()

This is the sample from the data

chr21   9411475 9411500 1
chr21   9411500 9411525 1
chr21   9411525 9411550 0.96
chr21   9413025 9413050 0.64
chr21   9413050 9413075 1
chr21   9413075 9413100 1
chr21   9413100 9413125 1
chr21   9413125 9413150 1
chr21   9413150 9413175 1
chr21   9413175 9413200 1
chr21   9413200 9413225 1.36
chr21   9413225 9413250 1.36
chr21   9413250 9413275 1
chr21   9413275 9413300 1
chr21   9413300 9413325 1
chr21   9413325 9413350 1
chr21   9413350 9413375 2.44
chr21   9413375 9413400 3
chr21   9413400 9413425 2.64
chr21   9413425 9413450 2
chr21   9413450 9413475 2
chr21   9413475 9413500 2
chr21   9413500 9413525 2
chr21   9413525 9413550 2
chr21   9413550 9413575 0.56    <- from here it skips 400 bp
chr21   9413975 9414000 1.32
chr21   9414000 9414025 2.8

Thanks...

ChIP-Seq Bash • 1.9k views
ADD COMMENT
3
Entering edit mode
7.8 years ago

I think bedtools complement comes in handy here to extract the gaps. Pipe the gaps to awk to add 0 as coverage, then merge sort the peaks and gaps in a single file (not tested):

bedtools complement -i peaks.bed -g chromSize.genome \
| awk '{print $0 "\t0"}' > gaps.bed

sort -k1,1 -k2,2n peaks.bed gaps.bed > peaks_gaps.bed && 
rm gaps.bed
ADD COMMENT
0
Entering edit mode
7.8 years ago

The gaps probably come from regions with zero coverage, but its hard to tell with certitude without knowing where your data comes from.

I don't know an easy way to fill the gaps, although writing a little script to do that should not be too hard. [EDIT : see dariober's answer :) ]

An alternative could be, if you have access to the original alignment files, to use bedtools genomecov with '-bga' option to rebuild the bedgraph file with the zeros.

ADD COMMENT

Login before adding your answer.

Traffic: 1954 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6