bedops returning NaN occupancies
4.3 years ago

I have a bed file that is in the following format:

chr21   31772184        31772284        U30     0       +
chr7    144623510       144623610       U34     0       -
chr4    39124140        39124240        U39     0       -
chr5    96409745        96409845        U32     0       -


It is a genome-wide nucleosome occupancy file for ESCs, aligned to the hg19 genome. When I run the following command on it:

bedops --merge SRR1781834.bed | bedops --chop 10000 - | bedmap --echo --mean --delim '\t' - SRR1781834.bed > SRR1781834_windows_10000.bed


I get many NaNs...

chr21   31772184        31772284        0.000000
chr7    144623510       144623610       0.000000
chr4    39124140        39124240        NAN
chr5    96409745        96409845        NAN
chr9    133998815       133998915       0.000000


I have run the same command on many other BED files of the same type and got no NaN values.

Can anyone explain why this might be happening?

ChIP-Seq bedops NaN • 1.2k views
4.3 years ago

At the very least, your example elements are not sorted. Sort your BED file(s) with sort-bed before using them with bedops and bedmap. You only need to sort a file once, because the downstream BEDOPS tools read and write sorted data.
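As a rough sketch of the sorting step, the snippet below sorts the four example lines from the question. The file names example.bed and example.sorted.bed are placeholders, and the fallback to plain sort is my assumption for machines without BEDOPS installed; it should give roughly the same ordering (chromosome lexicographically, then start, then end).

```shell
# Small unsorted example taken from the question (tab-delimited).
printf 'chr21\t31772184\t31772284\tU30\t0\t+\n'  >  example.bed
printf 'chr7\t144623510\t144623610\tU34\t0\t-\n' >> example.bed
printf 'chr4\t39124140\t39124240\tU39\t0\t-\n'   >> example.bed
printf 'chr5\t96409745\t96409845\tU32\t0\t-\n'   >> example.bed

# Sort once with sort-bed if BEDOPS is available; otherwise fall back to
# a plain sort with a similar ordering (assumption, not the BEDOPS tool).
if command -v sort-bed >/dev/null 2>&1; then
    sort-bed example.bed > example.sorted.bed
else
    LC_ALL=C sort -k1,1 -k2,2n -k3,3n example.bed > example.sorted.bed
fi

cat example.sorted.bed
```

The real input would be sorted the same way, and the sorted file would then be used in both places where SRR1781834.bed appears in the original command.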

You may still get NaNs where there are no overlaps between the reference and map files (probably not in this specific case, since you are mapping onto merged regions, which pretty much guarantees overlaps), but in general you can fix this by adding the --skip-unmapped option to bedmap.
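For illustration, here is the pipeline from the question with --skip-unmapped added, wrapped in a small shell function. The function name windowed_mean and the sorted file name in the example call are hypothetical; the function assumes its input has already been sorted with sort-bed.

```shell
# Hypothetical helper: mean score per 10 kb window, printing nothing for
# windows without overlapping elements. Expects an already sorted BED file.
windowed_mean() {
    sorted_bed="$1"
    bedops --merge "$sorted_bed" \
        | bedops --chop 10000 - \
        | bedmap --echo --mean --delim '\t' --skip-unmapped - "$sorted_bed"
}

# Example call (requires BEDOPS; the input file name is a placeholder):
# windowed_mean SRR1781834.sorted.bed > SRR1781834_windows_10000.bed
```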

If you don't want to skip unmapped elements, you can pipe the result to sed to replace NaN values with some other value (such as 0), but whether that is appropriate depends on your signal.
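A minimal sketch of the sed replacement, using two rows from the output in the question. The file names windows.bed and windows.filled.bed are placeholders, and replacing with 0 is only an example value. The pattern matches NAN (in any letter case) only when it is the whole last column, so names containing those letters are left alone.

```shell
# Two rows copied from the output shown in the question (tab-delimited).
printf 'chr4\t39124140\t39124240\tNAN\n'        >  windows.bed
printf 'chr9\t133998815\t133998915\t0.000000\n' >> windows.bed

# A literal tab, so the expression does not rely on sed understanding '\t'.
tab=$(printf '\t')

# Replace a trailing NAN/NaN field with 0; all other fields are untouched.
sed "s/${tab}[Nn][Aa][Nn]\$/${tab}0/" windows.bed > windows.filled.bed

cat windows.filled.bed
```

In a real run the same sed expression could sit at the end of the bedmap pipeline instead of reading from a file.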