Question

Merging Peaks from macs2

0

5.6 years ago

dimitrischat ▴ 210

After running macs2, you get output files such as .xls. I have 6 .xls files (0h, 3h and 6h, each with 2 replicates) and I want to merge them so I can use them (together with the BAM files) in CSAW for differential binding analysis of ChIP-seq peak sets. My workflow is as follows:

I open the .xls file, delete the first rows (up to the line chr,start,end,length,pileup,-log10(pvalue),fold_enrichment,-log10(qvalue),name), save it as .csv and then rename it to .bed. After that I use this command (which I found here: merge chipseq peaks with bedtools/other tool):

awk '{print $0"\t","nm6k27_rep2"NR}' nm6k27_rep2_macs2.bed > nm6k27_rep2_new.bed

In that way, every peak has a name next to it that shows which file it came from.
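The manual Excel steps above could also be done on the command line. The following is a minimal sketch, not the poster's actual method: the file name `toy_peaks.xls`, the tag `toy_rep1` and the toy rows are assumptions made for illustration. It assumes the MACS2 .xls file is tab-delimited text with `#` comment lines, a blank line and a header line starting with `chr`, and that its start coordinates are 1-based (the length column in the rows shown in this thread is consistent with that), so 1 is subtracted to get BED-style 0-based starts.

```shell
# Toy stand-in for a MACS2 peaks .xls file (hypothetical name and content).
printf '# toy comment line\n\nchr\tstart\tend\tlength\tpileup\t-log10(pvalue)\tfold_enrichment\t-log10(qvalue)\tname\n' > toy_peaks.xls
printf 'chr1\t713706\t714672\t967\t10.18\t13.67773\t8.27315\t11.37243\tmacs2hisat2_peak_1\n' >> toy_peaks.xls
printf 'chr1\t824823\t825324\t502\t3.73\t4.20165\t3.50376\t2.17421\tmacs2hisat2_peak_2\n' >> toy_peaks.xls

# Skip comment lines, blank lines and the header, then write a 4-column BED:
# chromosome, 0-based start, end, and a name prefixed with the sample tag.
awk -v tag="toy_rep1" 'BEGIN { FS = OFS = "\t" }
     /^#/ || NF == 0 || $1 == "chr" { next }
     { print $1, $2 - 1, $3, tag "_" $NF }' toy_peaks.xls > toy_rep1.bed

cat toy_rep1.bed
```

Because the name ends up in column 4, a later `-c 4 -o collapse` would report which sample each merged region came from.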

Then I try to merge them using this:

cat nm0k27_rep1_macs2.bed nm0k27_rep2_macs2.bed \
    nm3k27_rep1_macs2.bed nm3k27_rep2_macs2.bed \
    nm6k27_rep1_macs2.bed nm6k27_rep2_macs2.bed | \
    sort -k1,1 -k2,2n | mergeBed -i stdin \
    > locations.bed

or

cat nm0k27_rep1_macs2.bed nm0k27_rep2_macs2.bed \
    nm3k27_rep1_macs2.bed nm3k27_rep2_macs2.bed \
    nm6k27_rep1_macs2.bed nm6k27_rep2_macs2.bed | \
    sort -k1,1 -k2,2n | mergeBed -i stdin -o collapse -c 4

but both result in an error:

ERROR: file stdin has non positional records, which are only valid for the groupBy tool.

Any suggestions?
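One possible cause of this error, offered as an assumption because the failing files are not shown, is that some lines are not tab-delimited records with integer start and end columns (for example a leftover header line, or comma-separated lines from the .csv export). A small sketch for spotting such lines follows; the file names `toy_check.bed` and `toy_report.txt` and their content are hypothetical.

```shell
# Toy file with a leftover header (line 1) and a comma-separated line (line 3).
printf 'chr\tstart\tend\tname\n' > toy_check.bed
printf 'chr1\t713705\t714672\tpeak_1\n' >> toy_check.bed
printf 'chr1,824822,825324,peak_2\n' >> toy_check.bed
printf 'chr1\t824822\t825324\tpeak_2\n' >> toy_check.bed

# Report every line that lacks three tab-separated fields
# or whose start/end columns are not plain integers.
awk 'BEGIN { FS = "\t" }
     NF < 3 || $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ {
         printf "line %d is not a positional record: %s\n", NR, $0
         bad++
     }
     END { printf "%d suspicious line(s)\n", bad }' toy_check.bed > toy_report.txt

cat toy_report.txt
```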

ChIP-Seq • 6.0k views

ADD COMMENT • link updated 5.6 years ago by Ram 43k • written 5.6 years ago by dimitrischat ▴ 210

1

Hello dimitrischat,

Please take care to format your post in a way that makes it easier to read, e.g. by using paragraphs.

Please also use the formatting bar (especially the code option) to present your post better. I've done it for you this time.

Furthermore, it would be helpful if you showed us the first lines of your csv and bed files.

Thank you!

fin swimmer

ADD REPLY • link 5.6 years ago by finswimmer 16k

0

Sorry, will do!

chr1    713706  714672  967 10.18   13.67773    8.27315 11.37243    macs2hisat2_peak_1   nm0k27_rep12
chr1    824823  825324  502 3.73    4.20165 3.50376 2.17421 macs2hisat2_peak_2   nm0k27_rep13

I have now used bedtools sort as @Devon Ryan said and then used what @ATpoint said. Now I have a merged.bed file with:

chr1    9968    10562
chr1    51536   51712
chr1    603212  604780
chr1    710221  711759
chr1    712948  714845
chr1    726856  727105

Is this correct now, so that I can use it in CSAW?

Thanks!

Edit: Personal email address removed by mod

ADD REPLY • link updated 5.6 years ago by finswimmer 16k • written 5.6 years ago by dimitrischat ▴ 210

0

Though you can use called peaks with CSAW, they kind of recommend that you don't. It's possible to just feed it the BAM files and still get results, considering it's a window-based method anyway.

ADD REPLY • link 5.6 years ago by jared.andrews07 ★ 16k

0

They kind of recommend that you don't? What do they recommend?

ADD REPLY • link 5.6 years ago by dimitrischat ▴ 210

1

Please refer to the manual. It is outstandingly extensive, and given that you are probably rather new to the ChIP-seq field, it provides a lot of knowledge that one should have before diving into the analysis. I strongly recommend spending some quality time on it ;-)

ADD REPLY • link 5.6 years ago by ATpoint 81k

1

I recommend reading their user guide. It provides a good breakdown of how the package works and should help you run it properly.

If you really want to use a consensus peakset, you will probably want to use DiffBind instead.

edit: @ninja'd by @ATpoint.

ADD REPLY • link 5.6 years ago by jared.andrews07 ★ 16k

0

CSAW works great with consensus peak sets and is far less prone to breaking than DiffBind.

ADD REPLY • link 5.6 years ago by Devon Ryan 104k

0

Huh, I had many more issues with csaw than DiffBind, but to each their own.

ADD REPLY • link 5.6 years ago by jared.andrews07 ★ 16k

1

First rule of bioinformatics, never use Excel things ;-D There should be narrowPeak files from the peak calling. With these, simply do:

cat *narrowPeak | cut -f1-3 | sort -k1,1 -k2,2n | bedtools merge -i - > merged.bed
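To illustrate what the merge step in the command above does, here is a small sketch that runs without bedtools installed. It swaps `bedtools merge` for a plain awk loop that approximates its default behaviour (overlapping and book-ended intervals on the same chromosome are combined). The file names and coordinates are toy assumptions; for real data, the bedtools command above is the one to use.

```shell
# Two toy peak files (hypothetical names and coordinates), tab-delimited.
printf 'chr1\t100\t200\nchr1\t500\t600\nchr2\t50\t80\n' > toy_a.bed
printf 'chr1\t150\t250\nchr1\t600\t700\nchr2\t300\t400\n' > toy_b.bed

# Concatenate, keep the three coordinate columns, sort by chromosome and start,
# then extend the current interval while the next one starts at or before its end.
cat toy_a.bed toy_b.bed | cut -f1-3 | sort -k1,1 -k2,2n |
awk 'BEGIN { FS = OFS = "\t" }
     NR == 1 { chr = $1; s = $2 + 0; e = $3 + 0; next }
     $1 == chr && $2 + 0 <= e { if ($3 + 0 > e) e = $3 + 0; next }
     { print chr, s, e; chr = $1; s = $2 + 0; e = $3 + 0 }
     END { if (NR) print chr, s, e }' > toy_merged.bed

cat toy_merged.bed
# chr1  100  250
# chr1  500  700
# chr2  50   80
# chr2  300  400
```

The six input peaks collapse into four regions: the two overlapping chr1 peaks and the two book-ended chr1 peaks are each joined, while the chr2 peaks stay separate.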

ADD REPLY • link 5.6 years ago by ATpoint 81k

0

There are broadPeak and gappedPeak files along with the Excel thingy.. :D Which one should I use?

ADD REPLY • link 5.6 years ago by dimitrischat ▴ 210

1

I just noticed that you aim to use csaw. The idea of this package is to omit peak calling categorically and use a sliding-window approach instead. Is there a reason you do it anyway?

EDIT: @jared.andrews07 just asked the same thing.

ADD REPLY • link 5.6 years ago by ATpoint 81k

score 1 · Answer 1 · 2018-09-06

1

5.6 years ago

Devon Ryan 104k

You're looking for bedtools sort.

ADD COMMENT • link 5.6 years ago by Devon Ryan 104k