Question: Merging Peaks from macs2
0
gravatar for dimitrischat
8 months ago by
dimitrischat60
dimitrischat60 wrote:

After macs2, you get files such as .xls. I have 6 .xls files (0h,3h,6h each of them have 2 replicates) and i want to merge them so i can use them (and the bam files) into CSAW for Differential Binding Analysis of ChIP-seq peaksets. So my workflow is like this:

I open the xls file, delete first rows ( until chr,start,end,length,pileup,-log10(pvalue),fold_enrichment,-log10(qvalue),name) and Save it as .csv then rename it as .bed. After that I use this command (which I found here: merge chipseq peaks with bedtools/other tool ) :

awk '{print $0"\t","nm6k27_rep2"NR}' nm6k27_rep2_macs2.bed > nm6k27_rep2_new.bed

in that was every peak has next to it a name which shows where it came from.

Then i am trying to merge them by using this :

cat nm0k27_rep1_macs2.bed nm0k27_rep2_macs2.bed \
    nm3k27_rep1_macs2.bed nm3k27_rep2_macs2.bed \
    nm6k27_rep1_macs2.bed nm6k27_rep2_macs2.bed | \
    sort -k1,1 -k2,2n | mergeBed -i stdin \
    > locations.bed

or

cat nm0k27_rep1_macs2.bed nm0k27_rep2_macs2.bed \
    nm3k27_rep1_macs2.bed nm3k27_rep2_macs2.bed \
    nm6k27_rep1_macs2.bed nm6k27_rep2_macs2.bed | \
    sort -k1,1 -k2,2n | mergeBed -i stdin -o collapse -c 4

but both result an error:

ERROR: file stdin has non positional records, which are only valid for the groupBy tool.

Any suggestions?

chip-seq • 575 views
ADD COMMENTlink modified 8 months ago by RamRS21k • written 8 months ago by dimitrischat60
1

Hello dimitrischat,

please take care of formating your post in a way that makes it more easily, e.g. by using paragraphs.

Please use also the formatting bar (especially the code option) to present your post better. I've done it for you this time.
code_formatting

Furthermore it would be helpful if you show us the first lines of your csv and bed file.

Thank you!

fin swimmer

ADD REPLYlink modified 8 months ago • written 8 months ago by finswimmer11k

Sorry..will do!

chr1    713706  714672  967 10.18   13.67773    8.27315 11.37243    macs2hisat2_peak_1   nm0k27_rep12
chr1    824823  825324  502 3.73    4.20165 3.50376 2.17421 macs2hisat2_peak_2   nm0k27_rep13

i now used the bedtools sort as @Devon Ryan said and then used what @ATpoint said. Now i got a merged.bed file with :

chr1    9968    10562
chr1    51536   51712
chr1    603212  604780
chr1    710221  711759
chr1    712948  714845
chr1    726856  727105

is this correct now ? For me to use it in CSAW?

thanks!

Edit: Personal email address removed by mod

ADD REPLYlink modified 8 months ago by finswimmer11k • written 8 months ago by dimitrischat60

Though you can use called peaks with CSAW, they kind of recommend that you don't. It's possible to just feed it the BAM files and still get results, considering it's a window-based method anyway.

ADD REPLYlink written 8 months ago by jared.andrews072.3k

They kind of recommend that you dont? What do they reccomend?

ADD REPLYlink written 8 months ago by dimitrischat60
1

Please refer to the manual. It is outstandingly extensive, and given that you are probably rather new in the ChIP-seq field, it provides a lot of knowledge that one should have prior to diving into the analysis. I strongly recommend to spend some quality time on it ;-)

ADD REPLYlink written 8 months ago by ATpoint16k
1

I recommend reading their user guide, it provides a good breakdown of how the package works and should help you run it properly.

If you really want to use a consensus peakset, you will probably want to use DiffBind instead.

edit: @ninja'd by @ATpoint.

ADD REPLYlink modified 8 months ago • written 8 months ago by jared.andrews072.3k

CSAW works great with consensus peak sets and is far less prone to breaking than DiffBind.

ADD REPLYlink written 8 months ago by Devon Ryan90k

Huh, I had many more issues with csaw than DiffBind, but to each their own.

ADD REPLYlink written 8 months ago by jared.andrews072.3k
1

First rule of bioinformatics, never use Excel things ;-D There should be narrowPeak files from the peak calling. With these, simply do:

cat *narrowPeak | cut -f1-3 | sort -k1,1 -k2,2n | bedtools merge -i - > merged.bed
ADD REPLYlink written 8 months ago by ATpoint16k

there are broadPeak and gappedPeak along with the excel thingy.. :D which one should i use?

ADD REPLYlink written 8 months ago by dimitrischat60
1

I just noticed that you aim to use csaw. The idea of this package is to omit peak calling categorically, and use the slidign window approach. Is there a reason you do it anyway?

EDIT: @jared.andrews07 just asked the same thing.

ADD REPLYlink modified 8 months ago • written 8 months ago by ATpoint16k
1
gravatar for Devon Ryan
8 months ago by
Devon Ryan90k
Freiburg, Germany
Devon Ryan90k wrote:

You're looking for bedtools sort.

ADD COMMENTlink written 8 months ago by Devon Ryan90k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 886 users visited in the last hour