Question: UMI with MACS
9 months ago by Richard, Canada
Richard wrote:

Hi folks.

We have some ChIP data with UMIs in the reads. We are planning to use either umi_tools or Picard to mark duplicates in a UMI-aware fashion.

I see in the docs that MACS will mark duplicates in its own way: https://github.com/taoliu/MACS/wiki/Advanced%3A-Call-peaks-using-MACS2-subcommands

Ideally we could keep all the sequenced reads in the BAM files and have MACS use the duplicate status as it was set by our duplicate marking tool.

Has anyone tried this sort of thing? Or perhaps my understanding of how MACS works is incorrect?

chip-seq umi macs • 413 views

Maybe this will work if I just use the "callpeak" function.

written 9 months ago by Richard

Probably not: even though callpeak is a separate command, its documentation still says that MACS2 will make its own decision about which reads are duplicates.

written 9 months ago by Richard

I would produce BAM or BED files with only the reads that should be considered for peak calling and use --keep-dup=all to make MACS use only and exactly those reads.
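If the duplicates have already been marked in the BAM (for example by Picard), one way to get a file containing "only the reads that should be considered" is to drop every read carrying the duplicate flag (0x400, i.e. 1024). A minimal sketch that only builds the samtools command; the file names are hypothetical:

```python
import shlex

def filter_duplicates_cmd(in_bam, out_bam):
    """Build a samtools command that excludes reads with the duplicate flag set.

    -F 1024 skips any read whose FLAG has bit 0x400 (duplicate) set.
    """
    return ["samtools", "view", "-b", "-F", "1024", "-o", out_bam, in_bam]

# Hypothetical file names, for illustration only.
cmd = filter_duplicates_cmd("chip.marked.bam", "chip.nodup.bam")
print(shlex.join(cmd))

# To actually run it (requires samtools on PATH):
# import subprocess
# subprocess.run(cmd, check=True)
```

The resulting BAM would then be the input to MACS with --keep-dup=all.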

written 9 months ago by ATpoint

9 months ago by i.sudbery, Sheffield, UK
i.sudbery wrote:

My understanding is that MACS doesn't mark duplicates itself, but rather filters out reads that have been marked as duplicates by a separate tool (although I may be wrong).

Further, I don't think that MACS has any way to deal with UMI deduplication. That is, if you had four reads at the same place on the genome, with two reads each for two separate UMIs, I'm entirely unclear what MACS would do with that, but I think it's quite likely that it would filter out all but one of them.
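To make the four-read case concrete, here is a toy sketch (not how either tool is implemented) comparing a position-only filter that keeps one read per position with a filter that also looks at the UMI. The reads are invented, and the UMI comparison is an exact match, whereas umi_tools can also account for sequencing errors in the UMI:

```python
from collections import defaultdict

# Toy reads: (chromosome, start, strand, UMI). All values are invented.
reads = [
    ("chr1", 1000, "+", "AACG"),
    ("chr1", 1000, "+", "AACG"),
    ("chr1", 1000, "+", "TTGC"),
    ("chr1", 1000, "+", "TTGC"),
]

def dedup_by_position(reads, keep=1):
    """Keep at most `keep` reads per (chrom, start, strand), ignoring UMIs."""
    seen = defaultdict(int)
    kept = []
    for chrom, start, strand, umi in reads:
        key = (chrom, start, strand)
        if seen[key] < keep:
            seen[key] += 1
            kept.append((chrom, start, strand, umi))
    return kept

def dedup_by_position_and_umi(reads):
    """Keep one read per (chrom, start, strand, UMI), exact UMI match only."""
    seen = set()
    kept = []
    for read in reads:
        if read not in seen:
            seen.add(read)
            kept.append(read)
    return kept

position_only = dedup_by_position(reads)
umi_aware = dedup_by_position_and_umi(reads)
print(len(position_only), len(umi_aware))  # prints "1 2"
```

The position-only filter collapses the four reads to one, while the UMI-aware filter keeps one read per UMI, i.e. two.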

Finally, UMI-tools unfortunately doesn't have an option to mark duplicates. It can only deduplicate, or tag each read with the "group" it comes from.

Thus, I would recommend deduplicating your reads with umi_tools dedup and then feeding these into MACS with --keep-dup=all as recommended by @ATpoint.
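A rough sketch of that two-step workflow, building the command lines in Python. The file names and the peak-set name are hypothetical, and it assumes single-end reads in a coordinate-sorted, indexed BAM with the UMI already in the read name (the umi_tools default); paired-end data and the genome size option would need extra arguments not shown here:

```python
import shlex

def umi_dedup_cmd(in_bam, out_bam):
    """Build a umi_tools dedup command (-I input BAM, -S output BAM)."""
    return ["umi_tools", "dedup", "-I", in_bam, "-S", out_bam]

def macs2_callpeak_cmd(treatment_bam, control_bam, name):
    """Build a macs2 callpeak command that keeps every read it is given."""
    return [
        "macs2", "callpeak",
        "-t", treatment_bam,
        "-c", control_bam,
        "-f", "BAM",
        "-n", name,
        "--keep-dup", "all",  # do not let MACS remove anything further
    ]

# Hypothetical file names, for illustration only.
dedup = umi_dedup_cmd("chip.sorted.bam", "chip.dedup.bam")
control_dedup = umi_dedup_cmd("input.sorted.bam", "input.dedup.bam")
peaks = macs2_callpeak_cmd("chip.dedup.bam", "input.dedup.bam", "chip_umi")

for cmd in (dedup, control_dedup, peaks):
    print(shlex.join(cmd))

# To actually run them (requires umi_tools and macs2 on PATH):
# import subprocess
# for cmd in (dedup, control_dedup, peaks):
#     subprocess.run(cmd, check=True)
```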
