Question: Irreproducible Discovery Rate
Dataminer (Netherlands), 6.4 years ago, wrote:

Hi!

I am computing the irreproducible discovery rate (IDR; see https://sites.google.com/site/anshulkundaje/projects/idr ) for two Pol II ChIP-seq profiles from the same cell line.

My peaks were called with MACS2, and the peaks in both ChIP-seq profiles are 500-520 bases wide.

I am using the following command to compute IDR:

Rscript batch-consistency-analysis2.r ./encodepeaks_1 ./encodepeaks_2 -1 ./Analysis 0 F p.value

If you have used this IDR package, you will know that it produces a file listing the overlapping genomic coordinates from the two ChIP-seq profiles, with an IDR value in the last column (the lower the IDR, the better the peak).

My problem is that a few of the genomic intervals in the resulting file are far too wide, on the order of 20041463 bp. This is very abnormal, because none of the peaks in either of the original input profiles (broadpeak1 and broadpeak2) is that broad.
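A quick way to list the affected rows is to compute each interval's width and flag anything far beyond the expected peak size. The sketch below is only an illustration: the function name `wide_intervals`, the 1000 bp threshold, and the example coordinates are assumptions, and the rows would have to be parsed from the actual IDR output file first.

```python
def wide_intervals(rows, max_width=1000):
    """Return (chrom, start, end, width) for every interval wider than max_width.

    rows is an iterable of (chrom, start, end) tuples. max_width is an
    assumed cutoff; pick one that matches your expected peak width.
    """
    flagged = []
    for chrom, start, end in rows:
        width = int(end) - int(start)
        if width > max_width:
            flagged.append((chrom, int(start), int(end), width))
    return flagged


# Hypothetical intervals: one normal ~500 bp peak and one multi-Mb interval.
rows = [
    ("chr1", 1000, 1510),
    ("chr2", 5000000, 25041463),
]

for hit in wide_intervals(rows):
    print(hit)
```

Comparing the flagged coordinates against the two input peak files shows whether the start and end of a wide interval come from different input peaks.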

Have you ever faced such a problem while using IDR? If so, how did you solve it? I cannot discard these peaks because they fall on my target gene, but they are just too broad.

Please share any advice, suggestions, or solutions.

Thank you

modified 6.4 years ago • written 6.4 years ago by Dataminer2.6k

The old version of IDR (1.x) sometimes mixed and matched starts/ends from input MACS2 peaks, resulting in these multi-Mb monsters. In addition, it regularly paired peaks that were on different chromosomes!!
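Both failure modes can be detected by checking each paired record directly. The following is a minimal sketch, assuming each output row has already been parsed into the chromosome, start, and end of the peak from each replicate, and that coordinates are half-open as in BED; the function name `check_pair` and this record layout are hypothetical, not part of the IDR package.

```python
def check_pair(chrom1, start1, end1, chrom2, start2, end2):
    """Return a list of problems found in one paired-peak record.

    Coordinates are assumed to be half-open (BED-style): [start, end).
    An empty list means the two peaks share a chromosome and overlap.
    """
    problems = []
    if chrom1 != chrom2:
        problems.append("different chromosomes")
    elif max(start1, start2) >= min(end1, end2):
        # Same chromosome, but the intervals do not intersect.
        problems.append("no overlap")
    return problems


# Hypothetical records: a good pair, a cross-chromosome pair,
# and a pair whose members are megabases apart.
print(check_pair("chr1", 100, 600, "chr1", 300, 800))
print(check_pair("chr1", 100, 600, "chr2", 100, 600))
print(check_pair("chr1", 100, 600, "chr1", 20041563, 20042063))
```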

Be sure to use the new IDR (2.x), available on GitHub. It has none of the problems of the old version, except that the output fold-change value must be divided by 2 (it seems to add the replicate fold-changes instead of averaging them??).
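If the reported value really is the sum over replicates, as suspected above, the per-peak average is just the reported value divided by the number of replicates. A tiny sketch, where `mean_fold_change` and the example value are hypothetical and the "sum" behaviour is an assumption taken from this post rather than from the IDR documentation:

```python
def mean_fold_change(reported_signal, n_replicates=2):
    """Convert a summed fold-change into a per-replicate average.

    Assumes reported_signal is the sum of the replicate fold-changes.
    """
    return reported_signal / n_replicates


# Hypothetical reported value of 12.0 for two replicates.
print(mean_fold_change(12.0))
```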

modified 2.2 years ago • written 2.2 years ago by apa@stowers400

This version seems far easier to use, but the old IDR is not restricted to only two replicates.

written 2.1 years ago by boczniak767630

Dataminer (Netherlands), 6.4 years ago, wrote:

Edited the command:

Rscript batch-consistency-analysis2.r ./broadpeaks_1 ./broadpeaks_2 -1 ./Analysis 0 F p.value

This was one of the things that needed to be fixed. Also, be careful about which human genome table you use.

modified 6.4 years ago • written 6.4 years ago by Dataminer2.6k

Be sure to mark your question as answered; answering your own question is allowed and encouraged.

written 6.4 years ago by Sean Davis25k

The site won't allow me to mark my answer as accepted :(

written 6.4 years ago by Dataminer2.6k

I guess that also falls under the "voting on own post" prohibition, since it also nets reputation points.

written 6.4 years ago by Istvan Albert ♦♦ 79k

Oops--sorry about the misinformation. I think this could be an exception?

written 6.4 years ago by Sean Davis25k