Question: How To Deal With Redundant Read For Chip-Exo?
2.2 years ago, Hanfei Sun (Boston) wrote:

The main idea of this technology is introduced in this paper:

Comprehensive Genome-wide Protein-DNA Interactions Detected at Single-Nucleotide Resolution

It seems to give better results than ChIP-seq.

However, there are far fewer positions that a tag can map to, which leads to a lot of redundant reads.

(zoomed out)

(zoomed in)

I don't think only a single read should be kept when redundant ones exist. However, using a fixed maximum (for example, 5) also seems arbitrary; a cutoff derived from a distribution model would be more reasonable. I have seen several threads about how to deal with redundant reads, but none of them resolved it. Does anyone have any ideas?
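To make the two options above concrete, here is a minimal Python sketch. It assumes reads are already reduced to `(chrom, strand, five_prime_pos)` tuples; the function names `cap_duplicates` and `poisson_cap` are hypothetical, and the Poisson background model is only one possible "distribution model", not something established in this thread.

```python
import math
from collections import defaultdict

def cap_duplicates(reads, max_dup=5):
    """Keep at most max_dup reads per (chrom, strand, 5' position)."""
    seen = defaultdict(int)
    kept = []
    for read in reads:
        if seen[read] < max_dup:
            seen[read] += 1
            kept.append(read)
    return kept

def poisson_cap(lam, p_cutoff=1e-5):
    """Smallest k with P(X > k) < p_cutoff for X ~ Poisson(lam).

    lam is the assumed mean number of reads per position under a
    uniform background (e.g. total reads / mappable positions).
    """
    k = 0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    while 1.0 - cdf >= p_cutoff:
        k += 1
        term *= lam / k    # P(X = k) from P(X = k - 1)
        cdf += term
    return k

# Toy data: 8 identical reads at one position, 2 at another.
reads = [("chr1", "+", 100)] * 8 + [("chr1", "-", 130)] * 2
kept = cap_duplicates(reads, max_dup=5)
print(len(kept))  # 5 reads kept at the first position, 2 at the second

# Cap derived from an assumed background rate instead of a fixed number.
print(poisson_cap(0.01))
```

A background-derived cap is less arbitrary than a fixed one, but it still assumes that pile-ups beyond the cap are artifacts, which may not hold when reads are expected to concentrate at a few positions.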


@samuthing: Are you in a position to suggest modifications to the protocol? In short, you can add a degenerate barcode to the reads in order to combat PCR artifacts/amplicons. I can elaborate in an answer if it would be helpful.
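A rough Python sketch of how a degenerate barcode could help, under the assumption that each fragment receives a random barcode before amplification: PCR copies then share both position and barcode, while independent fragments at the same position usually carry different barcodes. The tuple layout and the name `collapse_by_barcode` are hypothetical.

```python
def collapse_by_barcode(reads):
    """Keep one read per (chrom, strand, 5' position, barcode)."""
    seen = set()
    unique = []
    for read in reads:
        if read not in seen:
            seen.add(read)
            unique.append(read)
    return unique

reads = [
    ("chr1", "+", 100, "ACGT"),
    ("chr1", "+", 100, "ACGT"),  # same position and barcode: likely a PCR copy
    ("chr1", "+", 100, "TTAG"),  # same position, new barcode: independent fragment
    ("chr1", "+", 100, "GGCA"),
]
unique = collapse_by_barcode(reads)
print(len(unique))  # 3 reads survive instead of 1 with position-only deduplication
```

This exact-match version ignores sequencing errors within the barcode and chance barcode collisions, both of which a real pipeline would need to handle.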

written 2.2 years ago by Steve Lianoglou

@Steve Lianoglou: It's not about the protocol. It's about peak calling: when I use MACS, I find it may not be well suited to ChIP-exo data. I work in computational biology and don't quite understand why adding a barcode can combat PCR artifacts. Could you explain a little more?

written 2.2 years ago by Hanfei Sun
2.2 years ago, Istvan Albert (University Park, USA) wrote:

The only reason to remove duplicates in other experiments is that the likelihood of producing identical reads naturally is very low when compared to the rate of PCR artifacts. But this does not mean that one should always automatically remove duplicates.

The method that you cite allows for a far more accurate identification of the binding sites; therefore, most of the duplicates will be natural ones that should not be removed, as they indicate occupancy levels.

2.1 years ago, Ying W (Los Angeles) wrote:

You might want to take a look at this paper:

"We describe a general approach for utilizing reads that map to multiple locations on the reference genome (multi-reads). Our approach is based on allocating multi-reads as fractional counts using a weighted alignment scheme"
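Note that the quoted approach concerns multi-reads (reads that align to several locations) rather than duplicate reads at one location. As a rough illustration of fractional allocation only, the Python sketch below splits each multi-read across its candidate loci in proportion to a per-alignment weight. The weights, the locus labels, and the name `allocate_fractional` are hypothetical; the paper's actual weighting scheme is not reproduced here.

```python
from collections import defaultdict

def allocate_fractional(multi_reads):
    """Distribute each read over its candidate loci as fractional counts.

    multi_reads: list of reads, each a list of (locus, weight) pairs.
    Every read contributes a total count of exactly 1.
    """
    counts = defaultdict(float)
    for alignments in multi_reads:
        total = sum(weight for _, weight in alignments)
        if total <= 0:
            continue  # skip reads with no usable weight
        for locus, weight in alignments:
            counts[locus] += weight / total
    return dict(counts)

multi_reads = [
    [("chr1:100", 3.0), ("chr5:2000", 1.0)],  # read aligning to two loci
    [("chr1:100", 1.0)],                      # uniquely aligned read
]
counts = allocate_fractional(multi_reads)
print(counts)  # chr1:100 receives 1.75, chr5:2000 receives 0.25
```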
