Question: Small percentage of overlapping chip-seq peaks
atsalaki10 wrote:

I downloaded ENCODE ChIP-seq peaks for FOXA2 (a TF) in the HepG2 cell line. I also found this paper, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2794179/#!po=21.4286, and took the FOXA2 ChIP-seq data for HepG2 that it provides. I used a Venn diagram to find the overlapping peaks and, to my surprise, there were far fewer overlapping peaks than I expected. Could this happen? I expected one circle to sit mostly inside the other. [screenshot of the Venn diagram]

modified 2.6 years ago by igor7.6k • written 2.6 years ago by atsalaki10

What are these numbers that you use to overlap the peaks? Is it from one chromosome only? What distance does Venny allow to count a peak as overlapping?

written 2.6 years ago by Ido Tamir5.0k

All the chromosomes. No distance is computed; Venny just takes the unique entries from the two lists and compares them to see how they match.

written 2.6 years ago by atsalaki10

But what are those numbers? What do they represent? Where do they come from? Peaks should be identified by genomic coordinates.

modified 2.6 years ago • written 2.6 years ago by igor7.6k

They used hg18, did you?

written 2.6 years ago by Devon Ryan89k

All the chromosomes. No distance is computed; Venny just takes the unique entries from the two lists and compares them to see how they match.

written 2.6 years ago by atsalaki10

In other words, the results are completely meaningless.

written 2.6 years ago by Devon Ryan89k

dariober10.0k (WCIP | Glasgow | UK) wrote:

I don't know about this TF and cell line, but a 10% overlap [438 / (1297 + 438 + 2476)] doesn't really surprise me. It's small but not unusual, especially since the data come from different labs (right?).
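To make that arithmetic explicit, here is a minimal Python sketch using the three counts from the Venn diagram. Which circle belongs to which dataset is not stated in the thread, so the names `only_a` and `only_b` are placeholders. The sketch also shows the shared fraction relative to each set on its own, which can look quite different from the fraction of the union.

```python
# Counts read off the Venn diagram; which set is ENCODE and which is
# the paper is an assumption left open here (hence the a/b names).
only_a = 1297   # peaks unique to one list
shared = 438    # entries present in both lists
only_b = 2476   # peaks unique to the other list

# Fraction of the union that is shared (the ~10% quoted above).
union_fraction = shared / (only_a + shared + only_b)

# Fraction of each individual list that is shared.
fraction_of_a = shared / (only_a + shared)
fraction_of_b = shared / (only_b + shared)

print(f"shared / union: {union_fraction:.3f}")
print(f"shared / set A: {fraction_of_a:.3f}")
print(f"shared / set B: {fraction_of_b:.3f}")
```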

Also, looking at overlap by number of peaks can be misleading since you give equal weight to all peaks, even those that are at the boundary of significance and might be gained or lost depending on sequencing depth, peak calling sensitivity etc.

(I'm still looking for a good way to assess consistency of peaks between two or preferably more replicates)

written 2.6 years ago by dariober10.0k

Isn't this what the IDR package was made for? https://www.encodeproject.org/software/idr/

written 2.6 years ago by fanli.gcb660

Did you find a good way to find consistency between replicates?

modified 2.3 years ago • written 2.3 years ago by mforoo10

igor7.6k (United States) wrote:

According to the screenshot, you are using Venny. Venny is used for overlapping discrete values. This is good for things like genes that have specific names. Peaks are usually identified by genomic coordinates and they span regions of different size. If you overlap ranges with Venny, you will not get an overlap unless the regions are identical. If the two peaks are off by even 1 base, Venny will not consider them overlapping.
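To illustrate the difference, here is a minimal Python sketch with made-up toy peaks (the coordinates, the function names, and the `min_overlap` parameter are all hypothetical, not taken from either dataset). It contrasts an exact-match comparison, which is how a list-based tool like Venny behaves, with a comparison of genomic intervals. Coordinates are assumed to be BED-style half-open intervals on the same genome assembly.

```python
from collections import defaultdict

# Hypothetical toy peaks as (chrom, start, end), BED-style half-open intervals.
peaks_a = [("chr1", 100, 200), ("chr1", 500, 650), ("chr2", 1000, 1200)]
peaks_b = [("chr1", 101, 201), ("chr1", 640, 700), ("chr2", 5000, 5100)]

def exact_matches(a, b):
    """List-style comparison: a peak counts only if the entry is identical."""
    return len(set(a) & set(b))

def overlapping(a, b, min_overlap=1):
    """Count peaks in `a` sharing at least `min_overlap` bases with a peak in `b`."""
    by_chrom = defaultdict(list)
    for chrom, start, end in b:
        by_chrom[chrom].append((start, end))
    count = 0
    for chrom, start, end in a:
        for other_start, other_end in by_chrom[chrom]:
            # Number of shared bases; zero or negative means no overlap.
            shared = min(end, other_end) - max(start, other_start)
            if shared >= min_overlap:
                count += 1
                break  # count each peak in `a` at most once
    return count

print(exact_matches(peaks_a, peaks_b))  # 0: no entry is identical
print(overlapping(peaks_a, peaks_b))    # 2: two peaks in A overlap a peak in B
```

The first toy peak is shifted by a single base between the two lists, so the exact comparison misses it while the interval comparison catches it. For real data, a dedicated interval tool such as `bedtools intersect` is probably a better choice than this nested loop, and both peak sets need to be on the same assembly (the paper used hg18) before comparing.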

modified 2.6 years ago • written 2.6 years ago by igor7.6k