Question: Finding common peaks between F-seq peak region files
a.rex wrote (11 months ago):

I have 3 biological ATAC-seq conditions, each with two replicates.

I have obtained F-seq peak region files for each of the six samples.

These are the peak number metrics:

sample 1 condition1  = 260388
sample 2 condition1  = 259940
sample 1 condition2 = 292697
sample 2 condition2 = 290048
sample 1 condition3 = 284690
sample 2 condition3 = 303684

Is there an easy way to compare the similarities and differences between these called peak regions? For example, to produce a Venn diagram of common and non-common regions? Obviously there may also be pairs of regions that are not identical but overlap. What do you do in this case?

Alex Reynolds (Seattle, WA USA) wrote (11 months ago):

If your peak files are in BED format, you can use BEDOPS to do set operations to count overlaps between peaks:

$ bedops --not-element-of 100% A.bed B.bed > elements_unique_to_A.bed
$ bedops --not-element-of 100% B.bed A.bed > elements_unique_to_B.bed
$ bedops --everything A.bed B.bed | bedops --not-element-of 100% - <(bedops --everything elements_unique_to_A.bed elements_unique_to_B.bed) > elements_unique_to_A_and_B.bed
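
On the question of regions that overlap only partially: the overlap criterion given to --element-of and --not-element-of can be relaxed from 100% to a smaller percentage or to a number of bases. A minimal sketch, assuming bash and that the BEDOPS tools (bedops, sort-bed) are on your PATH. The function name and the output file names are made up for illustration, and A.bed and B.bed stand in for any two of your peak files. The inputs are passed through sort-bed first, because bedops expects sorted input.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split two peak files into A-only, B-only and shared
# regions, treating any overlap of at least `min_overlap` as "shared".
# min_overlap may be a base count (e.g. 1) or a percentage (e.g. 50%).
split_by_overlap() {
    local a="$1" b="$2" min_overlap="${3:-1}"
    # bedops expects input that has been sorted with sort-bed.
    sort-bed "$a" > A.sorted.bed
    sort-bed "$b" > B.sorted.bed
    # Elements of A that fail the overlap criterion against B, and vice versa.
    bedops --not-element-of "$min_overlap" A.sorted.bed B.sorted.bed > A_only.bed
    bedops --not-element-of "$min_overlap" B.sorted.bed A.sorted.bed > B_only.bed
    # Elements of A that meet the criterion (reported with A's coordinates).
    bedops --element-of "$min_overlap" A.sorted.bed B.sorted.bed > A_shared_with_B.bed
}
```

For example, `split_by_overlap A.bed B.bed 1` counts two peaks as shared if they overlap by at least one base, and `split_by_overlap A.bed B.bed 50%` requires half of the A peak to be covered. Which threshold is appropriate depends on your data.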

In a Venn diagram, these three files correspond to the three disjoint regions: elements found only in A, elements found only in B, and elements found in both (elements_unique_to_A_and_B.bed).

To count these subsets, use wc -l:

$ wc -l elements_unique_to_A.bed
1234 elements_unique_to_A.bed

Once you have overlap counts, you would put these counts into a Venn diagram or Eulergrid/UpSetR plot to visualize them.
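
To collect the counts in one place before plotting, something like the following could work. It is only a sketch, assuming bash; the function name tabulate_counts is made up, and the table layout (name, tab, count) is an arbitrary choice rather than a format any plotting tool requires.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "name<TAB>count" for each BED file given,
# where count is the number of lines (regions) in that file.
tabulate_counts() {
    local f n
    for f in "$@"; do
        # Reading from stdin makes wc print only the number, not the file name.
        n=$(wc -l < "$f")
        # $((n)) strips the padding some wc implementations add.
        printf '%s\t%d\n' "$(basename "$f" .bed)" "$((n))"
    done
}
```

Usage: `tabulate_counts elements_unique_to_A.bed elements_unique_to_B.bed elements_unique_to_A_and_B.bed > overlap_counts.tsv`.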

If you have more than two sets, you would calculate the overlaps for every non-empty combination of sets (the members of the "powerset"), and then count each one with wc -l.
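
To get a feel for how many combinations that is, here is a small sketch, assuming bash, that lists every non-empty combination of a list of set names. The function name and the "&" separator are made up for illustration. It only enumerates the labels; the overlap calculation for each combination still has to be done with bedops.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every non-empty combination of the given set
# names, one per line, joined with "&" (2^n - 1 lines for n names).
list_combinations() {
    local names=("$@")
    local n=${#names[@]}
    local mask i combo
    for ((mask = 1; mask < (1 << n); mask++)); do
        combo=""
        for ((i = 0; i < n; i++)); do
            # Bit i of the mask decides whether set i is in this combination.
            if (( (mask >> i) & 1 )); then
                combo+="${combo:+&}${names[i]}"
            fi
        done
        printf '%s\n' "$combo"
    done
}
```

For three sets, `list_combinations A B C` prints the seven labels A, B, A&B, C, A&C, B&C and A&B&C. For six samples there are 63 combinations, which is one reason a Venn diagram stops being practical.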

Also, if you have more than two sets, consider using an Eulergrid-style (UpSetR) plot instead of a Venn diagram, because a Venn diagram of more than two sets can lead to misinterpretation of the overlaps.

Eulergrid/UpSetR plots deal with this problem by showing overlaps as visually distinct, proportionally correct elements, and by offering ways to sort or organize those elements that highlight particular subset overlaps.
