Explanation of Venn diagram output
1
0
Entering edit mode
3.1 years ago
munaj86 ▴ 30

Hi,,

I need an explanation of a Venn diagram output. I have two peaks file resulted from MACS2, the first peak with 1318 peaks and the second one with 973 peaks. I converted the peaks files into Granges as follows:

bed_2 <- read.table("Concatenate_JARID2_peaks.narrowPeak_1.bed",header = TRUE, sep="\t",stringsAsFactors=FALSE, quote="")

peaks_2 <- toGRanges(bed_2, format="narrowPeak")

Then I make Venn diagram using the following command:

makeVennDiagram(list(peaks, peaks_2), NameOfPeaks=c("JARID2", "EZH2"),fill=c("#009E73", "#F0E442"), col=c("#D55E00", "#0072B2"),cat.col=c("#D55E00", "#0072B2"))

In the resulted Venn diagram, I got the total number for the first peak file as 682 and for the second one as 1023. The intersection peaks become as 289 so can anyone explain why I get more number in the second peak and less in the first one. Also, I would appreciate if anyone can check my commands and maybe share an alternative way to make Venn diagram for MACS2 peaks file.

Thanks..

MunaJ

general education • 1.8k views
ADD COMMENT
2
Entering edit mode
3.1 years ago
seidel 11k

You should list what libraries your functions are coming from. I think you're using the makeVennDiagram function from the ChIPpeakAnno library. If so, you might play with the "connectedPeaks" parameter. One thing that can be confusing about counting overlaps of ChIP Seq peaks is when a "peak" consists of two segments in one data set, but a single segment in the other. A way around this is to step back conceptually, and consider creating a common universe of genomic locations from your data sets. This way when you ask if a data set overlaps a given location, you ask the same question from both data sets. You can do this by reducing your data sets to a common universe (a reference set) and then ask what components of each of your data sets overlap that:

 library(GenomicRanges)

# give things meaningful names
jarid2 <- peaks
ezh2 <- peaks2

# create a reference set of genomic locations
refset <- reduce(c(jarid2, ezh2))
# give them unique names
names(refset) <- paste0("p", seq_along(refset))

# tally the locations that overlap with your peaks
jarid2.ol <- names(refset[countOverlaps(refset, jarid2) > 0])
ezh2.ol <- names(refset[countOverlaps(refset, exh2) > 0])

# make a Venn diagram
source("http://faculty.ucr.edu/~tgirke/Documents/R_BioCond/My_R_Scripts/vennDia.R")

qlist <- venndiagram(x=jarid2.ol, y=ezh2.ol, unique=T, title="JARID2 EZH2 overlap", labels=c("JARID2", "EZH2"), plot=TRUE, type=2)

# qlist is a list that contains the names for each Venn set
ADD COMMENT
0
Entering edit mode

Thanks for this, it works successfully. I've tried to add the other peaks file and try to draw Venn diagram for 4 peaks file but that gives me an error:

qlist <- venndiagram(x= list(JARID2.ol, EZH2.ol, H3K4me3.ol, H3K27me3.ol), unique=T, title="Peaks_overlap", labels=c("JARID2", "EZH2", "H3K27me3", "H3K4me3"), plot=TRUE, type=2) Error in h(simpleError(msg, call)) : error in evaluating the argument 'x' in selecting a method for function 'unique': promise already under evaluation: recursive default argument reference or earlier problems?

Any advice?

Thanks

ADD REPLY
0
Entering edit mode

Yes, the venndiagram function has a different syntax for more sets than what you're using. If you load the source URL directly into your browser window, you can see the documentation. It doesn't take a list. Rather you would use: x=JARID2.ol, y=EZH2.ol, z=H3K4me3.ol, w=H3K27me3.ol, type=4. Also, the function I pointed to is very old. You can see Thomas Girke pointing to a newer version.

ADD REPLY

Login before adding your answer.

Traffic: 3131 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6