Question: Extracting gene ID of overlapping peaks between different Chip-seqs
1
gravatar for dthaper
2.2 years ago by
dthaper10
dthaper10 wrote:

Hello,

I am conduced chip-seq for proteins that are part of a complex.

Peak1 - AR
Peak2 - EZH2
Peak3 - pEZH2

From the chip seeker analysis pipeline PMID:28416945, I was able to get to a point in the pipeline that spits out a venn diagram showing common peaks between the 3 chip-seqs. Venn Diagram

I wanted to extract the ID of the genes in the overlapping regions. This I what I have so far:

files <- list(peak1 = "AR.bed", peak2 = "EZH2.bed", peak3 = "pEZH2.bed")

peakAnnoList <- lapply(files, annotatePeak, TxDb=txdb,
                       tssRegion=c(-5000, 5000), verbose=FALSE)

genes = lapply(peakAnnoList, function(i) as.data.frame(i)$geneId)

vennplot(genes)

Is there another way to export out the "genes" data as a excel or csv? It presents as a "Large list" in the environment. I tried the following command and it didn't work as the lengths of the 3 lists isn't the same.

df4 <- data.frame(as.data.frame(genes))

write.table(df4, "overlap.xls", quote=FALSE, row.names=TRUE, sep="\t")

Thanks in advance for any help!

chip-seq R • 1.1k views
ADD COMMENTlink modified 2.2 years ago by zx87549.7k • written 2.2 years ago by dthaper10

No need to call data.frame twice, this should do: df4 <- data.frame(genes), please provide example output of head(df4) or str(df4). Just adding ".xls" doesn't make it Excel file, it will still output as text file. I am guessing write.csv(df4, "overlap.csv") should work.

ADD REPLYlink modified 2.2 years ago • written 2.2 years ago by zx87549.7k

Thanks for the correction on the double call! Unfortunately it gives the same error:

df4 <- data.frame(genes)

Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : arguments imply differing number of rows: 13472, 38622, 39500

I'm unable to assign df4 with the contents of "genes"

ADD REPLYlink written 2.2 years ago by dthaper10

I see, we are getting error because the length of sets are different. Maybe try this:

write.table(stack(genes), "overlap.txt")

Or to get list of overlapped genes, try:

Reduce(intersect, genes)

If these are not what you need, please provide expected output.

ADD REPLYlink modified 2.2 years ago • written 2.2 years ago by zx87549.7k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 2091 users visited in the last hour