Question: Extracting gene IDs of overlapping peaks between different ChIP-seqs
written 7 months ago by dthaper

Hello,

I have conducted ChIP-seq for proteins that are part of a complex.

Peak1 - AR
Peak2 - EZH2
Peak3 - pEZH2

Following the ChIPseeker analysis pipeline (PMID:28416945), I got to the point that produces a Venn diagram showing the common peaks between the 3 ChIP-seqs.

I want to extract the IDs of the genes in the overlapping regions. This is what I have so far:

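library(ChIPseeker)

# txdb is assumed to be a TxDb annotation object already loaded for the genome in use
files <- list(peak1 = "AR.bed", peak2 = "EZH2.bed", peak3 = "pEZH2.bed")

# Annotate the peaks in each file with genes
peakAnnoList <- lapply(files, annotatePeak, TxDb=txdb,
                       tssRegion=c(-5000, 5000), verbose=FALSE)

# Extract the annotated gene IDs for each ChIP-seq
genes <- lapply(peakAnnoList, function(i) as.data.frame(i)$geneId)

vennplot(genes)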

Is there a way to export the "genes" data as an Excel or CSV file? It shows up as a "Large list" in the environment. I tried the following command, but it didn't work because the 3 lists have different lengths.

df4 <- data.frame(as.data.frame(genes))

write.table(df4, "overlap.xls", quote=FALSE, row.names=TRUE, sep="\t")

Thanks in advance for any help!


No need to call data.frame twice; this should do: df4 <- data.frame(genes). Please provide example output of head(df4) or str(df4). Just adding ".xls" doesn't make it an Excel file; it will still be written as a text file. I am guessing write.csv(df4, "overlap.csv") should work.

written 7 months ago by zx8754

Thanks for the correction on the double call! Unfortunately it gives the same error:

df4 <- data.frame(genes)

Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : arguments imply differing number of rows: 13472, 38622, 39500

I'm unable to assign the contents of "genes" to df4.

written 7 months ago by dthaper

I see, we are getting the error because the sets have different lengths. Maybe try this:

write.table(stack(genes), "overlap.txt")
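
A minimal sketch of what stack() does here, using a small made-up list in place of the real "genes" object (the gene IDs, the renamed columns and the output file name are all invented for illustration). stack() turns a named list of vectors into a long two-column data frame, so the differing lengths stop being a problem:

```r
# Toy stand-in for `genes`; the IDs are made up for illustration only.
genes <- list(peak1 = c("367", "2146", "7157"),
              peak2 = c("2146", "7157", "672", "675"),
              peak3 = c("2146", "1956"))

# stack() returns a long data frame with columns `values` (the gene ID)
# and `ind` (the name of the list element the ID came from).
long <- stack(genes)
names(long) <- c("geneId", "sample")

# Write a comma-separated file; row.names = FALSE avoids an extra index column.
out_file <- file.path(tempdir(), "overlap_long.csv")
write.csv(long, out_file, row.names = FALSE)
```

The resulting file has one row per gene ID and ChIP-seq pair, which Excel opens directly and which can be filtered or pivoted by the "sample" column.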

Or, to get the list of overlapping genes, try:

Reduce(intersect, genes)
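
As a rough illustration with made-up gene IDs (not real results), Reduce(intersect, genes) gives the genes found in all three sets, that is, the centre of the Venn diagram. Other regions of the diagram can be sketched with intersect() and setdiff(). The variable and file names below are hypothetical:

```r
# Toy stand-in for `genes`; the IDs are made up for illustration only.
genes <- list(peak1 = c("367", "2146", "7157"),
              peak2 = c("2146", "7157", "672", "675"),
              peak3 = c("2146", "1956"))

# Genes present in all three sets (the centre of the Venn diagram).
common_all <- Reduce(intersect, genes)

# Genes shared by peak1 and peak2 but absent from peak3.
only_1_2 <- setdiff(intersect(genes$peak1, genes$peak2), genes$peak3)

# Save the common genes, one gene ID per line.
out_file <- file.path(tempdir(), "common_genes.txt")
writeLines(common_all, out_file)
```

intersect() and setdiff() treat their inputs as sets, so duplicated gene IDs (several peaks annotated to the same gene) are collapsed automatically.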

If these are not what you need, please provide expected output.
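
If the expected output is instead a wide table with one column per ChIP-seq, one possible approach (an assumption about what is wanted, not something confirmed in this thread) is to pad the shorter vectors with NA so that data.frame() no longer complains about differing numbers of rows. The sketch below uses made-up gene IDs and hypothetical variable names:

```r
# Toy stand-in for `genes`; the IDs are made up for illustration only.
genes <- list(peak1 = c("367", "2146", "7157", "2146"),
              peak2 = c("2146", "7157", "672", "675"),
              peak3 = c("2146", "1956"))

# Drop duplicated IDs (several peaks can be annotated to the same gene).
genes_unique <- lapply(genes, unique)

# Pad every vector with NA up to the length of the longest one.
n_max <- max(lengths(genes_unique))
padded <- lapply(genes_unique, function(x) {
  length(x) <- n_max  # extending a vector's length fills the tail with NA
  x
})

# All columns now have equal length, so data.frame() succeeds.
wide <- data.frame(padded, stringsAsFactors = FALSE)

# na = "" leaves the padded cells empty in the CSV file.
out_file <- file.path(tempdir(), "genes_wide.csv")
write.csv(wide, out_file, row.names = FALSE, na = "")
```

Note that the rows of this table are not aligned: a gene in row 1 of one column is unrelated to the gene in row 1 of another column, so the file is only a side-by-side listing of the three sets.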

written 7 months ago by zx8754