Extracting gene IDs of overlapping peaks between different ChIP-seq experiments
5.6 years ago
dthaper ▴ 10

Hello,

I have conducted ChIP-seq for proteins that are part of a complex.

Peak1 - AR
Peak2 - EZH2
Peak3 - pEZH2

Following the ChIPseeker analysis pipeline (PMID:28416945), I reached the step that produces a Venn diagram showing the common peaks between the 3 ChIP-seq datasets.

I want to extract the IDs of the genes in the overlapping regions. This is what I have so far:

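library(ChIPseeker)

# txdb is assumed to be a TxDb annotation object loaded earlier in the pipeline
files <- list(peak1 = "AR.bed", peak2 = "EZH2.bed", peak3 = "pEZH2.bed")

# Annotate each peak file with its nearest gene
peakAnnoList <- lapply(files, annotatePeak, TxDb=txdb,
                       tssRegion=c(-5000, 5000), verbose=FALSE)

# Pull the gene ID column out of each annotation
genes <- lapply(peakAnnoList, function(i) as.data.frame(i)$geneId)

vennplot(genes)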

Is there a way to export the "genes" data as an Excel or CSV file? It appears as a "Large list" in the environment. I tried the following commands, but they didn't work because the 3 lists have different lengths.

df4 <- data.frame(as.data.frame(genes))

write.table(df4, "overlap.xls", quote=FALSE, row.names=TRUE, sep="\t")

Thanks in advance for any help!

ChIP-Seq R • 2.0k views

There is no need to call data.frame twice; this should do: df4 <- data.frame(genes). Please provide example output of head(df4) or str(df4). Just adding ".xls" doesn't make it an Excel file; the output will still be a text file. I am guessing write.csv(df4, "overlap.csv") should work.


Thanks for the correction on the double call! Unfortunately it gives the same error:

df4 <- data.frame(genes)

Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : arguments imply differing number of rows: 13472, 38622, 39500

I'm unable to assign the contents of "genes" to df4.


I see, we get the error because the sets have different lengths. Maybe try this:

write.table(stack(genes), "overlap.txt")
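
As a minimal sketch of what stack() does here, using a made-up toy list in place of the real "genes" (the gene IDs and the output file name below are hypothetical): it turns a named list of unequal-length vectors into a two-column, long-format data frame, so the differing lengths stop being a problem.

```r
# Toy stand-in for the real `genes` list; the IDs are made up for illustration.
genes <- list(peak1 = c("1", "2", "3"),
              peak2 = c("2", "3", "4", "5"),
              peak3 = c("3", "5"))

# stack() returns a long data frame: `values` holds the gene ID,
# `ind` holds the name of the list element it came from.
long <- stack(genes)
str(long)

# Drop duplicate gene/peak-set pairs (one gene can be annotated to several peaks).
long <- unique(long)

# Write a plain CSV that Excel can open; the file name is arbitrary.
write.csv(long, "genes_long.csv", row.names = FALSE)
```

Each row is one gene/peak-set pair, so the table can be filtered or pivoted in Excel afterwards.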

Or, to get the list of genes common to all sets, try:

Reduce(intersect, genes)
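
A small sketch of how this could be extended to other regions of the Venn diagram and saved to a file, again on a made-up toy list rather than the real data (the IDs, variable names and file name are assumptions for illustration):

```r
# Toy stand-in for the real `genes` list; the IDs are made up for illustration.
genes <- list(peak1 = c("1", "2", "3", "3"),
              peak2 = c("2", "3", "4", "5"),
              peak3 = c("3", "5"))

# Genes present in all three sets (the centre of the Venn diagram).
common_all <- Reduce(intersect, genes)

# Genes shared by peak1 and peak2 but absent from peak3.
only_1_2 <- setdiff(intersect(genes$peak1, genes$peak2), genes$peak3)

# Regions differ in size, so write one file per region
# instead of forcing them into the columns of a single table.
write.csv(data.frame(geneId = common_all), "common_all.csv", row.names = FALSE)
```

intersect() and setdiff() return unique values, so repeated gene IDs within a set do not inflate the result.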

If these are not what you need, please provide expected output.
