ChIPseeker: how to read multiple peak files (compare multiple peak files)
7.5 years ago

Hi,

My question is a naive one. I am using ChIPseeker to compare multiple peak files. How do I read them all and store them in the "files" object?

files <- getSampleFiles()

Let's say I have two peak files:

/myFolder/peak1.bed

/myFolder/peak2.bed

How do I read these files and calculate a tagMatrix for each of my ChIP-seq samples? I also have the aligned reads in SAM, BAM and BED format. How can I get from the files I have to a tagMatrixList, that is, to this point:

tagMatrixList <- lapply(files, getTagMatrix, windows=promoter)

Could you please show me the first steps of the code?

Many thanks.

ChIP-Seq ChIPseeker • 9.9k views

Do these peak files come from different samples (cell lines or subjects), or are they from different TFs or histone marks?

Have you tried looking at the ChIPseeker vignette? It is quite informative.

If you give more detail on the two samples, I am happy to help with starter code.


@apnri: Many thanks. They are two different ChIPs: one is a histone mark, the other is a TF. My question is actually a single, very simple one: how do I read my own peak files from the hard drive (not the package's peak files) so that I can follow the example analysis in the vignette? The vignette is detailed enough. I just got confused in one place, where it says that "tagMatrix" is precomputed (to save time). I thought it might be reading an aligned BED file somewhere else, in addition to or instead of the peak BED file, to build the tag matrix. So maybe that is not the case and everything comes from the peak BED file alone.

So some starter code is needed. Would you show how to run "getTagMatrix" for all the peak files I have? That is, tagMatrixList <- getTagMatrix(peak, windows=promoter), but for all peaks. I am a little naive in R: peak <- readPeakFile(files) does not work for all peak files at once. Thanks.


@apnri: Same cell line but different TFs. Some starter code would be helpful, if you could provide it. Would you please show how to run "getTagMatrix" for all the peak files I have? That is, tagMatrixList <- getTagMatrix(peak, windows=promoter), but for all peaks. I am a little naive in R: peak <- readPeakFile(files) does not work for all peak files at once. Thanks.


I am not sure I follow what the final goal is when you say a tagMatrix for all peaks. You can extend the promoter region farther to get the tag binding profiles over larger regions. What is it that you want to view with these files?

Here is a starter for ChIPseeker, most of it taken from the vignette.

##load packages and get annotations
library(ChIPseeker)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene

histone.fn <- "histone.bed" # histone peak file name
tf.fn <- "tf.bed" # TF peak file name
histone <- readPeakFile(histone.fn)
tf <- readPeakFile(tf.fn)

# create tagmatrix
promoter <- getPromoters(TxDb=txdb, upstream=3000, downstream=3000)
histone.tagMatrix <- getTagMatrix(histone, windows=promoter)
tf.tagMatrix <- getTagMatrix(tf, windows=promoter)

You could also check out ngsplot, in case it is helpful: https://github.com/shenlab-sinai/ngsplot


Many Thanks. That is also interesting.


@apnri:

Thanks. I think I was not clear enough. The actual goal is to compare multiple peak files (a comparison of 7 ChIP peak data sets).

For that, one has to read all the files at once, not one by one. One by one I can do easily, but how do I do it all together? Like the one already shown by Guangchuang Yu:

files <- list(peak1 = "/myFolder/peak1.bed", peak2 = "/myFolder/peak2.bed")

Now, how do you make a single "tagMatrix" object for multiple TF peak files? For example, how do you get to the following point when you have two TF peak files (peak1.bed, peak2.bed)?

tagMatrixList <- lapply(files, getTagMatrix, windows=promoter)

plotAvgProf(tagMatrixList, xlim=c(-3000, 3000))

The vignette says that, to do the above, you can load the precomputed example data:

data("tagMatrixList")

I don't want to load the package's example data. I need my own files to be read in:

tagMatrixList <- lapply(files, getTagMatrix, windows=promoter)

But how do you do that with all the TF peak files that you have (peak1.bed, peak2.bed) at once?

Thanks again.

files <- list(peak1 = "/myFolder/peak1.bed", peak2 = "/myFolder/peak2.bed")

## this is your tagMatrixList, calculated from your input files (peak1.bed and peak2.bed)
tagMatrixList <- lapply(files, getTagMatrix, windows=promoter)

I can't see any problem here.
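
For anyone landing here later, the pieces of this thread can be put together roughly as below. This is only a sketch: `build_tag_matrices` is a hypothetical helper name (it is not part of ChIPseeker), the file check is an addition of mine, and the 3000 bp window is simply the value used in the vignette example. `getPromoters` and `getTagMatrix` are the ChIPseeker functions already used above.

```r
# Sketch: compute one tag matrix per peak file in a named list.
# `build_tag_matrices` is a hypothetical helper, not a ChIPseeker function.
build_tag_matrices <- function(files, txdb, upstream = 3000, downstream = 3000) {
  # Fail early with a readable message if any path is wrong.
  paths <- unlist(files)
  not_found <- paths[!file.exists(paths)]
  if (length(not_found) > 0) {
    stop("Peak file(s) not found: ", paste(not_found, collapse = ", "))
  }

  # One set of promoter windows, shared by all samples.
  promoter <- ChIPseeker::getPromoters(TxDb = txdb,
                                       upstream = upstream,
                                       downstream = downstream)

  # lapply() calls getTagMatrix() once per file and keeps the list names,
  # so the result is a named list with one tag matrix per sample.
  lapply(files, ChIPseeker::getTagMatrix, windows = promoter)
}

# Usage (needs ChIPseeker, the TxDb package, and files that really exist):
# library(TxDb.Hsapiens.UCSC.hg19.knownGene)
# files <- list(peak1 = "/myFolder/peak1.bed", peak2 = "/myFolder/peak2.bed")
# tagMatrixList <- build_tag_matrices(files, TxDb.Hsapiens.UCSC.hg19.knownGene)
# ChIPseeker::plotAvgProf(tagMatrixList, xlim = c(-3000, 3000))
```

The reason `readPeakFile(files)` fails while `lapply(files, ...)` works is that the ChIPseeker functions here take one peak file at a time, and `lapply()` is what applies them to each element of the list in turn.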


Hi, that works. Thanks


Hi, what can I use in place of the promoter in the following code (hg19) to get 5' UTRs, 3' UTRs, CDS, exons or introns instead?

promoter <- getPromoters(TxDb=txdb, upstream=3000, downstream=3000)

I mean, instead of "getPromoters", can I use something like "get5'UTR"?

Thanks in advance. Best regards, Jinesh.

7.5 years ago
Guangchuang Yu ★ 2.6k

If you print out the files, you will get:

> files <- getSampleFiles()
> files
$ARmo_0M
[1] "/Library/R/library/ChIPseeker/extdata/GEO_sample_data/GSM1174480_ARmo_0M_peaks.bed.gz"

$ARmo_1nM
[1] "/Library/R/library/ChIPseeker/extdata/GEO_sample_data/GSM1174481_ARmo_1nM_peaks.bed.gz"

$ARmo_100nM
[1] "/Library/R/library/ChIPseeker/extdata/GEO_sample_data/GSM1174482_ARmo_100nM_peaks.bed.gz"

$CBX6_BF
[1] "/Library/R/library/ChIPseeker/extdata/GEO_sample_data/GSM1295076_CBX6_BF_ChipSeq_mergedReps_peaks.bed.gz"

$CBX7_BF
[1] "/Library/R/library/ChIPseeker/extdata/GEO_sample_data/GSM1295077_CBX7_BF_ChipSeq_mergedReps_peaks.bed.gz"

It's a named list of the input files.

So in your case, just create a named list of your files:

files <- list(peak1 = "/myFolder/peak1.bed", peak2 = "/myFolder/peak2.bed")
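
With more than a handful of files, the same kind of named list could also be built from the folder contents instead of being typed by hand. A minimal base R sketch, assuming all peak files sit in one folder and end in `.bed` or `.bed.gz`; `named_peak_files` is a hypothetical helper name, and `/myFolder` is the placeholder path from the question:

```r
# Sketch: build a named list of peak files from a folder (base R only).
# `named_peak_files` is a hypothetical helper, not a ChIPseeker function.
named_peak_files <- function(peak_dir) {
  # Full paths of every file ending in .bed or .bed.gz.
  paths <- list.files(peak_dir, pattern = "\\.bed(\\.gz)?$", full.names = TRUE)

  # Use the file name without its extension as the sample name,
  # e.g. "/myFolder/peak1.bed" is stored under the name "peak1".
  sample_names <- sub("\\.bed(\\.gz)?$", "", basename(paths))

  setNames(as.list(paths), sample_names)
}

files <- named_peak_files("/myFolder")
```

The result has the same shape as the output of `getSampleFiles()` shown above, so it can be passed to the same `lapply()` calls.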

For the second question, there is a getTagMatrix function with example code presented in the vignette.

Please go through the vignette carefully before posting your question.


Many thanks for being so helpful. My question is actually a single, very simple one (sorry, I don't know R well): how do I list all of my own peak files from the hard drive (not the package's peak files) and apply the example code to all of them at once, so that I can follow the example analysis in the vignette? The vignette is detailed enough. I just got confused in one place, where it says that "tagMatrix" is precomputed (to save time). I thought it might be reading an aligned BED file somewhere else, in addition to or instead of the peak BED file, to build the tag matrix. Many thanks once again.

So now, with your code files <- list(peak1 = "/myFolder/peak1.bed", peak2 = "/myFolder/peak2.bed"), how do I do the following: tagMatrixList <- lapply(files, getTagMatrix, windows=promoter)?
