Preparing ChIP-seq data for analysis with DESeq2
2.9 years ago
gkunz ▴ 30

I am looking for resources that discuss how best to prepare ChIP-seq data for processing in DESeq2. I am having trouble working out how to build a count matrix for generating a DESeqDataSet. I have .bam files and the corresponding .narrowPeak files, but I am not sure how to generate a count matrix from these data, or anything else that can serve as input for DESeq2. I have done some googling and reading, but have been unable to find a clear explanation.

If you are aware of, or could share, any good tutorials on setting up ChIP-seq data for DESeq2 analysis, that would be great!

Any assistance is appreciated!

Thanks!

DESeq2 ChIP-seq • 3.2k views

Harvard-Chan bioinformatics core has ChIP-seq data analysis tutorials. Look under lessons for detailed training materials.


A fantastic resource for sure, but I don't see anywhere in their lessons where they address this question. Could you point out where they do so?

They use DiffBind to identify differential peaks. I am not looking to use DiffBind.

2.9 years ago
ATpoint 81k

To make a count matrix you need a set of reference regions; this could be the merge of all called peaks. For an extensive discussion, I recommend reading the manuals of the Bioconductor packages csaw (which suggests a window-based approach) and DiffBind. There are many threads on making a count matrix from a reference peak set, e.g. Best practice for analysing ATAC-seq data
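As a rough illustration of "the merge of all called peaks", here is a minimal Python sketch that collapses overlapping peaks from several samples into one non-overlapping reference set. The function name `merge_peaks` and the toy coordinates are hypothetical, peaks are assumed to be already parsed into `(chrom, start, end)` tuples in BED-style 0-based half-open coordinates, and touching (book-ended) peaks are merged, which is a choice rather than a requirement. In practice a dedicated interval tool would normally do this step.

```python
from collections import defaultdict


def merge_peaks(peak_sets):
    """Merge peaks from several samples into one non-overlapping set.

    peak_sets: iterable of lists of (chrom, start, end) tuples,
    assumed to use 0-based half-open coordinates (BED/narrowPeak style).
    """
    # Pool the intervals of all samples, grouped by chromosome.
    by_chrom = defaultdict(list)
    for peaks in peak_sets:
        for chrom, start, end in peaks:
            by_chrom[chrom].append((start, end))

    merged = []
    for chrom in sorted(by_chrom):
        intervals = sorted(by_chrom[chrom])
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end:
                # Overlapping or touching the current region: extend it.
                cur_end = max(cur_end, end)
            else:
                merged.append((chrom, cur_start, cur_end))
                cur_start, cur_end = start, end
        merged.append((chrom, cur_start, cur_end))
    return merged


# Hypothetical toy peak calls for two samples.
sample_a = [("chr1", 100, 200), ("chr1", 500, 600)]
sample_b = [("chr1", 150, 250), ("chr2", 10, 50)]
reference = merge_peaks([sample_a, sample_b])
```

Reads from every sample are then counted over this one shared set of regions, so that each row of the count matrix refers to the same interval in all samples.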

By the way, it is good practice to indicate crossposts, e.g. the one over at Bioconductor, which is a forum for technical help with the Bioc packages rather than a platform for general advice.


Thanks for the response!

I visited the post you linked and will try this method!

I have read the csaw and DiffBind package vignettes in fair detail and run differential analyses with both. As far as I am aware (and I am more than happy to be wrong), neither explicitly requires the data to be formatted in this manner. Unless the data generated by the dba.count function could be passed directly into DESeq2? If that is the case, maybe I will try that as well. Peak-based and window-based analyses have yielded extremely different results when I have used them on my data set. The hope is to use DESeq2 alone, independent of the DiffBind wrapper, to maybe add some clarity to the outputs.

Is there an appropriate way to indicate cross-posts, such as simply including a link? I am more than happy to do so in the future. Additionally, is it inappropriate to post a broader question like this to the DiffBind forum?

Again, thanks for the input!


Hey, I am trying a similar analysis and ended up here while looking for information. I was wondering whether you finally found a solution and how it went.

This is how I proceeded. To create the matrix, I first defined the regions I am interested in: I used a BED file of regions +/-1500 bp around each TSS from the UCSC database. Then I used the sam2bed and bedmap tools from BEDOPS to count the number of reads falling into each defined region, which gave me BED files containing the number of reads associated with each TSS. With a little bit of R to merge all the files, I got a count matrix that I can use as input for DESeq2. Here is a link to BEDOPS: https://bedops.readthedocs.io/en/latest/index.html
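The merge step described above was done in R; purely as an illustration, here is a minimal Python sketch of the same idea. It assumes the per-sample counts have already been parsed into dictionaries keyed by region ID, and the sample names, region IDs, and the helper names `build_count_matrix` and `to_tsv` are hypothetical. Filling regions that are missing from a sample with 0 is also an assumption; it is only sensible if every sample was counted over the same region file.

```python
import csv
import io


def build_count_matrix(per_sample_counts, fill=0):
    """Combine per-sample counts into one regions-by-samples table.

    per_sample_counts: dict mapping sample name -> {region_id: read count}.
    Returns (header, rows) with one row per region and one column per sample.
    """
    samples = list(per_sample_counts)
    # Union of all region IDs, sorted so the row order is reproducible.
    regions = sorted({r for counts in per_sample_counts.values() for r in counts})
    header = ["region"] + samples
    rows = [
        [region] + [per_sample_counts[s].get(region, fill) for s in samples]
        for region in regions
    ]
    return header, rows


def to_tsv(header, rows):
    """Serialise the table as tab-separated text."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical toy counts for two samples.
counts = {
    "ctrl_1": {"TSS_A": 12, "TSS_B": 0},
    "treat_1": {"TSS_A": 30, "TSS_C": 7},
}
header, rows = build_count_matrix(counts)
tsv_text = to_tsv(header, rows)
```

The resulting table has regions as rows and samples as columns and holds raw integer counts, which is the layout DESeq2 expects for a count matrix.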

Hope it can help! Best!

2.7 years ago
Rory Stark ★ 2.0k

In DiffBind, you can generate a consensus peak set and count matrix using the dba.count() function. If you want the raw counts, set score=DBA_SCORE_READS and then retrieve the matrix using dba.peakset() with bRetrieve=TRUE.

If you run a full DiffBind analysis, you can retrieve a well-formed DESeq2 object by calling dba.analyze() with bRetrieveAnalysis=TRUE (the default method is DESeq2).
