Question: ChIP-seq RNA-seq overlap
1
2.3 years ago by
Federico10
Federico10 wrote:

Hello! I am trying to overlap my ChIP-seq peaks with the differentially expressed genes from my RNA-seq analysis. To be more specific: I have ChIP-seq peaks for my protein of interest, X, and I used HOMER to annotate them. Let's say 30% of my peaks are located at promoter regions. I also have RNA-seq data for wt versus X-knockout, and I used the DESeq2 package in R to obtain a list of differentially expressed genes. I would now like to see whether my promoter ChIP-seq peaks overlap with my list of differentially expressed genes, i.e. whether protein X is actually binding and regulating the expression of these genes. Is there a tool that can do this, including at a statistical level? Of course, even a tool to directly overlap ChIP-seq data with RNA-seq data would be great :)

Does anyone have a suggestion for that?

Thank you!

rna-seq chip-seq • 1.8k views
ADD COMMENTlink modified 2.3 years ago • written 2.3 years ago by Federico10
3
2.3 years ago by
Devon Ryan87k
Freiburg, Germany
Devon Ryan87k wrote:

You can use a Fisher's test (fisher.test() in R) for the statistics.
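
As a rough illustration of that test, here is a minimal Python sketch of a one-sided Fisher's exact test on a 2x2 table of genes (differentially expressed or not, promoter peak or not). The function name and all counts are hypothetical and only there to show the layout of the table; in R, fisher.test() is two-sided by default, and alternative = "greater" gives the one-sided version sketched here.

```python
from math import comb


def fisher_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a: DE genes with a promoter peak      b: DE genes without a peak
    c: non-DE genes with a promoter peak  d: non-DE genes without a peak

    Returns P(X >= a) under the hypergeometric null, i.e. the probability
    of seeing at least this many DE genes with a peak by chance.
    """
    n_de = a + b        # all DE genes
    n_peak = a + c      # all genes with a promoter peak
    n = a + b + c + d   # all genes considered
    upper = min(n_de, n_peak)
    tail = sum(
        comb(n_de, k) * comb(n - n_de, n_peak - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(n, n_peak)


# Hypothetical counts, purely for illustration (not from any real dataset).
p_value = fisher_greater(a=120, b=380, c=900, d=8600)
print(f"one-sided p-value: {p_value:.3g}")
```

The "all genes considered" background matters: it should be the set of genes that could have been called in both assays (for example, genes tested by DESeq2), not simply every gene in the annotation.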

Regarding "overlapping" data, it depends on what you mean. I would personally make a combined heatmap of the ChIP and RNAseq data (at least for the DE genes). You can use deepTools for this, though it'd be easiest to use the develop branch from GitHub, since the computeMatrixOperations command won't otherwise be available until the next release (ETA November 1). The general steps would be:

  1. Use bamCoverage to generate bigWig files (possibly input-normalized in the case of ChIPseq)
  2. Use computeMatrix on the ChIPseq bigWig files, likely with reference-point and a reasonable setting for -b
  3. Use computeMatrix scale-regions on the RNAseq bigWig files, likely using the --metagene option.
  4. Use computeMatrixOperations cbind with the output of 2 and 3
  5. Make a heatmap with plotHeatmap.

This allows you to see the differences even in cases where there happened to not be a peak called.
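
The five steps could be strung together roughly as below. This is only a sketch that assembles the deepTools command lines as argument lists without running them; all file names, the build_commands helper and the 500 bp flank are assumptions, and input normalization (for example with bamCompare) is left out. Both computeMatrix calls use the same regions file and --sortRegions keep, so that the rows of the two matrices line up for cbind.

```python
import shlex


def build_commands(chip_bam, rna_bam, regions, flank=500):
    """Return the deepTools steps as argument lists (nothing is executed).

    All output file names below are placeholders chosen for this sketch.
    """
    chip_bw, rna_bw = "chip.bw", "rna.bw"
    chip_mat, rna_mat = "chip.mat.gz", "rna.mat.gz"
    combined = "combined.mat.gz"
    return [
        # Step 1: coverage tracks for both experiments.
        ["bamCoverage", "-b", chip_bam, "-o", chip_bw],
        ["bamCoverage", "-b", rna_bam, "-o", rna_bw],
        # Step 2: ChIP signal around a reference point of each region.
        ["computeMatrix", "reference-point", "-S", chip_bw, "-R", regions,
         "-a", str(flank), "-b", str(flank),
         "--sortRegions", "keep", "-o", chip_mat],
        # Step 3: RNA signal over the scaled exons of each transcript.
        ["computeMatrix", "scale-regions", "-S", rna_bw, "-R", regions,
         "--metagene", "--sortRegions", "keep", "-o", rna_mat],
        # Step 4: join the two matrices column-wise.
        ["computeMatrixOperations", "cbind", "-m", chip_mat, rna_mat,
         "-o", combined],
        # Step 5: draw the combined heatmap.
        ["plotHeatmap", "-m", combined, "-o", "heatmap.pdf"],
    ]


for cmd in build_commands("chip.bam", "rna.bam", "genes.gtf"):
    print(shlex.join(cmd))
```

To actually run the steps, each list can be passed to subprocess.run(cmd, check=True) in order.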

ADD COMMENTlink written 2.3 years ago by Devon Ryan87k

Sorry for the naive question, but when running computeMatrix on the ChIPseq file, what do you use for -R? I keep getting this error:

computeMatrix scale-regions: error: argument --regionsFileName/-R is required

I'm guessing it wants the bed file with the peaks - but isn't the point of this approach to not use the peaks as that is limiting?

Thanks for your help.

ADD REPLYlink written 6 months ago by V100

We usually use transcripts.

ADD REPLYlink written 6 months ago by Devon Ryan87k

By that, do you mean something like the GTF file you use for RNAseq analysis?

ADD REPLYlink written 6 months ago by V100

GTF or BED, yes

ADD REPLYlink written 6 months ago by Devon Ryan87k

Hello Devon. I followed steps 1-3. Now I'm stuck at step 4. Below is the command I ran:

computeMatrixOperations cbind -m peak_sorted_matrix rna_16hr_sorted_matrix -o output.mat.gz

Error :
Traceback (most recent call last):
  File "/home/anupriya/.local/bin/computeMatrixOperations", line 11, in <module>
    main(args)
  File "/home/anupriya/.local/lib/python2.7/site-packages/deeptools/computeMatrixOperations.py", line 677, in main
    cbindMatrices(hm, args)
  File "/home/anupriya/.local/lib/python2.7/site-packages/deeptools/computeMatrixOperations.py", line 408, in cbindMatrices
    hm.matrix.matrix = np.hstack((hm.matrix.matrix, np.empty(hm2.matrix.matrix.shape)))
  File "/home/anupriya/miniconda2/lib/python2.7/site-packages/numpy/core/shape_base.py", line 288, in hstack
    return _nx.concatenate(arrs, 1)
ValueError: all the input array dimensions except for the concatenation axis must match exactly

I used the same '-a' and '-b' options in the computeMatrix commands for ChIP-seq and RNA-seq, but still got this error. How can I fix this?

ADD REPLYlink written 20 days ago by anu014160
1

It appears you used a different GTF or BED file to produce the two matrices. Can you post the commands you used to create both?

ADD REPLYlink written 18 days ago by Devon Ryan87k

Hi Devon, below are the commands and the BED files I used:

computeMatrix scale-regions -S rna_16hr_sorted.bw -R m.sme_exons.bed --metagene -a 500 -b 500 --outFileName rna_16hr_sorted_matrix


computeMatrix reference-point -S peak_sorted.bw -R m.sme_transcript.bed -a 500 -b 500 --outFileName peak_sorted_matrix



head -n 20 m.sme_transcript.bed

Chromosome  499 1692
Chromosome  1721    2614
Chromosome  2624    3778
Chromosome  3775    4359
Chromosome  4591    6618
Chromosome  6648    9176
Chromosome  9229    10011
Chromosome  10184   10276
Chromosome  10411   11211
Chromosome  11215   12246
Chromosome  12243   13301
Chromosome  13310   14140
Chromosome  14130   15089
Chromosome  15286   15522
Chromosome  15525   17252
Chromosome  17249   19018
Chromosome  19052   41623
Chromosome  41688   42635
Chromosome  42943   43353
Chromosome  43365   44687


head -n 20 m.sme_exons.bed

Chromosome  499 1692
Chromosome  1721    2614
Chromosome  2624    3778
Chromosome  3775    4359
Chromosome  4591    6618
Chromosome  6648    9176
Chromosome  9229    10011
Chromosome  10072   10148
Chromosome  10184   10276
Chromosome  10293   10368
Chromosome  10411   11211
Chromosome  11215   12246
Chromosome  12243   13301
Chromosome  13310   14140
Chromosome  14130   15089
Chromosome  15286   15522
Chromosome  15525   17252
Chromosome  17249   19018
Chromosome  19052   41623
Chromosome  41688   42635
ADD REPLYlink modified 13 days ago • written 13 days ago by anu014160
1

You'll have to ensure that you do the following:

  1. Both BED files need to be of the same length and sorted such that row N in one file corresponds to row N in the other (computeMatrixOperations is just merging them by rows, since it has no way to otherwise determine which rows belong together).
  2. Ensure that computeMatrix is keeping the input file order (--sortRegions keep).
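
One way to satisfy point 1 is to keep only the regions present in both files, in a single shared order. The following is a minimal sketch, not part of deepTools: the helper names are hypothetical, and treating two rows as "the same" only when chromosome, start and end match exactly is an assumption. The coordinates are taken from the BED excerpts posted above.

```python
def parse_bed3(text):
    """Parse BED text into (chrom, start, end) tuples, skipping short lines."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            rows.append((fields[0], int(fields[1]), int(fields[2])))
    return rows


def shared_regions(rows_a, rows_b):
    """Regions found in both lists, in the order they appear in rows_a."""
    in_b = set(rows_b)
    seen = set()
    shared = []
    for row in rows_a:
        if row in in_b and row not in seen:
            seen.add(row)
            shared.append(row)
    return shared


transcripts = parse_bed3("""Chromosome 9229 10011
Chromosome 10184 10276
Chromosome 10411 11211""")

exons = parse_bed3("""Chromosome 9229 10011
Chromosome 10072 10148
Chromosome 10184 10276
Chromosome 10293 10368
Chromosome 10411 11211""")

# Print the shared regions as tab-separated BED lines.
for chrom, start, end in shared_regions(transcripts, exons):
    print(chrom, start, end, sep="\t")
```

Writing the result to one BED file and passing that same file to both computeMatrix calls, together with --sortRegions keep, should give two matrices with matching rows.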
ADD REPLYlink written 12 days ago by Devon Ryan87k

Hi Devon,

But what if I have different or extra rows in the exon BED file (like this one: Chromosome 10072 10148)? Should I discard them? Won't I be losing data then?

FYI, I am using the exon file with the RNA-seq data and the transcript file with the ChIP-seq data.

ADD REPLYlink modified 9 days ago • written 9 days ago by anu014160

It's unclear what should be matched together if you have extra rows. In that case you must necessarily lose data (not that a few rows matter).

ADD REPLYlink written 9 days ago by Devon Ryan87k

Hi Devon, I took the rows common to exon.bed and transcripts.bed and ran the remaining commands:

computeMatrixOperations cbind -m peaks_sorted_matrix1 rna_16hr_sorted_matrix1 -o output1.mat.gz 

plotProfile --matrixFile output1.mat.gz --outFileName trial.pdf --samplesLabel peaks rna_16hrs --startLabel GS --endLabel GE --dpi 500 --perGroup --plotHeight 15

and got this plot. Why don't the peaks cover the whole plot? Did I miss something? https://www.dropbox.com/s/ls6742nyqblt8k9/trial.pdf?dl=0

ADD REPLYlink written 6 days ago by anu014160

They don't cover the whole plot because the two datasets are different sizes. In general, using --perGroup with a dataset like that doesn't make sense, as the columns of data for each sample aren't comparable (only the rows are).

ADD REPLYlink written 6 days ago by Devon Ryan87k

Hi Devon, I was wondering: would it work to use the same mode, either reference-point or scale-regions, in computeMatrix for both the ChIP-seq and RNA-seq data?

ADD REPLYlink written 5 days ago by anu014160

Or should I remove the --perGroup option and create the graph like this: https://www.dropbox.com/s/gw0rc9jgsq09e69/trial1.pdf?dl=0 so that it can then be compared?

ADD REPLYlink modified 5 days ago • written 5 days ago by anu014160
1

Yes, though it looks like you have an older version of deepTools, since I think I fixed the issue with the tick labels not being correct in more recent versions.

ADD REPLYlink written 5 days ago by Devon Ryan87k

Thanks a lot Devon for solving the problem!

ADD REPLYlink written 2 days ago by anu014160
0
2.3 years ago by
Mike1.1k
UK
Mike1.1k wrote:

Have a look at the BETA tool.

Target analysis by integration of transcriptome and ChIP-seq data with BETA

http://www.nature.com/nprot/journal/v8/n12/full/nprot.2013.150.html

ADD COMMENTlink written 2.3 years ago by Mike1.1k
0
2.3 years ago by
Federico10
Federico10 wrote:

OK, thank you! I'll try them out! Cheers

ADD COMMENTlink written 2.3 years ago by Federico10