Question: RNA-seq DESeq2 : p-values and venn plots in same analysis
3 months ago by
BioHazzard0
European Union
BioHazzard0 wrote:

I am doing differential expression analysis. I am comparing two different experiments, each experiment consisting of two treatments and their respective controls in duplicate.

I used DESeq2 to generate a distinct results object for each of the 4 control/treatment pairs and am doing downstream analysis on genes with the adjusted p-value below 0.01.

My question concerns the discrepancy between calling genes differentially expressed by thresholding the adjusted p-value, which is a continuous quantity, and what a heatmap of the same genes shows. Again, the p-value thresholds are applied to the DESeq2 results object generated for each of the conditions.

I will illustrate this with two images. These images take into consideration only two of the 4 conditions.

The venn diagram looks like this:

[Figure: Venn diagram of differentially expressed genes in conditions A and B (image not available)]

So in each condition a certain number of genes were differentially expressed and the overlaps between the two conditions are shown. In this example, in condition A there are 275 genes that are only differentially expressed in that condition.
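For readers without the image, the Venn counts described above amount to set operations on the genes that pass the cutoff in each results table. The following is a minimal sketch in Python rather than R; the gene names and adjusted p-values are invented for illustration and are not from this dataset. It treats a missing adjusted p-value (DESeq2 reports `NA` for genes removed by independent filtering) as not significant.

```python
ALPHA = 0.01  # adjusted p-value cutoff used in the question

# Hypothetical adjusted p-values per gene for two conditions (not real data).
padj_a = {"gene1": 0.0004, "gene2": 0.2, "gene3": 0.003, "gene4": None}
padj_b = {"gene1": 0.02, "gene2": 0.5, "gene3": 0.001, "gene4": 0.008}


def significant(padj, alpha=ALPHA):
    """Return the set of genes whose adjusted p-value is below alpha."""
    return {gene for gene, p in padj.items() if p is not None and p < alpha}


sig_a = significant(padj_a)
sig_b = significant(padj_b)

only_a = sig_a - sig_b  # left-only region of the Venn diagram
only_b = sig_b - sig_a  # right-only region
both = sig_a & sig_b    # overlap

print("only A:", sorted(only_a))
print("only B:", sorted(only_b))
print("both:  ", sorted(both))
```

Note that `gene1` lands in the "only A" region even though its adjusted p-value in B (0.02) only narrowly misses the cutoff. The set membership hides how close a gene was to being called in the other condition.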

However, when I create a heatmap of those genes, which should be exclusively differentially expressed in condition A, I observe that there is also an obvious difference in condition B, even if less strong. Note that the columns in the heatmap are ordered:

        CTR  CTR  CTR CTR TREAT TREAT TREAT TREAT
         A    A    B   B    A     A     B     B

[Figure: heatmap of the genes called differentially expressed only in condition A (image not available)]

The heatmap tells a different story than the Venn diagram. By simply applying the p-value threshold I can define genes as being uniquely differentially expressed in one condition only, yet the heatmap makes conditions A and B look much more similar, as also shown by the clustering.

Any tips or insight would be greatly appreciated.

heatmap rna-seq venn deseq2 • 213 views
modified 3 months ago by devbt1510 • written 3 months ago by BioHazzard0

Hi, I don't know if it's the hospital firewall, but the images are not visible to me.

written 3 months ago by caggtaagtat290

Can you recommend me an image hosting service that you know you can see?

written 3 months ago by BioHazzard0

No, I think I have to apply for a change in my firewall. I just wanted to make sure it's because of me.

How did you create your DESeqDataSet?

written 3 months ago by caggtaagtat290
3 months ago by
devbt1510
devbt1510 wrote:

Since we cannot see the sample names it is hard to comment, but I presume the column order is the one given in the text above. A heatmap command plots the expression values from DESeq2, and here they look similar for A and B in control and in treatment respectively. The heatmap does not take the p-value into account, whereas the Venn diagram is built from the DEGs you selected by p-value. This would mean that the treatment vs. control expression difference differs only slightly between the A and B samples. I would suggest plotting the log2 fold change (treatment vs. control) for A and B, so that the heatmap contains only 2 columns, to see the difference more clearly, and scaling the data before plotting (so the heatmap scale goes from -1 to +1). Regards.
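The two-column log2 fold change table suggested here could be derived along these lines. This is only a sketch in Python with invented normalized counts; the pseudocount of 1 and the simple ratio of group means are assumptions made for illustration. DESeq2's own `log2FoldChange` column comes from its fitted model and would normally be the better input for the plot.

```python
import math

# Column order as given in the question.
COLUMNS = ["CTR_A", "CTR_A", "CTR_B", "CTR_B",
           "TREAT_A", "TREAT_A", "TREAT_B", "TREAT_B"]

# Hypothetical normalized counts (not from the thread), one row per gene.
counts = {
    "gene1": [100, 120, 90, 110, 400, 440, 200, 180],
    "gene2": [50, 60, 55, 45, 52, 58, 49, 51],
}


def mean(values):
    return sum(values) / len(values)


def log2fc(row, experiment, pseudocount=1.0):
    """Naive log2(treatment / control) for one experiment ("A" or "B")."""
    ctr = [v for v, c in zip(row, COLUMNS) if c == f"CTR_{experiment}"]
    trt = [v for v, c in zip(row, COLUMNS) if c == f"TREAT_{experiment}"]
    return math.log2((mean(trt) + pseudocount) / (mean(ctr) + pseudocount))


# One row per gene, two columns: log2FC in A and log2FC in B.
lfc_table = {gene: (log2fc(row, "A"), log2fc(row, "B"))
             for gene, row in counts.items()}

print("gene\tlog2FC_A\tlog2FC_B")
for gene, (lfc_a, lfc_b) in lfc_table.items():
    print(f"{gene}\t{lfc_a:+.2f}\t{lfc_b:+.2f}")
```

With these made-up numbers `gene1` goes up in both experiments (about 1.9 in A and 0.9 in B) while `gene2` does not change in either. A plot of the two columns would show this directly, without a significance cutoff in between.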

written 3 months ago by devbt1510

Thanks for the reply. My question is less about how to graph the data but more about how to interpret the data.

If I am trying to make statements such as "275 genes showed modified expression only in condition A", then the Venn diagram could lead me to make such a statement. However, I am reluctant to do so, because the heatmap tells me that those genes are clearly also regulated in condition B. Evidently they are regulated, but the magnitude of differential expression is smaller, so their p-values are above the threshold.
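A small numeric sketch of that threshold effect, written in Python with made-up values for a single gene. It assumes a Wald-type test (log2 fold change divided by its standard error, compared to a standard normal, which is the form of DESeq2's default test), uses raw rather than adjusted p-values, and treats the two experiments as independent. None of the numbers come from this dataset.

```python
import math


def wald_p(lfc, se):
    """Two-sided p-value for the statistic lfc / se under a standard normal."""
    z = lfc / se
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical estimates for one gene (log2 fold change and standard error).
lfc_a, se_a = 1.2, 0.4  # condition A
lfc_b, se_b = 0.9, 0.4  # condition B: same direction, somewhat weaker

p_a = wald_p(lfc_a, se_a)
p_b = wald_p(lfc_b, se_b)

# Test the difference between the two effects directly,
# assuming the two estimates are independent.
diff = lfc_a - lfc_b
se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
p_diff = wald_p(diff, se_diff)

ALPHA = 0.01
print(f"A: p = {p_a:.4f}, significant: {p_a < ALPHA}")
print(f"B: p = {p_b:.4f}, significant: {p_b < ALPHA}")
print(f"A vs B effect: p = {p_diff:.4f}, significant: {p_diff < ALPHA}")
```

Here the gene passes the 0.01 cutoff in A and misses it in B, so a Venn diagram would place it in "A only", yet the direct test of the difference between the two effects is nowhere near significant. "Significant in A but not in B" is therefore a weaker statement than "regulated differently in A and B".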

It looks to me as if venn diagrams are not very good for DE analysis.

Is there another way to analyse similarity between the conditions?

written 3 months ago by BioHazzard0

Can you maybe group your data into CTR_A, CTR_B, TREAT_A and TREAT_B and then do DGE between condition CTR_A and CTR_B, using one of them as the reference? Or are the experiments too different?
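One way to read this grouping suggestion is as a single four-level group factor with comparisons expressed as contrasts between group means. The sketch below, in Python with invented counts for one gene, only illustrates the arithmetic; the contrast names, the pseudocount and the use of plain log2 group means are assumptions. In an actual analysis the model would be fitted in DESeq2 and would also supply the standard errors needed for testing.

```python
import math

# Sample annotation in the column order from the question.
samples = [("CTR", "A"), ("CTR", "A"), ("CTR", "B"), ("CTR", "B"),
           ("TREAT", "A"), ("TREAT", "A"), ("TREAT", "B"), ("TREAT", "B")]
groups = [f"{treatment}_{experiment}" for treatment, experiment in samples]

LEVELS = ["CTR_A", "CTR_B", "TREAT_A", "TREAT_B"]

# Contrast weights over LEVELS (names are illustrative, not DESeq2 output).
CONTRASTS = {
    "controls_A_minus_B": [1, -1, 0, 0],
    "treatment_in_A": [-1, 0, 1, 0],
    "treatment_in_B": [0, -1, 0, 1],
    "treatment_A_minus_B": [-1, 1, 1, -1],  # difference of differences
}


def group_means_log2(row, pseudocount=1.0):
    """Mean of log2(count + pseudocount) within each of the four groups."""
    means = {}
    for level in LEVELS:
        values = [math.log2(v + pseudocount)
                  for v, g in zip(row, groups) if g == level]
        means[level] = sum(values) / len(values)
    return means


def contrast(row, weights):
    """Weighted sum of the group means for one gene."""
    means = group_means_log2(row)
    return sum(w * means[level] for w, level in zip(weights, LEVELS))


# Hypothetical normalized counts for one gene (not from the thread).
row = [100, 120, 90, 110, 400, 440, 200, 180]
effects = {name: contrast(row, weights) for name, weights in CONTRASTS.items()}

for name, value in effects.items():
    print(f"{name}: {value:+.2f}")
```

The last contrast equals the treatment effect in A minus the treatment effect in B, so it asks directly whether the two experiments respond differently instead of comparing two separately thresholded gene lists.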

written 3 months ago by caggtaagtat290