Question: What is the difference between log2(Mutant/Wildtype) or log2(Wildtype/Mutant) from cuffdiff output?
bioinforesearchquestions (270), United States, wrote 2.4 years ago (0 votes):

Hello folks,

I have an Excel file generated from Cuffdiff output for genes, with the following columns:

Gene, locus, sample_1, sample_2, status, value_1, value_2, log2(fold_change), test_stat, p_value, q_value, significant

Casp1, chr9:5298516-5307281, MUT, WT, OK, 123.019, 0.671358, -7.51758, 6.17607, 6.57E-10, 6.36E-07, yes

As per the Excel file, sample_1 is Mutant and sample_2 is Wildtype. log2(fold_change) is calculated as log2(sample_2/sample_1), i.e. log2(0.671358/123.019) = -7.51758.

I thought it should be log2(final/initial), shouldn't it?

What is the difference between log2(Mutant/Wildtype) or log2(Wildtype/Mutant)?

How to show the fold change (like 4-fold, 5-fold change) in heatmap?

heatmap rna-seq cuffdiff • 1.1k views
modified 2.4 years ago by Renesh (1.8k) • written 2.4 years ago by bioinforesearchquestions (270)
Renesh (1.8k), United States, wrote 2.4 years ago (3 votes):

In Cuffdiff output, value_1 and value_2 are the FPKM values for sample_1 and sample_2, conventionally the control and experimental conditions. Cuffdiff calculates fold change as log2(value_2/value_1), i.e. how much gene expression changes in the experimental condition relative to the control.

Further, it depends on which condition you have given as control and experimental while running cuffdiff. You need to share the command that you used for running cuffdiff.

What is the difference between log2(Mutant/Wildtype) or log2(Wildtype/Mutant)?

log2(Mutant/Wildtype) corresponds to mutant as value_2 and wildtype as value_1; it gives the expression change in the mutant relative to the wildtype. The reverse holds for log2(Wildtype/Mutant), which is what your file contains. The two values differ only in sign.
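
As a quick check, here is a minimal R sketch using the Casp1 FPKM values from the question. The variable names are my own; it only illustrates that swapping numerator and denominator flips the sign of the log2 ratio.

```r
# FPKM values for Casp1, copied from the Cuffdiff row in the question
mut <- 123.019    # value_1 (sample_1 = MUT)
wt  <- 0.671358   # value_2 (sample_2 = WT)

# Cuffdiff reports log2(value_2 / value_1), which here is log2(WT / MUT)
lfc_wt_over_mut <- log2(wt / mut)

# Swapping numerator and denominator only flips the sign
lfc_mut_over_wt <- log2(mut / wt)

print(round(c(WT_over_MUT = lfc_wt_over_mut,
              MUT_over_WT = lfc_mut_over_wt), 4))
```

The first value should agree with the -7.51758 reported in the table, up to rounding.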

How to show the fold change (like 4-fold, 5-fold change) in heatmap?

You can use the color intensity/gradient scale to show the magnitude of the fold change in the heatmap. See the heatmap.2 function in the gplots package in R.
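
A rough sketch of that idea, assuming you already have a matrix of log2 fold changes. The gene names, comparison names and values below are made up for illustration, and the plot is only drawn if gplots is installed.

```r
# Hypothetical log2 fold changes (rows = genes, columns = comparisons)
lfc <- matrix(
  c( 2.0,  2.3,  1.8,
    -2.0, -1.5, -2.4,
     0.1, -0.2,  0.3),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), c("cmp1", "cmp2", "cmp3"))
)

# Symmetric breaks so that 0 (no change) sits in the middle of the gradient;
# +/-3 on the log2 scale corresponds to an 8-fold change
breaks <- seq(-3, 3, length.out = 61)
cols   <- colorRampPalette(c("green", "black", "red"))(length(breaks) - 1)

# Draw the heatmap without any further scaling or clustering
if (requireNamespace("gplots", quietly = TRUE)) {
  gplots::heatmap.2(lfc, scale = "none", breaks = breaks, col = cols,
                    Rowv = FALSE, Colv = FALSE, dendrogram = "none",
                    trace = "none", key.xlab = "log2 fold change")
}
```

Because the data are not rescaled, the colour key reads directly in log2 units: 2 on the key is a 4-fold change, and -2 is a 4-fold change in the other direction.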

written 2.4 years ago by Renesh (1.8k)

Thanks, Renesh for the explanation.

I just have the excel file with me. Cuffdiff analysis was done by someone else. I believe he/she might have mistakenly given mutant as "CONTROL" and wildtype as "EXPERIMENTAL".

written 2.4 years ago by bioinforesearchquestions (270)
1

Do not make assumptions as an analyst (unless you have first-hand knowledge of the experimental details). You should verify your suspicion with the person who gave you the analysis or the people who generated the data.

modified 2.4 years ago • written 2.4 years ago by genomax (78k)

Hi genomax, that person left the lab a long time ago and provided only the Excel file. I don't have a proper log of the analysis. Thanks for cautioning me about it.

written 2.4 years ago by bioinforesearchquestions (270)

Hi Genomax,

Does the order of the samples entered in Cuffdiff command impact the results of the differential gene expression?

Because Cuffdiff treats Label1 as sample_1 and Label2 as sample_2, is it an unwritten convention to always give the control as LABEL1 and the experimental/mutant condition as LABEL2?

modified 2.4 years ago • written 2.4 years ago by bioinforesearchquestions (270)

Okay, but be careful with this. You should get the analysis code from that person. If the labeling is done incorrectly, it will completely reverse your results.

written 2.4 years ago by Renesh (1.8k)

Hi Renesh,

This is the command I have for another analysis (lymph nodes and spleen together), from the same person who did the analysis below:

cuffdiff -o diff_out_Lymphnodes_Spleen -b mouse_mm10.fa -p 12 -L Mutant_Lymph_Spleen,WT_Lymph_Spleen -u merged_asm/merged.gtf Mutant_Lymph/accepted_hits.bam,Mutant_Spleen/accepted_hits.bam Wildtype_Lymph/accepted_hits.bam,Wildtype_Spleen/accepted_hits.bam

I am currently working on the lymph node analysis. Based on the labeling above, I suspect the command was:

cuffdiff -o diff_out_Lymphnodes -b mouse_mm10.fa -p 12 -L Mutant_Lymph,WT_Lymph -u merged_asm/merged.gtf Mutant_Lymph/accepted_hits.bam Wildtype_Lymph/accepted_hits.bam

modified 2.4 years ago • written 2.4 years ago by bioinforesearchquestions (270)

From the command above, mutant was given as sample_1 (control) and wildtype as sample_2 (experimental), so the reported value is log2(WT/MUT). Genes up-regulated in the mutant therefore have a negative log2 fold change, and genes down-regulated in the mutant have a positive one.
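
If you would rather read the table as mutant relative to wildtype, one option is to negate the column. A small sketch with the Casp1 row from the question: the column name log2_fold_change is a stand-in I chose (the real header is log2(fold_change), which R would rename on import), and the new column names are hypothetical.

```r
# One row of the Cuffdiff gene table, copied from the question
res <- data.frame(
  gene             = "Casp1",
  sample_1         = "MUT",
  sample_2         = "WT",
  value_1          = 123.019,
  value_2          = 0.671358,
  log2_fold_change = -7.51758,
  stringsAsFactors = FALSE
)

# Cuffdiff's column is log2(sample_2 / sample_1) = log2(WT / MUT).
# Negating it gives log2(MUT / WT), i.e. mutant relative to wildtype.
res$log2_mut_vs_wt <- -res$log2_fold_change

# Direction of change in the mutant
res$mutant_direction <- ifelse(res$log2_mut_vs_wt > 0, "up in MUT", "down in MUT")

print(res[, c("gene", "log2_mut_vs_wt", "mutant_direction")])
```

The p-values and q-values are unaffected by this; only the sign of the fold change changes.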

written 2.4 years ago by Renesh (1.8k)
Kevin Blighe (54k) wrote 2.4 years ago (1 vote):

If, for GeneX, Sample1's expression is 20 and Sample2's expression is 5, then:

log2(Sample1/Sample2) = 2

We can make the following statement: Sample1 has higher expression than Sample2 for GeneX

log2(Sample2/Sample1) = -2

We can make the following statement: Sample2 has lower expression than Sample1 for GeneX

Both statements imply the same thing. You can see, however, that the choice of numerator and denominator is important.
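
To turn a log2 ratio back into an "n-fold" statement (as asked in the question), raise 2 to the absolute value and keep the direction separately. A small sketch with the GeneX numbers; describe_fold is a helper name I made up for illustration.

```r
# Convert a log2 ratio into an "n-fold higher/lower" description
describe_fold <- function(lfc) {
  fold <- 2^abs(lfc)
  direction <- if (lfc > 0) "higher" else if (lfc < 0) "lower" else "unchanged"
  sprintf("%g-fold %s", fold, direction)
}

# Expression values from the GeneX example
sample1 <- 20
sample2 <- 5

print(describe_fold(log2(sample1 / sample2)))  # Sample1 relative to Sample2
print(describe_fold(log2(sample2 / sample1)))  # Sample2 relative to Sample1
```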

This should not be important for the heatmap. If a gene has higher expression than another in a particular sample, the heatmap shading reflects that higher level (e.g. it will be red if your colour scheme runs green-black-red for low-normal-high expression). Heatmap functions in R commonly scale each row to Z-scores for colour-shading (base heatmap() does so by default; heatmap.2 does so with scale="row"), so the colours then represent standard deviations from the row mean rather than fold changes. You can switch off this scaling and transform the data in your own way using the following commands:

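library(gplots)                          # provides heatmap.2
myBreaks <- seq(-3, 3, length.out=101)   # colour breaks in Z-score units
heat <- t(scale(t(MyDataMatrix)))        # row-wise Z-scores (scale() works on columns)
# pass the pre-scaled matrix; "..." stands for your other arguments
heatmap.2(heat, ..., breaks=myBreaks, scale="none")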
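
To see what the transpose-scale-transpose step does, here is a small self-contained sketch. The matrix values are invented; only the idiom itself comes from the commands above.

```r
# Hypothetical expression matrix (rows = genes, columns = samples)
MyDataMatrix <- matrix(
  c( 10, 20, 30, 40,
      5,  5, 50,  8,
    100, 90, 80, 70),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("g1", "g2", "g3"), paste0("s", 1:4))
)

# scale() centres and scales columns, so transpose, scale, and transpose back
heat <- t(scale(t(MyDataMatrix)))

print(round(rowMeans(heat), 10))       # each row is centred on 0
print(round(apply(heat, 1, sd), 10))   # each row has standard deviation 1
```

After this step every gene is on the same scale, which is why the breaks can be fixed at -3 to 3 regardless of the absolute expression level.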
modified 2.4 years ago • written 2.4 years ago by Kevin Blighe (54k)

Thanks, Kevin. In general, isn't (final/initial) the convention in the wet lab?

As you mentioned, in order to get the Z-scale for the heatmap, I log-transformed the FPKM values.

> FPKM <- read.table("dataset.csv", sep=",", header=TRUE, row.names=1)
> nrow(FPKM)
[1] 21
> FPKM_log10 <- log10(FPKM+1)
> head(FPKM)
      MUT_Lymph  WT_Lymph
Actn1  29.42360  92.09680
Ccr7   61.08610 177.92300
Ctla4  53.45150  10.96750
Dapl1  43.16620 140.23300
> head(FPKM_log10)
      MUT_Lymph  WT_Lymph
Actn1 1.4832106 1.9689348
Ccr7  1.7929944 2.2526662
Ctla4 1.7360098 1.0780034
Dapl1 1.6450900 2.1499362
> Log_data_matrix <- data.matrix(FPKM_log10)
> heatmap.2(Log_data_matrix, scale="row", col=greenred, trace="none", margins=c(5,5), cexRow=0.7, cexCol=0.7, dendrogram='both', Rowv=TRUE, Colv=TRUE, reorderfun=function(d,w) reorder(d, w, agglo.FUN=mean), distfun=function(x) as.dist(1-cor(t(x))), hclustfun=function(x) hclust(x, method="complete"))

In order to just get the fold change between mutant and wildtype in the heatmap, I believe that I should use the FPKM values instead of the log-transformed FPKM values.
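
For reference, the per-gene fold change can also be computed directly from the two FPKM columns. This is only a sketch using the four genes printed above; the +1 pseudocount is an assumption carried over from the log10 step, not something taken from Cuffdiff.

```r
# FPKM values for the four genes printed above
FPKM <- data.frame(
  MUT_Lymph = c(29.42360, 61.08610, 53.45150, 43.16620),
  WT_Lymph  = c(92.09680, 177.92300, 10.96750, 140.23300),
  row.names = c("Actn1", "Ccr7", "Ctla4", "Dapl1")
)

# log2(MUT / WT) per gene, with the same +1 pseudocount as the log10 transform
lfc_mut_vs_wt <- log2((FPKM$MUT_Lymph + 1) / (FPKM$WT_Lymph + 1))
names(lfc_mut_vs_wt) <- rownames(FPKM)

# Positive = higher in mutant, negative = lower in mutant
print(round(sort(lfc_mut_vs_wt), 2))
```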

modified 2.4 years ago • written 2.4 years ago by bioinforesearchquestions (270)

Yes, I would just use the FPKM values for the heatmap. With scale="row", the heatmap function will scale these itself and transform them into Z-scores.
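
One thing worth checking, assuming the matrix really has only the two columns shown (MUT_Lymph and WT_Lymph): with two values per row, row Z-scores can only ever be about +0.707 and -0.707. A minimal sketch with the four genes from the thread:

```r
# FPKM values for the four genes shown in the thread (two samples only)
FPKM <- matrix(
  c(29.42360,  92.09680,
    61.08610, 177.92300,
    53.45150,  10.96750,
    43.16620, 140.23300),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("Actn1", "Ccr7", "Ctla4", "Dapl1"),
                  c("MUT_Lymph", "WT_Lymph"))
)

# Row-wise Z-scores, equivalent to what scale="row" does before colouring
z <- t(scale(t(FPKM)))
print(round(z, 4))
```

So in a two-sample heatmap with row scaling, the colours show which sample is higher for each gene but not by how much.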

Also take a look at my colleague's answer below.

modified 2.4 years ago • written 2.4 years ago by Kevin Blighe (54k)

Just use your FPKM data-matrix for the heatmap. Do not use the log-transformed one

written 2.4 years ago by Kevin Blighe (54k)
1

Thanks, Kevin. Sure, I will use FPKM values alone.

written 2.4 years ago by bioinforesearchquestions (270)