Question: Does DESeq2 analysis include the rlog transformation?
4 months ago by
soojinima10 wrote:

Hi. I am new to R. Recently, I analysed a GEO dataset from my research field, and I have a question about the analysis method. Does the DESeq2 analysis include the rlog transformation? To reduce heteroskedasticity, many methods include some way of shrinking the variance of low read counts, so people transform raw data with rlog instead of log2. I'm curious whether the DESeq2 function already includes this rlog transformation, or whether I should convert DESeq2's output data to rlog data myself. Thank you. I look forward to your reply.

raw counts rna-seq deseq2 rlog • 299 views

DESeq2 has a separate rlog function which you can use to transform your count data. You can supply it a matrix of counts or a DESeqDataSet object. The default DESeq function will not give you rlog-transformed data.

written 4 months ago by kautilya360

Thank you for your kind reply. This is a basic concept, but I do not know it well, so may I ask a follow-up question? If the default DESeq function does not give me rlog-transformed data, and the data used in the DESeq2 analysis are not rlog-transformed, so that heteroskedasticity is high, can I interpret the log2FoldChange values in the DESeq2 results as they are? (For example, gene A increased 2-fold compared with gene B under a specific condition.)

If so, is the rlog transformation offered not for differential expression estimation but as separate functionality that can be used for visualization and clustering?

written 4 months ago by soojinima10

In order to normalise data, DESeq2 estimates 2 things:

  • Size factors, which help to deal with differences in library sizes across samples
  • Dispersion parameters, which help to deal with heteroskedasticity, amongst other things
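
To make the idea of a size factor concrete, here is a minimal Python sketch of the median-of-ratios approach that DESeq2 uses for size factor estimation. It is only an illustration under simplifying assumptions, not DESeq2's own (R) code: the helper name `size_factors` and the toy counts are made up.

```python
import math
from statistics import median

def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of genes, each a list of raw counts (one per sample).
    Hypothetical helper for illustration; not part of DESeq2.
    """
    n_samples = len(counts[0])
    # Use only genes with non-zero counts in every sample, so that the
    # per-gene geometric mean is positive.
    usable = [row for row in counts if all(c > 0 for c in row)]
    # Log of the geometric mean of each usable gene across samples.
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        # Log ratio of this sample's count to the gene's geometric mean.
        log_ratios = [math.log(row[j]) - lgm
                      for row, lgm in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors

# Toy data (made up): sample 2 was sequenced twice as deeply as sample 1.
toy_counts = [[10, 20], [100, 200], [50, 100], [0, 3]]
sf = size_factors(toy_counts)
# Dividing each count by its sample's size factor puts samples on a common scale.
normalized = [[c / f for c, f in zip(row, sf)] for row in toy_counts]
```

In this toy case the second size factor comes out twice the first, so after division the first three genes have equal normalised counts in both samples.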

DESeq2 models raw RNA-seq counts with a negative binomial distribution, and it is through this model that it derives P values and other statistics via the Wald test (applied to the fitted model for each gene). Log base 2 fold-changes can also be 'shrunk' to deal with the exaggerated fold-changes that can be observed for low-count transcripts; in recent DESeq2 versions this is done with the separate lfcShrink function, whereas older versions applied it by default within DESeq.
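
As a small numerical illustration of why the negative binomial model handles heteroskedasticity, the sketch below uses the usual mean-dispersion parameterisation, Var = mean + dispersion * mean^2. The dispersion value and the function names are assumptions chosen for the example, not values or functions from DESeq2.

```python
def nb_variance(mean, dispersion):
    """Variance of a negative binomial count, using
    Var = mean + dispersion * mean**2."""
    return mean + dispersion * mean ** 2

def squared_cv(mean, dispersion):
    """Squared coefficient of variation: variance / mean**2,
    which simplifies to 1/mean + dispersion."""
    return nb_variance(mean, dispersion) / mean ** 2

alpha = 0.1  # made-up dispersion, shared by both genes for simplicity

# Relative noise is much larger for a low-count gene than for a high-count one,
# even though the dispersion is identical.
cv2_low = squared_cv(5, alpha)      # 1/5 + 0.1
cv2_high = squared_cv(5000, alpha)  # 1/5000 + 0.1
```

Because the variance is written as a function of the mean, the model accounts for low-count genes being relatively noisier without needing to transform the counts first.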

Statistics are not derived from the regularised log transformation. This transformation is mainly introduced for downstream plotting functions, like heatmaps, etc.
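
To see why a plain log2 transformation is a poor choice for plotting, the following Python sketch computes the standard deviation of log2(count + 1) for a low-count and a high-count gene. It assumes Poisson-distributed counts (sampling noise only, no biological dispersion) purely to keep the arithmetic simple, and it does not implement rlog itself; the function names are hypothetical.

```python
import math

def poisson_pmf(k, mu):
    """Poisson probability of observing k counts when the mean is mu."""
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))

def sd_log2_counts(mu, pseudocount=1.0):
    """Standard deviation of log2(X + pseudocount) for X ~ Poisson(mu),
    computed by summing over counts far into the upper tail."""
    upper = int(mu + 12 * math.sqrt(mu) + 30)
    ks = range(upper + 1)
    probs = [poisson_pmf(k, mu) for k in ks]
    vals = [math.log2(k + pseudocount) for k in ks]
    mean = sum(p * v for p, v in zip(probs, vals))
    var = sum(p * (v - mean) ** 2 for p, v in zip(probs, vals))
    return math.sqrt(var)

sd_low = sd_log2_counts(5)      # gene with a mean of 5 reads
sd_high = sd_log2_counts(1000)  # gene with a mean of 1000 reads
```

Under these assumptions the low-count gene is several times noisier on the log2 scale than the high-count gene, so low-count genes would contribute mostly noise to a heatmap or clustering. Transformations such as rlog are designed to damp that effect.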

written 4 months ago by Kevin Blighe23k

Thank you very much. I had misunderstood the rlog transformation: it is "not for DEG but for visualization". Thank you.

written 4 months ago by soojinima10