Question: How to use log transformation with read count data?
1
6.1 years ago by
M K510
United States
M K510 wrote:

Hi All,

I have a data set of ~20,000 observations, and when I plot a histogram of the data, I find it is skewed to the right. I tried a log transformation, but I got infinite values because many of the values are equal to zero. Is there any way to use the log transformation without removing these zero values? They are important in my analysis. Or is there another transformation I could apply to this data?

R • 13k views
modified 6.1 years ago by Nicolas Rosewick9.2k • written 6.1 years ago by M K510
2

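We add 1 to the RNA-seq counts for all transcripts before the log transformation. This avoids taking the log of zero (which gives -Inf): with a pseudocount of 1, zero counts map to 0 and no transformed value is negative.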

modified 6.1 years ago • written 6.1 years ago by Ashutosh Pandey12k

So this will not affect the results at all?

written 6.1 years ago by M K510
3
6.1 years ago by
Belgium, Brussels
Nicolas Rosewick9.2k wrote:

You should use the regularized log (rlog) or the variance stabilizing transformation (VST). Both transformations are implemented in the DESeq2 package. Check the vignette, it's very well written: http://www.bioconductor.org/packages/release/bioc/html/DESeq2.html

rld <- rlog(dds)
vsd <- varianceStabilizingTransformation(dds)

where dds is a DESeqDataSet object. You can build it from a count matrix or from htseq output (again, check the vignette).
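For readers who want to see the whole path from a count matrix to the transformed values, here is a minimal sketch. It assumes DESeq2 is installed and uses simulated counts from `makeExampleDESeqDataSet()` as a stand-in for real data; the object names `cts` and `coldata` are just illustrative.

```r
library(DESeq2)

# Simulated stand-in for real data: 1000 genes x 6 samples, two conditions
set.seed(1)
sim <- makeExampleDESeqDataSet(n = 1000, m = 6)
cts <- counts(sim)        # integer count matrix (genes in rows, samples in columns)
coldata <- colData(sim)   # sample table with a 'condition' column

# Build the DESeqDataSet from the count matrix
dds <- DESeqDataSetFromMatrix(countData = cts,
                              colData = coldata,
                              design = ~ condition)

# Both transformations handle zero counts without producing infinite values
rld <- rlog(dds)
vsd <- varianceStabilizingTransformation(dds)

# The transformed values are stored in the assay slot
head(assay(rld))
head(assay(vsd))
```

With real data, `cts` would be your own integer count matrix and `coldata` a data frame with one row per sample, in the same order as the matrix columns.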

written 6.1 years ago by Nicolas Rosewick9.2k

When I try rlog or varianceStabilizingTransformation I get the following error:

Error in DESeqDataSet(se, design = design, ignoreRank) : 
  some values in assay are not integers

My input data is RNA-seq count data (a matrix) where some values are 0 and the rest are positive integers. Can you help me understand why I am getting this error? Thanks.
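A quick way to check whether the matrix really contains only whole numbers is sketched below. The matrix `mat` is a made-up example with one fractional entry, and rounding is only an assumption about what might be acceptable; whether it is appropriate depends on where the fractional values came from (for example estimated or normalized counts).

```r
# Made-up count matrix; one entry (geneC, s1) is fractional
mat <- matrix(c(0, 12, 3.7, 7, 0, 41), nrow = 3,
              dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2")))

# Row/column positions of entries that are not whole numbers
non_int <- which(mat != round(mat), arr.ind = TRUE)
print(non_int)

# If rounding is acceptable for the data at hand, round and store as integers
mat_int <- round(mat)
storage.mode(mat_int) <- "integer"
print(mat_int)
```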

modified 4.8 years ago • written 4.8 years ago by Bioinformatist Newbie250

Could you please post the commands you used?

written 4.8 years ago by Nicolas Rosewick9.2k
1
6.1 years ago by
_r_am31k
Baylor College of Medicine, Houston, TX
_r_am31k wrote:

I'd suggest using a pseudocount. Maybe a value of 0.0001 added to the actual values would make very little difference in the log transformation.

written 6.1 years ago by _r_am31k
3

I think 1 is a better (and more common) number to add. The natural log of 0.0001 is about -9.2, so that is probably not what you want for your zero counts.
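To make the difference concrete, here is a small sketch on made-up counts (the vector `counts` is purely illustrative). It compares the two pseudocounts using the natural log:

```r
# Made-up read counts, including zeros
counts <- c(0, 0, 1, 5, 20, 100, 1000)

# Pseudocount of 0.0001: zeros end up far below every non-zero count
log_small <- log(counts + 0.0001)

# Pseudocount of 1: zeros map exactly to 0
log_one <- log(counts + 1)

print(round(log_small, 2))
print(round(log_one, 2))

# log1p(x) computes log(1 + x) directly and gives the same result
print(all.equal(log_one, log1p(counts)))
```

With the tiny pseudocount, the zeros sit in their own cluster around -9.2, which distorts histograms and distance-based methods; with a pseudocount of 1 they stay at 0, next to the low counts.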

written 6.1 years ago by brentp23k
1

I should've thought of that. I agree, 1 is much better!

written 6.1 years ago by _r_am31k

Do you mean adding a small value before the log transformation, as in the example below?

data<- read.table(.........)

tran_data<- log(data+0.0001)

written 6.1 years ago by M K510

tran_data<- log2(data+1)

because log2(1) is zero, so zero counts stay at zero after the transformation.

written 6.1 years ago by Chirag Nepal2.3k