How to use log transformation with read count data?
9.5 years ago
M K ▴ 660

Hi All,

I have a data set of ~20000 observations, and when I plot a histogram of it, the distribution is skewed to the right. I tried a log transformation, but I got infinite values because many of the values are zero. Is there any way to use the log transformation without removing these zeros? They are important for my analysis. Or is there another way to transform this data?

R • 19k views
ADD COMMENT
2

We add 1 to the RNA-seq counts for all transcripts before the log transformation. This avoids taking the log of zero and keeps the transformed values non-negative.

ADD REPLY
0

So this will not affect the results at all?

ADD REPLY
3
9.5 years ago

You should use the regularized log (rlog) or the variance-stabilizing transformation (VST). Both transformations are implemented in the DESeq2 package. Check the vignette, it's very well made: http://www.bioconductor.org/packages/release/bioc/html/DESeq2.html

library(DESeq2)
rld <- rlog(dds)                               # regularized log transformation
vsd <- varianceStabilizingTransformation(dds)  # variance-stabilizing transformation

where dds is a DESeqDataSet object. You can build it from a count matrix or from htseq-count output (again, check the vignette).
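For the count-matrix route, a minimal sketch might look like the following. The toy data, the `condition` column and the object names are all made up for illustration; swap in your own integer count matrix and sample table. It assumes DESeq2 is installed from Bioconductor.

```r
library(DESeq2)

set.seed(1)
n_genes <- 1000
n_samples <- 6

# Hypothetical toy data: per-gene means spread over several orders of
# magnitude, with negative-binomial noise. Replace with your own counts.
gene_means <- 2^runif(n_genes, min = 2, max = 10)
counts <- matrix(
  rnbinom(n_genes * n_samples, mu = gene_means, size = 2),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)),
                  paste0("sample", seq_len(n_samples)))
)

# Sample table: row names must match the column names of the count matrix.
coldata <- data.frame(
  condition = factor(rep(c("A", "B"), each = 3)),
  row.names = colnames(counts)
)

dds <- DESeqDataSetFromMatrix(countData = counts,
                              colData = coldata,
                              design = ~ condition)

# blind = TRUE ignores the design when estimating dispersions.
vsd <- varianceStabilizingTransformation(dds, blind = TRUE)
vsd_mat <- assay(vsd)  # genes x samples matrix of transformed values
head(vsd_mat)
```

Zero counts need no special handling here: the transformed matrix has finite values for them. `rlog(dds)` can be called on the same object in the same way.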

ADD COMMENT
0

When I try rlog or varianceStabilizingTransformation I get the following error:

Error in DESeqDataSet(se, design = design, ignoreRank) : 
  some values in assay are not integers

My input data is RNA-seq count data (a matrix) where some values are 0 and the rest are positive integers. Can you tell me why I am getting this error? Thanks.

ADD REPLY
0

Could you post the different commands you used, please?
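In the meantime, a quick base-R check along these lines might show whether the matrix really contains only whole numbers. The small matrix below is a made-up stand-in for your data. The rounding step is an assumption that only makes sense for estimated counts (for example from a quantification tool that reports fractional expected counts); it is not appropriate for normalized values such as FPKM or TPM.

```r
# Hypothetical stand-in for the real count matrix; one entry is fractional.
counts <- matrix(c(0, 5, 12.4, 3, 0, 8), nrow = 3,
                 dimnames = list(c("g1", "g2", "g3"), c("s1", "s2")))

# TRUE where a value is a whole number.
is_whole <- counts == round(counts)

# Row/column positions of any non-integer entries.
which(!is_whole, arr.ind = TRUE)

# Assumption: the values are estimated counts, so rounding is acceptable.
counts_int <- round(counts)
storage.mode(counts_int) <- "integer"
```

If `which()` reports any positions, those entries are what the DESeqDataSet constructor is complaining about.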

ADD REPLY
1
9.5 years ago
Ram 43k

I'd suggest using a pseudocount. Adding a small value such as 0.0001 to the actual values should make very little difference to the log transformation.

ADD COMMENT
4

I think 1 is a better (and more common) number to add. The natural log of 0.0001 is about -9.2, so that is probably not what you want for your zero counts.
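To see the difference between the two pseudocounts, here is a small sketch on a made-up vector of counts (the values are only for illustration):

```r
# Hypothetical counts, including a zero.
counts <- c(0, 1, 10, 100, 1000)

# A tiny pseudocount pushes zeros far below every other value.
log_small <- log(counts + 0.0001)

# A pseudocount of 1 maps zero counts to exactly 0.
log2_one <- log2(counts + 1)

# log1p(x) computes log(1 + x) (natural log) in base R.
log1p_vals <- log1p(counts)

print(data.frame(counts, log_small, log2_one, log1p_vals))
```

With 0.0001 the zero count ends up more than 9 natural-log units below a count of 1, which stretches the low end of the histogram. With a pseudocount of 1 the zeros sit at 0, next to the other small counts.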

ADD REPLY
1

I should've thought of that. I agree, 1 is much better!

ADD REPLY
0

Do you mean adding a small value inside the log transformation, like the example below?

data <- read.table(.........)
tran_data <- log(data + 0.0001)
ADD REPLY
0
tran_data <- log2(data + 1)

because log2(1) is zero, so zero counts stay at zero after the transformation.

ADD REPLY
