Question: DESeq size factors help
8.5 years ago by
Sara130 wrote:


I am trying to run a differential expression analysis with DESeq on two samples from two different conditions, without replicates.

I ran the DESeq script, and when I supply the library sizes of my samples, the size factor is 1 for both conditions:

libsizes <- c(Cond1 = 80040653, Cond2 = 360265740)

    Cond1 Cond2 
       1    1

I don't understand this estimate.

When I don't give the libsizes, the size factors become:

     Cond1 Cond2 
1.7320508 0.5773503

Could you please explain this to me?

Thanks in advance, Sara

deseq • 13k views
modified 8.5 years ago by Damian Kao15k • written 8.5 years ago by Sara130

Probably you assigned sizes first and then did the estimate, which may just scale your assigned sizes back to 1. I think the right way is to estimate the factor directly from the 'cds' object without assigning any libsizes.

written 8.5 years ago by Vitis2.3k

Hi Jeremy, thanks for your comment. Could you please tell me what the sizeFactors function does? Is it for normalizing the data (scale normalization)?

written 8.5 years ago by Sara130
8.4 years ago by
Damian Kao15k wrote:

The author of DESeq wrote a post on SeqAnswers a while back about how size factors work in DESeq. I'll try to find the post, but basically it normalizes the data sets as follows:

-For each gene, take the geometric mean of its counts across all conditions (samples); these per-gene means form the reference expression data set.

-For each condition, compute the quotient of each gene's expression value to that gene's reference value.

-The median of each condition's list of quotients is the normalization (size) factor for that data set.
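
The three steps above can be sketched in a few lines of plain Python. This is only an illustration of the median-of-ratios idea, not DESeq's own code: the function name `size_factors`, the toy counts, and the choice to skip genes with a zero count in any sample are all assumptions made for this example. The toy counts are invented so that most genes are three times higher in Cond1 than in Cond2; they are not the data from the question.

```python
import math
from statistics import median

def size_factors(counts):
    """Median-of-ratios size factors.

    counts: dict mapping sample name -> list of per-gene counts,
            with genes in the same order for every sample.
    (Hypothetical helper written for this sketch.)
    """
    samples = list(counts)
    n_genes = len(counts[samples[0]])

    # Step 1: per-gene reference = geometric mean across samples,
    # computed as the mean of the logs. Genes with a zero count in
    # any sample are skipped (an assumption of this sketch).
    log_ref = []
    for g in range(n_genes):
        values = [counts[s][g] for s in samples]
        if all(v > 0 for v in values):
            log_ref.append((g, sum(math.log(v) for v in values) / len(values)))

    factors = {}
    for s in samples:
        # Step 2: quotient of each gene's count to its reference (on the log scale).
        log_ratios = [math.log(counts[s][g]) - ref for g, ref in log_ref]
        # Step 3: the median quotient is the size factor for this sample.
        factors[s] = math.exp(median(log_ratios))
    return factors

# Invented toy counts for five genes in two samples.
toy = {
    "Cond1": [30, 300, 60, 9, 1500],
    "Cond2": [10, 100, 25, 3, 400],
}
factors = size_factors(toy)
print(factors)  # roughly Cond1 = 1.732, Cond2 = 0.577
```

With only two samples, the reference for each gene is sqrt(a * b), so the quotients are sqrt(a / b) for one sample and sqrt(b / a) for the other, and the two size factors come out as reciprocals of each other. That matches the pattern in the question: 1.7320508 is sqrt(3) and 0.5773503 is 1 / sqrt(3), which would be consistent with a typical gene count being about three times higher in Cond1 than in Cond2.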


Here is the post:

modified 8.4 years ago • written 8.4 years ago by Damian Kao15k
8.5 years ago by
Philadelphia, PA
Jeremy Leipzig19k wrote:

I don't see where in the documentation it says libsizes is a magic word.

Either call estimateSizeFactors to estimate the size factors from your count data, or set them manually using sizeFactors.

Now, why are there so many Sara's?


written 8.5 years ago by Jeremy Leipzig19k

Because Sara is a very popular name amongst bioinformaticians LOL

written 8.4 years ago by Pasta1.3k

could be some OpenID flaking - good catch - merged the Sara-s - There Can Be Only One

written 8.5 years ago by Istvan Albert ♦♦ 83k