Question: DESeq2 size factors change when the number of samples in the design changes
12 weeks ago by
ZheFrench150
France
ZheFrench150 wrote:

For example, when you analyse the Untreated condition vs the Day1Treatment condition with only these samples in your design, you get one set of size factors.

If you have also set up a Day6Treatment condition in your design (but don't use it yet) and want to do the same Untreated vs Day1Treatment comparison, the size factors change, because Day6 is now taken into account.

I thought a size factor depended only on the library size of its own sample, so why do they change depending on which samples are in the design file, even if some of those samples are not used in the comparison?

I was analysing Day1Treatment vs Untreated and Day6Treatment vs Untreated with two separate design files. But now I am wondering whether it's better to have one design with all the samples and run both comparisons with the same size factors, because in the end the two approaches detect different differentially expressed genes.

deseq2 • 211 views
modified 4 weeks ago by erwan.scaon590 • written 12 weeks ago by ZheFrench150
12 weeks ago by
grant.hovhannisyan1.3k wrote:

Size factors are calculated based on all the samples in your dds object. So if you have one large experiment, it's better to put everything together and then perform the comparisons, for example using contrasts, rather than making new dds objects from subsets of your original dataset.

written 12 weeks ago by grant.hovhannisyan1.3k

Agreed. The idea is to estimate, for each column (sample), a size factor that best scales the data, based on the large set of genes that do not change between conditions. Provided that you do not have samples with extreme global changes, it is probably best to have as many samples in the matrix as possible. This probably produces more robust size factors than using only two or three samples.

modified 12 weeks ago • written 12 weeks ago by ATpoint11k

OK, I got it, but do you know how they are computed?

written 12 weeks ago by ZheFrench150

It is described in the original DESeq paper https://genomebiology.biomedcentral.com/articles/10.1186/gb-2010-11-10-r106, and I think the same method is used in DESeq2.

written 12 weeks ago by grant.hovhannisyan1.3k
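
For readers who do not want to open the paper: the estimator it describes is a median of ratios. The block below is a transcription of that formula, so check it against the paper before relying on it. Here k_{ij} is the raw count of gene i in sample j, and m is the number of samples in the count matrix.

```latex
\hat{s}_j \;=\; \operatorname*{median}_{i}\;
  \frac{k_{ij}}{\left(\prod_{v=1}^{m} k_{iv}\right)^{1/m}}
```

The denominator is the geometric mean of gene i over all m samples, which is why adding or removing samples changes every size factor, not only those of the added samples.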

Yes, the concept is actually pretty simple but powerful. Check out StatQuest for a nice explanation.

modified 4 weeks ago • written 12 weeks ago by ATpoint11k
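
To make the behaviour from the original question concrete, here is a minimal pure-Python sketch of the median-of-ratios idea. It is an illustration under stated assumptions, not DESeq2's actual code: the function name `size_factors` and the toy counts are made up, and genes with a zero count in any sample are simply skipped.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors for a dict: sample name -> raw counts.

    All count lists must use the same gene order. Genes with a zero count
    in any sample are skipped, since their geometric mean would be zero.
    """
    samples = list(counts)
    n_genes = len(counts[samples[0]])

    # Per-gene log geometric mean across ALL samples in the matrix.
    log_geo_means = []
    for g in range(n_genes):
        values = [counts[s][g] for s in samples]
        if all(v > 0 for v in values):
            log_geo_means.append(sum(math.log(v) for v in values) / len(values))
        else:
            log_geo_means.append(None)

    # Size factor = median over genes of (count / geometric mean).
    factors = {}
    for s in samples:
        log_ratios = [
            math.log(counts[s][g]) - lgm
            for g, lgm in enumerate(log_geo_means)
            if lgm is not None
        ]
        factors[s] = math.exp(median(log_ratios))
    return factors


# Hypothetical toy counts (5 genes); day1 is exactly 2x untreated.
untreated = [100, 200, 50, 400, 80]
day1 = [200, 400, 100, 800, 160]
day6 = [50, 800, 25, 200, 640]

two = size_factors({"untreated": untreated, "day1": day1})
three = size_factors({"untreated": untreated, "day1": day1, "day6": day6})

print(two)    # size factors with only Untreated and Day1
print(three)  # size factors once Day6 is in the matrix
```

In this toy data the absolute size factors of Untreated and Day1 change once Day6 is added, because the per-gene geometric mean now includes Day6, while the ratio between the Day1 and Untreated factors stays at 2. With real data the relative scaling can shift as well, which is consistent with the two setups reporting different gene lists.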
4 weeks ago by
erwan.scaon590
Limoges - CBRS - France
erwan.scaon590 wrote:

"So if you have 1 large experiment it's better to put everything together and then perform comparisons"

Is this still true if the large experiment was sequenced across multiple Illumina runs (i.e. compute size factors for all samples across all runs)? Or should we compute size factors separately for the samples in each Illumina run?

written 4 weeks ago by erwan.scaon590

You should put everything together and, ideally, also add the batch effect (here, the sequencing run) as a term in your design formula.

written 4 weeks ago by grant.hovhannisyan1.3k