Question: DESeq2 size factors change when the number of samples in the design changes
ZheFrench (France) wrote:

For example, when you analyse the Untreated condition vs the Day1Treatment condition with only these samples in your design, you get one set of size factors.

If you also include a Day6Treatment condition in your design (without using it yet) and run the same Untreated vs Day1Treatment comparison, the size factors change because the Day6 samples are taken into account.

I thought a size factor depended only on the library size of its own sample, so why do they change with the design file even when a set of samples is not used in the comparison?

I was analysing Day1Treatment vs Untreated and Day6Treatment vs Untreated with two separate design files. Now I am wondering whether it is better to have one design with all the samples and run both comparisons with the same size factors, because in the end the two approaches detect different differentially expressed genes.

deseq2 • 268 views
modified 3 months ago by erwan.scaon • written 4 months ago by ZheFrench

grant.hovhannisyan wrote:

Size factors are calculated from all the samples in your dds object. So if you have one large experiment, it's better to put everything together and then perform the comparisons, for example using contrasts, rather than making new dds objects from subsets of your original dataset.

written 4 months ago by grant.hovhannisyan

Agreed. The idea is to estimate, for each column, a size factor that best scales the datasets, based on a large set of genes that do not change between conditions. Provided that you do not have samples with extreme global changes, it is probably best to have as many samples in the matrix as possible. This probably produces more robust size factors than using only two or three samples.

modified 4 months ago • written 4 months ago by ATpoint

OK, I get it, but do you know how they are computed?

written 4 months ago by ZheFrench

It is described in the original DESeq paper (https://genomebiology.biomedcentral.com/articles/10.1186/gb-2010-11-10-r106), and I think the same method is used in DESeq2.

written 4 months ago by grant.hovhannisyan

Yes, the concept is actually pretty simple but powerful. Check out StatQuest for a nice explanation.
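
For readers who want to see the calculation, here is a minimal sketch of the median-of-ratios idea from the DESeq paper, written in plain Python rather than R. It is not DESeq2's own code: the function name `size_factors` and the toy counts are made up for illustration. Each gene's reference value is its geometric mean across all samples in the matrix, and a sample's size factor is the median, over genes, of its count divided by that reference. Adding samples changes the reference, so every size factor changes with it.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors (illustrative sketch, not DESeq2 itself).

    counts: dict mapping sample name -> list of raw counts, same gene order.
    """
    samples = list(counts)
    n_genes = len(counts[samples[0]])

    # Per-gene log geometric mean across ALL samples in the matrix.
    # Genes with a zero count in any sample are left out of the reference.
    log_geo_mean = []
    for g in range(n_genes):
        values = [counts[s][g] for s in samples]
        if all(v > 0 for v in values):
            log_geo_mean.append(sum(math.log(v) for v in values) / len(values))
        else:
            log_geo_mean.append(None)

    # Size factor = exp(median of log(count / reference)) over usable genes.
    factors = {}
    for s in samples:
        log_ratios = [
            math.log(counts[s][g]) - log_geo_mean[g]
            for g in range(n_genes)
            if log_geo_mean[g] is not None
        ]
        factors[s] = math.exp(median(log_ratios))
    return factors


# Hypothetical toy counts: 4 genes, one sample per condition.
untreated = [100, 200, 50, 400]
day1 = [200, 400, 100, 800]
day6 = [50, 100, 25, 200]

two = size_factors({"untreated": untreated, "day1": day1})
three = size_factors({"untreated": untreated, "day1": day1, "day6": day6})

print({k: round(v, 3) for k, v in two.items()})    # untreated ~0.707, day1 ~1.414
print({k: round(v, 3) for k, v in three.items()})  # untreated ~1.0, day1 ~2.0, day6 ~0.5
```

In this toy data every sample is an exact multiple of the others, so the ratio between the untreated and day1 size factors stays the same and only their absolute values move. With real counts the reference shifts gene by gene and the set of genes without zeros changes, so the relative size factors can shift slightly as well.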

modified 3 months ago • written 4 months ago by ATpoint

erwan.scaon (Limoges - CBRS - France) wrote:

"So if you have 1 large experiment it's better to put everything together and then perform comparisons"

Is this still true if the large experiment was sequenced across multiple Illumina runs (i.e. compute size factors for all samples across all runs)? Or should we compute size factors separately for the samples of each Illumina run?

written 3 months ago by erwan.scaon

You should put everything together and, ideally, also add the batch (here, the sequencing run) as a term in your design formula, for example ~ batch + condition.

written 3 months ago by grant.hovhannisyan