Question

Help interpreting DESeq2 output

0

3.4 years ago

nanoide ▴ 120

Hi,

I'm currently struggling with some DESeq2 analyses of RNA-seq data. I'm told this should be pretty 'easy' if I just follow the vignette, but I'm running into some difficulties.

Everything seems to run without errors. I'm starting from these raw counts for a gene:

A_1: 900
A_2: 134
B_1: 14825
B_2: 8312

These are the normalized counts:

A_1: 784.8627
A_2: 203.5322
B_1: 19424.8883
B_2: 4966.5403

The sample table includes condition (A,A,B,B) and replicate (1,1,2,2). The design has been:

design = ~ condition + replicate

And the results have been obtained with:

results(dds,contrast=c("condition","A","B"))

How is it possible that, for the gene with the counts above, I'm getting a positive log2FoldChange for A vs B, when according to the counts it should be more highly expressed in condition B?

log2 fold change (MLE): condition A vs B 
Wald test p-value: condition A vs B 
baseMean:  6344.96
log2FoldChange: 1.95771  
lfcSE: 0.373301
stat: 5.2443
pvalue: 1.56873e-07
padj: 2.76881e-05
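As a rough sanity check on the numbers above, the sketch below recomputes a naive log2 fold change directly from the posted normalized counts, using base R only. This is not how DESeq2 estimates the fold change (it fits a negative binomial GLM), and the variable names are just illustrative. It relies on the fact that `contrast=c("condition","A","B")` puts A in the numerator and B in the denominator.

```r
# Normalized counts copied from the question
norm_counts <- c(A_1 = 784.8627, A_2 = 203.5322,
                 B_1 = 19424.8883, B_2 = 4966.5403)
condition <- factor(c("A", "A", "B", "B"))

# Mean normalized count per condition
group_means <- tapply(norm_counts, condition, mean)

# Naive log2 fold change, A (numerator) over B (denominator)
naive_lfc <- log2(unname(group_means["A"]) / unname(group_means["B"]))

# Mean over all four samples, comparable to the reported baseMean
base_mean <- mean(norm_counts)

print(naive_lfc)
print(base_mean)
```

The naive estimate comes out strongly negative (roughly -4.6), and the overall mean matches the reported baseMean of 6344.96. So the counts themselves are consistent with the output, and the positive sign has to come from the model, meaning the sample table or design, rather than from normalization.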

I must be missing something. Can anyone point me in the right direction, or to some documentation apart from the vignette? I'm aware the replicates show a lot of variability (they're far apart in the PCA). Could that be causing this?

Thank you very much for the help

Deseq2 Fold-Change RNA-seq • 1.3k views

ADD COMMENT • link updated 7 months ago by Ram 43k • written 3.4 years ago by nanoide ▴ 120

score 2 · Answer 1 · 2020-11-23

2

3.4 years ago

i.sudbery 19k

It looks to me like your coding of replicate is wrong. Shouldn't replicate be (1,2,1,2)? It's also not a good idea to use numbers for experimental factors in a model like this, as DESeq2 might interpret the "replicate" variable as being twice as high in replicate 2 as in replicate 1. With your design above, the model would account for this and might, effectively, be dividing the B samples by two.

So, what happens if you code replicate as c("R1", "R2", "R1", "R2")?
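To see why the coding matters, here is a minimal base R sketch comparing the two replicate codings under `~ condition + replicate`. It only builds the model matrices and does not call DESeq2. The sample table is reconstructed from the description in the question, so the OP's real colData may differ, and the variable names are illustrative.

```r
condition <- factor(c("A", "A", "B", "B"))

# Replicate as described in the question: identical to condition
rep_as_posted <- factor(c("R1", "R1", "R2", "R2"))
# Replicate as suggested in this answer: crossed with condition
rep_suggested <- factor(c("R1", "R2", "R1", "R2"))

mm_posted    <- model.matrix(~ condition + rep_as_posted)
mm_suggested <- model.matrix(~ condition + rep_suggested)

# Both matrices have 3 columns: intercept, condition B, replicate R2
rank_posted    <- qr(mm_posted)$rank     # 2: replicate column duplicates condition
rank_suggested <- qr(mm_suggested)$rank  # 3: full rank

print(mm_posted)
print(c(posted = rank_posted, suggested = rank_suggested))
```

With the (1,1,2,2) coding the replicate column is an exact copy of the condition column, so the two effects cannot be separated. With (1,2,1,2), each replicate level appears once in each condition and the condition effect is estimable.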

ADD COMMENT • link 3.4 years ago by i.sudbery 19k

0

You are right that the coding for replicates was wrong. Cannot believe it. Feels like I was blind. I guess it happens. Thanks for your time!

ADD REPLY • link 3.4 years ago by nanoide ▴ 120

2

This fully resolves the issue?

ADD REPLY • link 3.4 years ago by Kevin Blighe 87k

0

Yes, along with other errors in the sample table. I just needed to pay more attention to it. Thanks for your time, and apologies for the inconvenience.

ADD REPLY • link 3.4 years ago by nanoide ▴ 120

1

No inconvenience at all, nanoide.

ADD REPLY • link 3.4 years ago by Kevin Blighe 87k

score 2 · Answer 2 · 2020-11-23

2

3.4 years ago

Barry Digby ★ 1.3k

If you want to control for the effect of Replicates, shouldn't the design be ~ replicate + condition?

I am basing this on the following workflow from Mike Love et al.

The simplest design formula for differential expression would be ~ condition, where condition is a column in colData(dds) that specifies which of two (or more groups) the samples belong to. For the airway experiment, we will specify ~ cell + dex meaning that we want to test for the effect of dexamethasone (dex) controlling for the effect of different cell line (cell).

ADD COMMENT • link 3.4 years ago by Barry Digby ★ 1.3k

1

I believe you are also right, thank you very much

ADD REPLY • link 3.4 years ago by nanoide ▴ 120

1

If the contrast is specified in the results call, as it is in the OP's code, the order of the design elements does not matter.
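A small illustration of that point, using an ordinary linear model on the log2 normalized counts from the question as a stand-in. This is only an analogy under stated assumptions: DESeq2 fits a negative binomial GLM rather than `lm`, and the replicate coding here is the corrected (R1, R2, R1, R2) one suggested above.

```r
samples <- data.frame(
  condition  = factor(c("A", "A", "B", "B")),
  replicate  = factor(c("R1", "R2", "R1", "R2")),
  log2_count = log2(c(784.8627, 203.5322, 19424.8883, 4966.5403))
)

# Same terms, different order in the formula
fit_cr <- lm(log2_count ~ condition + replicate, data = samples)
fit_rc <- lm(log2_count ~ replicate + condition, data = samples)

# Condition effect (B relative to A) from each fit
b_vs_a_cr <- coef(fit_cr)[["conditionB"]]
b_vs_a_rc <- coef(fit_rc)[["conditionB"]]

print(c(condition_first = b_vs_a_cr, replicate_first = b_vs_a_rc))
```

Both fits give the same condition coefficient; only the order of the columns in the model matrix changes. The order of terms matters mainly when `results()` is called without a `contrast` or `name`, since it then reports the last variable in the design.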

ADD REPLY • link 3.4 years ago by swbarnes2 14k

score 1 · Answer 3 · 2020-11-23

1

3.4 years ago

swbarnes2 14k

Do you really mean that the 1 samples and 2 samples are different batches? If the answer is no, drop replicate from your colData and design. Your design is just ~ condition.

ADD COMMENT • link 3.4 years ago by swbarnes2 14k