How does DESeq2 encode factors with more than 2 levels?
6.5 years ago
-_- ★ 1.1k

When a factor has two levels, it can be encoded as 0 and 1. What about a factor with 3 levels, then? Does DESeq2 use one-hot encoding or something similar when it fits a generalized linear model over the factors? I haven't found this information in the paper or the user guide yet.

If you could also point me to the relevant source code, that would be even better. Thanks.

RNA-Seq DESeq2 differential expression • 1.4k views
6.5 years ago
-_- ★ 1.1k

DESeq2 builds its design matrix with model.matrix, so you can plug your design formula and colData into this base R function to see how the factors will be encoded.

Quoted from

> model.matrix(~participant+sampleType, coldata)
             (Intercept) participantX8326 participantX8329 sampleTypetumor
X8324_normal           1                0                0               0
X8324_tumour           1                0                0               1
X8326_normal           1                1                0               0
X8326_tumour           1                1                0               1
X8329_normal           1                0                1               0
X8329_tumour           1                0                1               1

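So it's not really one-hot encoding, but something like it: treatment (dummy) coding, which is R's default for unordered factors. A factor with k levels gets k-1 indicator columns, and the reference level (here participant X8324, the first level) is represented by [0, 0] and absorbed into the intercept.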
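The encoding in the quoted output can be reproduced without DESeq2. The following is a minimal sketch in base R, assuming a colData-like table rebuilt by hand from the output above (the `coldata` values and the row names are my reconstruction, not the original data). It also shows how changing the reference level with `relevel` changes which level is encoded as all zeros.

```r
# Rebuild a small colData-like table (assumed values, reconstructed from the quoted output).
coldata <- data.frame(
  participant = factor(rep(c("X8324", "X8326", "X8329"), each = 2)),
  sampleType  = factor(rep(c("normal", "tumor"), times = 3))
)
rownames(coldata) <- paste(coldata$participant, coldata$sampleType, sep = "_")

# Default contrasts for unordered factors are treatment contrasts:
# 3 levels -> 2 indicator columns, first level is the all-zero reference.
print(contr.treatment(3))

# Design matrix for the additive model; the reference levels
# (participant X8324, sampleType normal) are absorbed into the intercept.
mm <- model.matrix(~ participant + sampleType, coldata)
print(mm)

# Choosing a different reference level changes which level maps to all zeros.
coldata$participant <- relevel(coldata$participant, ref = "X8329")
mm2 <- model.matrix(~ participant + sampleType, coldata)
print(colnames(mm2))
```

By default the reference is the first factor level (alphabetical order unless the levels are set explicitly), so it is worth checking `levels()` on each design factor before interpreting the coefficients.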

