I am performing a differential expression analysis with DESeq2, but I first have to take batch effects into account.
I have an info file with several technical confounders and other information for the samples (twins), like family and zygosity.
As you suggested, I am using the svaseq function for correcting the batch effects, according to:
This is my workflow:
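library(DESeq2)
library(sva)
dds <- DESeqDataSetFromMatrix(countData = counts,
                              colData = info,
                              design = ~ condition)
# size factors must be estimated before asking for normalized counts
dds <- estimateSizeFactors(dds)
dat <- counts(dds, normalized = TRUE)
mod <- model.matrix(~ condition, info)   # full model: variable of interest
mod0 <- model.matrix(~ 1, info)          # null model: intercept only
svseq <- svaseq(dat, mod, mod0, n.sv = 2)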
I understand that the surrogate variables estimated this way will let me account for the batch effects, but how can I also take the family relatedness into account? Can I add this information to the model passed to svaseq (mod?), or should I do it in the later steps of the differential expression analysis, by including in the design not only condition and the surrogate variables (do there have to be 2?), but also family or zygosity? For example:
# after adding SV1 and SV2 (the columns of svseq$sv) to colData(dds)
design(dds) <- ~ SV1 + SV2 + fam + condition
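To make the question more concrete, here is a minimal base-R sketch of the design I have in mind. The sample table `info_toy`, its values, and the names `condition_nested` and `is_full_rank` are made up for illustration and are not my real data. As far as I understand, DESeq2 needs a full-rank design matrix, so the sketch only checks the rank in two hypothetical situations: twins of a pair in different conditions, and both twins of a pair in the same condition.

```r
# Hypothetical toy sample table: 4 twin pairs (families), 8 samples.
# SV1/SV2 stand in for the surrogate variables returned by svaseq().
info_toy <- data.frame(
  fam = factor(rep(c("f1", "f2", "f3", "f4"), each = 2)),
  # case 1: the two twins of each pair are in different conditions
  condition = factor(rep(c("ctrl", "case"), times = 4),
                     levels = c("ctrl", "case")),
  # case 2: both twins of a pair share the same condition (nested in family)
  condition_nested = factor(rep(c("ctrl", "case"), each = 4),
                            levels = c("ctrl", "case")),
  SV1 = c(0.3, -0.1, 0.5, -0.4, 0.2, -0.6, 0.1, 0.0),
  SV2 = c(-0.2, 0.4, 0.1, -0.3, 0.6, -0.5, 0.0, -0.1)
)

# Helper (hypothetical name): TRUE if the design matrix has full column rank.
is_full_rank <- function(formula, data) {
  mm <- model.matrix(formula, data = data)
  qr(mm)$rank == ncol(mm)
}

# Condition varies within family: family and condition can both be estimated.
crossed_ok <- is_full_rank(~ SV1 + SV2 + fam + condition, info_toy)

# Condition constant within family: the condition column is a sum of family
# columns, so the design matrix is rank deficient.
nested_ok <- is_full_rank(~ SV1 + SV2 + fam + condition_nested, info_toy)

print(c(crossed = crossed_ok, nested = nested_ok))
```

If this is right, adding fam to the design would only be possible when condition differs between the twins of at least some pairs, which is part of what I would like to have confirmed.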