I am performing a differential expression analysis with DESeq2, but I first need to account for batch effects. I have an info file with several technical confounders and other information about the samples (twins), such as family and zygosity. As you suggested, I am using the svaseq function to correct for batch effects, following: http://www.bioconductor.org/help/workflows/rnaseqGene/#batch
This is my workflow:
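library(DESeq2)
library(sva)

dds <- DESeqDataSetFromMatrix(countData = counts, colData = info, design = ~ condition)
dds <- DESeq(dds)
# normalized counts are the input for svaseq
dat <- counts(dds, normalized = TRUE)
# full model (with condition) and null model (intercept only)
mod <- model.matrix(~ condition, info)
mod0 <- model.matrix(~ 1, info)
# estimate 2 surrogate variables
svseq <- svaseq(dat, mod, mod0, n.sv = 2)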
I understand that this will clean the dataset, but how can I also take family relatedness into account? Can I add this information to the model (mod?), or should I do it in the later steps of the differential expression analysis, by including in the design not only condition and the surrogate variables (do there have to be 2?), but also family or zygosity? For example:
design <- ~ SV1 + SV2 + fam + condition