Hi,
I'm trying to remove the effect of a confounder (sex) and surrogate variables on my dataset which is from an RNAseq experiment. I want to correctly predict disease_status by using my count data. I'm trying to use formulas from sva and ruvseq to remove the confounding effect of my sex covariate and surrogate variables. However, both packages give me the following error: Error in solve.default(xtwx + ridge) : system is computationally singular: reciprocal condition number = 5.86307e-17
.
I've read that one can receive this error when two covariates are highly colinear or because the number of cases is approximately as large as the number of covariates. Both are not true for my dataset so I don't know why I'm receiving this error. Does anyone have an idea? My code lines are as follows for the sva method:
dds<-DESeqDataSetFromMatrix(df,colData =coldata, design = ~sex+disease_status)
dds <- estimateSizeFactors(dds)
dds<-DESeq(dds)
res<-results(dds)
rld <- vst(dds, blind=FALSE)
dat <- counts(dds, normalized = TRUE)
idx <- rowMeans(dat) > 1
dat <- dat[idx, ]
mod <- model.matrix(~sex+disease_status, colData(dds))
mod0 <- model.matrix(~sex, colData(ddsFiltered))
svseq <- svaseq(dat, mod, mod0)
This results in:
Error in solve.default(xtwx + ridge) : system is computationally singular: reciprocal condition number = 5.86307e-17
Does anyone know how to solve this error?
Why the hassle with SVA, and not just removing that effect from
rld
with something likelimma::removeBatchEffect
? SVA in my view is more to estimate and then remove unknown sources of variation, but your sex covariate is already known, so you can regress it directly.I want to remove both known eg sex and unknown sources of variation. But to your point, if I wanted to just remove sex as covariate, would the following code lines make sense?:
I guess so.