DESeq2: Transform count data by VST and Covariates
0
0
Entering edit mode
5.1 years ago

Hello,

I'm working on a simple bulk RNA-seq project from brain tissue and would like to use WGCNA downstream of DEG analysis with DESeq2. I'm using the TxImport to DESeq2 approach.

The documentation nicely illustrates how to use VST or rlog to get normalized data for downstream analyses, and involving covariates in the DEG analysis is also nicely integrated. However, I cannot figure out how to get my normalized values to be weighted by the covariates matrices or even how to extract the covariate matrix to get this weighted table myself. I'm specifically concerned with integrating one continuous covariate (RIN) and two categorical covariate (Gender and Batch). I apologize if this has already been answered, but I'm having a hard time finding the solution. Please respond if anyone knows how to do this. Thanks!

RNA-Seq DESeq2 Covariates Normalization • 2.7k views
ADD COMMENT
0
Entering edit mode

Do you want to regress-out the effect of those? Check removeBatchEffect from limma or Combat-Seq from sva package.

ADD REPLY
0
Entering edit mode

I do want to regress out the effects. I've been having trouble getting ComBat to work, but I'm going to give it another shot with the VST data. Unfortunately, ComBat only seems to be able to deal with categorical covariates, so I was hoping there was a better way as the majority of my noise is coming from RIN (continuous). If it comes to it, I'll switch over to limma but I was hoping there was a nice solution through DESeq2.

ADD REPLY
0
Entering edit mode

DESeq2 does not help here since it never manipulates the normalized counts directly but includes them into its GLM. Combat-Seq is pretty simple for the end user, it operates on the raw counts, and these you can then put into vst, or simply limma on the vst directly. Both support categorical and continuous variables fwik. I personally use Combat-Seq these days simply because it does not create these annoying negative values for lowly expressed genes with expression close to zero.

ADD REPLY
0
Entering edit mode

Thanks for the replies. I went with Limma because it was suggested in the DESeq2 manual and because I've had issues with getting ComBat to work in the past. Plus, my version of R or RStudio (I can never tell which when things don't install correctly) doesn't seem to support Combat-Seq. At any rate, I think it worked, but if you (or anyone else in internet land) can check to see if it's done correctly, that'd be huge. (NOTE: for sake of space, I'm only showing 8 of 44 samples)

head(sample_table_affective_FA, 3)
  Gender Batch affective  apathy    hida psychosis RIN RIN_dichot
1   Male   1st      case control control   control 9.5         >8
2   Male   3rd   control control    case      null 9.7         >8
3 Female   1st      case    null    case      case 9.0         >8

#Variance Stabilization of data that was already made into DESeq object
vsd.aff.FA <- vst(dds_aff_FA, blind = FALSE)
exp.aff.FA <-assay(vsd.aff.FA)

head(exp.aff.FA, 3)
       sample1   sample2   sample3   sample4   sample5  sample6   sample7   sample8
A1BG  9.124096  7.787003  7.604343  8.996111  7.592220  7.22496  7.792047  7.719206
A1CF 10.284002 10.165805  9.713925  9.371964  9.786273 10.22923  9.474686  9.705198
A2M  13.500474 13.246708 11.401156 12.868097 11.991867 13.33890 12.461733 12.265468

#Batch and Covariate objects
gender.aff <- sample_table_affective_FA['Gender']
batch.aff <- sample_table_affective_FA['Batch']
RIN.aff <- sample_table_affective_FA['RIN']

#Putting batch/covariates in character or numeric forms
gender.aff <- as.character(t(as.vector(gender.aff)))
batch.aff <- as.character(t(as.vector(batch.aff)))
RIN.aff <- as.numeric(t(as.vector(RIN.aff)))

head(gender.aff)
[1] "Male"   "Male"   "Female" "Male"   "Female" "Female"
head(batch.aff)
[1] "1st" "3rd" "1st" "2nd" "1st" "2nd"
head(RIN.aff)
[1] 9.5 9.7 9.0 9.1 8.6 9.3

#Correcting for covariates
exp.aff.FA.final <- limma::removeBatchEffect(exp.aff.FA, batch=gender.aff, batch2=batch.aff, covariates = RIN.aff)

head(exp.aff.FA.final, 3)
       sample1   sample2   sample3   sample4   sample5   sample6   sample7   sample8
A1BG  8.791986  7.612144  7.605691  8.611286  7.599253  7.163643  7.767688  7.327274
A1CF  9.334835  9.194941  8.886571  8.598555  8.995373  9.513837  8.996243  8.886223
A2M  12.422234 12.202608 10.465980 11.753121 11.100578 12.290186 11.698285 11.095634

Is that about right? Again, thanks for all the help!

ADD REPLY
0
Entering edit mode

Guess that is ok, see also https://support.bioconductor.org/p/85202/ to maybe have it all a bit "cleaner".

ADD REPLY
0
Entering edit mode

Excellent! Thanks so much!

ADD REPLY

Login before adding your answer.

Traffic: 3526 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6