Question

Is it appropriate to apply RUVSeq to output from kallisto?


5.3 years ago

unawaz ▴ 60

Hi,

I've run kallisto on my samples and summarized the transcript-level counts to gene-level counts to perform a differential gene expression analysis. I'm also interested in differential transcript expression, which I will perform using sleuth.

Exploratory analysis at the gene level revealed that a few of the samples were clustering by batch. I have applied RUVg to counts before, but those counts were produced by featureCounts and the analysis was at the gene level only. Is it appropriate to use RUVSeq with counts produced by kallisto? I will very likely need to apply it at the transcript level as well. I would also have to define in silico empirical control genes (which are normally obtained from a first-pass DE analysis).

Otherwise, are there any alternative methods I can use to remove unwanted variation from data produced by kallisto?

RNA-Seq batch-correction kallisto • 2.0k views

Updated 13 days ago by Ram • written 5.3 years ago by unawaz

score 0 · Answer 1 · 2019-01-17

It sounds like you already know what the batch variable is. If that is the case, you should not use RUVg (or SVA), which are aimed at estimating unknown sources of unwanted variation. Instead:

For DE: simply add batch as a factor in the DE model (for instance, a design formula such as ~ batch + condition). The batch effect is then accounted for when you test for differences caused by your experimental conditions.
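To make the "add batch as a factor" point concrete, here is a minimal sketch in plain Python. It is not the API of DESeq2, edgeR, limma or sleuth; the toy log-expression values, the sample layout and the helper `fit_ols` are all made up for illustration. It fits a single gene with and without a batch column in the design matrix.

```python
def fit_ols(design, y):
    """Least-squares fit via the normal equations (X'X) beta = X'y.

    Hypothetical helper for this sketch; solved by Gauss-Jordan elimination.
    """
    p = len(design[0])
    # Augmented matrix [X'X | X'y]
    m = [[sum(r[i] * r[j] for r in design) for j in range(p)]
         + [sum(r[i] * v for r, v in zip(design, y))] for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(p):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    return [m[i][p] / m[i][i] for i in range(p)]


# Toy layout (assumption): 8 samples, condition partly confounded with batch.
batch = [0, 0, 0, 0, 1, 1, 1, 1]
condition = [0, 0, 0, 1, 0, 1, 1, 1]

# Simulated log-expression of one gene: baseline 5, condition effect 1, batch effect 2.
y = [5 + 1 * c + 2 * b for b, c in zip(batch, condition)]

# Design without batch: intercept + condition.
naive = fit_ols([[1, c] for c in condition], y)

# Design with batch: intercept + batch + condition (the analogue of ~ batch + condition).
full = fit_ols([[1, b, c] for b, c in zip(batch, condition)], y)

print("condition effect, batch ignored :", round(naive[1], 3))
print("condition effect, batch in model:", round(full[2], 3))
```

With batch in the design, the condition coefficient comes back at the simulated value of 1; without it, the estimate is inflated to about 2 because treated samples are over-represented in the second batch. In R the same idea is normally expressed through the design formula of whichever DE package you use.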

For EDA: you need to remove the batch effect, and RUV, sva::ComBat or limma::removeBatchEffect can all help you with this. All of these methods work on kallisto data (although each needs different input), at both the gene and the transcript level.
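For the EDA case, the idea behind regression-based batch removal can be sketched as follows: fit each feature with both batch and condition in the model, then subtract only the fitted batch term. This is a rough illustration in plain Python in the spirit of limma::removeBatchEffect, not its implementation; the gene names, values and the helper `fit_ols` are hypothetical, and the input is assumed to be on a log scale.

```python
def fit_ols(design, y):
    """Least-squares fit via the normal equations (hypothetical helper for this sketch)."""
    p = len(design[0])
    # Augmented matrix [X'X | X'y], solved by Gauss-Jordan elimination
    m = [[sum(r[i] * r[j] for r in design) for j in range(p)]
         + [sum(r[i] * v for r, v in zip(design, y))] for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(p):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    return [m[i][p] / m[i][i] for i in range(p)]


# Toy layout (assumption): two batches coded +1 / -1 so the correction is centred.
batch_code = [1, 1, 1, 1, -1, -1, -1, -1]
condition = [0, 0, 0, 1, 0, 1, 1, 1]
design = [[1, b, c] for b, c in zip(batch_code, condition)]

# Made-up log-expression values for two features (genes or transcripts).
log_expr = {
    "geneA": [5, 5, 5, 6, 7, 8, 8, 8],  # condition effect plus batch shift
    "geneB": [3, 3, 3, 3, 2, 2, 2, 2],  # batch shift only
}

corrected = {}
for name, y in log_expr.items():
    beta = fit_ols(design, y)
    # Subtract the fitted batch term only; the condition signal is left in place.
    corrected[name] = [v - beta[1] * b for v, b in zip(y, batch_code)]

for name, values in corrected.items():
    print(name, [round(v, 3) for v in values])
```

After correction, geneA differs only by condition and geneB is flat across all samples. Values corrected this way are meant for plots such as PCA or clustering; for the DE test itself, keep batch in the model instead.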

Btw, if you are interested in transcript-level analysis, you can use the data you already have to directly analyse, for example, isoform switches. My R package IsoformSwitchAnalyzeR can help you with this (and it also handles batch effects). You can find examples of the types of analysis you can do in this section of the vignette.