Question: RNAseq differential expression analysis results: Kallisto vs STAR
3 months ago, marcos.georgiades.18 wrote:

Hi all,

I am facing a minor dilemma, and I thought perhaps you could give me some advice. I have conducted differential expression analysis of a Control vs Mutant RNAseq dataset. I conducted the analysis twice, using different pipelines:

PipelineA: Kallisto --> DESeq2
PipelineB: STAR --> featureCounts --> DESeq2

I wanted to get a sense of how different the results would be when "classifying against a transcriptome" and when "quantifying against a genome". PipelineA outputs ~2000 differentially expressed genes. PipelineB outputs ~1600, of which ~1400 are also identified as differentially expressed by PipelineA. Filtering conditions for significance (e.g. FDR < 0.05) were kept the same for both.

My question is: which results should I trust? I have read that the transcriptome path is usually more accurate, but perhaps it doesn't hurt to be a bit conservative?

Many thanks:), Marcos

Tags: star, rna-seq, kallisto

I suggest you read the papers describing the pseudoalignment tools such as kallisto and salmon, plus the recent papers that benchmark these different pipelines. Currently the field seems to prefer the pseudoalignment methods. The details are in the papers.

written 3 months ago by ATpoint

PipelineA probably gives you more detected genes in total and more "counts" per gene, which may explain the higher number of differentially expressed genes.

Regardless, your overlap is very high.
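To put a number on "very high", here is a minimal sketch that works only from the approximate list sizes quoted in the question (~2000, ~1600, ~1400 shared). The function name `overlap_summary` and its keys are made up for illustration, and the inputs are the rounded figures from the post, not exact counts.

```python
def overlap_summary(n_a, n_b, n_shared):
    """Summarise agreement between two DE gene lists using only their sizes."""
    union = n_a + n_b - n_shared  # genes called by at least one pipeline
    return {
        "jaccard": n_shared / union,               # shared / union
        "shared_fraction_of_a": n_shared / n_a,    # share of A's calls confirmed by B
        "shared_fraction_of_b": n_shared / n_b,    # share of B's calls confirmed by A
        "only_a": n_a - n_shared,                  # called by A alone
        "only_b": n_b - n_shared,                  # called by B alone
    }


# Approximate figures from the question: PipelineA ~2000, PipelineB ~1600, shared ~1400
summary = overlap_summary(n_a=2000, n_b=1600, n_shared=1400)
for name, value in summary.items():
    print(name, round(value, 3))
```

With these rounded inputs, roughly 87.5% of the PipelineB calls are also made by PipelineA, while about 600 genes are specific to PipelineA and about 200 to PipelineB.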

written 3 months ago by igor

As an additional comment: I recommend checking out sleuth for performing differential gene expression analysis with kallisto.

written 3 months ago by dsull

3 months ago, Istvan Albert (University Park, USA) wrote:

The two methods are complementary, so you can't quite think of one as more "trustworthy" than the other. You are measuring different things.

Each one separately, or both, could be right and wrong, all at the same time.

There are various tradeoffs in each, and as igor puts it, the overlap is already high.

I always recommend that people do both: run pseudoalignment with Salmon/Kallisto, then look at the genomic alignments for each transcript that turns out to be relevant. The alignments at the genome level are more informative and can help you decide how much to trust the quantification.
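One way to decide which genes to inspect first is to split the two result lists into shared and pipeline-specific calls. This is a minimal sketch: the helper `partition_de_genes` and the gene IDs are hypothetical placeholders, and in practice the inputs would be the significant gene IDs exported from each DESeq2 run.

```python
def partition_de_genes(de_a, de_b):
    """Split two lists of DE gene IDs into shared and pipeline-specific groups."""
    a, b = set(de_a), set(de_b)
    return {
        "both": sorted(a & b),    # called by both pipelines
        "only_a": sorted(a - b),  # called by pipeline A only
        "only_b": sorted(b - a),  # called by pipeline B only
    }


# Hypothetical gene IDs, for illustration only
kallisto_de = ["geneA", "geneB", "geneC", "geneD"]
star_de = ["geneB", "geneC", "geneE"]

groups = partition_de_genes(kallisto_de, star_de)
for label, genes in groups.items():
    print(label, genes)
```

The pipeline-specific groups are the natural candidates for a closer look at the genomic alignments, since those are the genes where the two quantifications disagree.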

The problem with both Salmon and Kallisto is that the read reassignment is somewhat of a black box: it is not easy to track why a multi-mapping read is assigned where it is, how strong the evidence was, how reliable the process was, how big the errors are, and so on.
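To give a feel for what such reassignment involves, here is a deliberately simplified toy in the spirit of an expectation-maximization procedure. It is not the actual code of either tool: it ignores transcript lengths, sequence bias, and everything else a real quantifier models, and the function `em_assign` and the read counts are invented for illustration.

```python
def em_assign(compat, n_transcripts, n_iter=200):
    """Toy EM: fractionally assign reads to the transcripts they are compatible with.

    compat is a list with one tuple per read, holding the indices of the
    transcripts that read could have come from.
    """
    abundance = [1.0 / n_transcripts] * n_transcripts  # start from a uniform guess
    counts = [0.0] * n_transcripts
    for _ in range(n_iter):
        counts = [0.0] * n_transcripts
        # E-step: split each read across its compatible transcripts,
        # in proportion to the current abundance estimates
        for txs in compat:
            total = sum(abundance[t] for t in txs)
            for t in txs:
                counts[t] += abundance[t] / total
        # M-step: re-estimate abundances from the fractional counts
        n_reads = sum(counts)
        abundance = [c / n_reads for c in counts]
    return counts


# Invented example: 8 reads unique to transcript 0, 2 unique to transcript 1,
# and 10 reads that map equally well to both
reads = [(0,)] * 8 + [(1,)] * 2 + [(0, 1)] * 10
estimated = em_assign(reads, n_transcripts=2)
print([round(c, 2) for c in estimated])
```

In this toy, the ten ambiguous reads end up split according to the ratio set by the uniquely mapping reads, so where a multi-mapping read goes depends on every other read in the dataset. That is one reason the assignment is hard to audit read by read.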


Thanks, this is helpful:)!

written 3 months ago by marcos.georgiades.18