Forum: Kallisto New RNA-seq quantification method discussion
4.2 years ago by
morovatunc460 wrote:

Hi all,

I would like to get your opinion about this interesting tool, which might affect the RNA-seq era. Its biggest advantage seems to be its tremendous speed, and it also performs quite well compared with its competitors. Has any of you had a chance to use it? If so, could you share some feedback with us? The article seems very interesting. If it is okay with the website policy, I would like to get a discussion going.



Problem: I cannot get to its Nature page because the link redirects me to the Nature home page, so I shared a news item about this problem.

rna-seq forum • 7.4k views
modified 3.1 years ago by FatihSarigol160 • written 4.2 years ago by morovatunc460

There is salmon (and sailfish), if you are interested in this class of tools.

written 4.2 years ago by genomax85k

For those who don't know, sailfish has recently been heavily updated and a lot of the cooler aspects of Salmon integrated into it. If someone reads the Kallisto paper they should note that the sailfish comparisons are largely meaningless for determining results with a current version (details here).

modified 4.2 years ago • written 4.2 years ago by Devon Ryan95k

It is true that Sailfish was updated after the kallisto paper was submitted. The current version (0.9.2) incorporates many of the key elements of kallisto (pseudoalignment, the kallisto bias correction and the kallisto effective length correction) so that it is now practically identical to kallisto in the underlying algorithm (and therefore in the results produced).

The comparisons to Sailfish in the kallisto paper are meaningful insofar as they show definitively that the Sailfish algorithm based on k-mer matching (published in Nature Biotechnology) is inferior to the read pseudoalignment that underlies kallisto (and now Sailfish).
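
For readers new to the term, the core idea of pseudoalignment can be sketched in a few lines of Python. This is only a toy illustration, not kallisto's implementation: it assumes a plain k-mer dictionary instead of a transcriptome de Bruijn graph, ignores strand and sequencing errors, and simply skips k-mers that are absent from the index (a choice made for this sketch). The transcript names and sequences are made up.

```python
from collections import defaultdict


def build_kmer_index(transcripts, k):
    """Map every k-mer to the set of transcript names that contain it."""
    index = defaultdict(set)
    for name, seq in transcripts.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return index


def pseudoalign(read, index, k):
    """Return the transcripts compatible with all indexed k-mers of the read.

    No base-level alignment is computed: the result is just the
    intersection of the transcript sets of the read's k-mers.
    """
    compatible = None
    for i in range(len(read) - k + 1):
        hits = index.get(read[i:i + k])
        if hits is None:
            continue  # k-mer not in the index; skipped in this toy version
        compatible = set(hits) if compatible is None else compatible & hits
    return compatible if compatible is not None else set()


# Two hypothetical transcripts sharing a common prefix.
transcripts = {
    "tx1": "ACGTACGGATCCA",
    "tx2": "ACGTACGGTTTGA",
}
index = build_kmer_index(transcripts, k=5)

shared = pseudoalign("ACGTACGG", index, k=5)  # read from the shared prefix
unique = pseudoalign("CGGATCC", index, k=5)   # read spanning the tx1-specific part
print(shared, unique)
```

A read from the shared prefix stays compatible with both transcripts, while a read covering the divergent region narrows down to one. The quantification step then works with these compatibility sets rather than with alignments.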

written 4.2 years ago by Lior Pachter520
4.2 years ago by
Devon Ryan95k
Freiburg, Germany
Devon Ryan95k wrote:

We've had a couple projects that have given both Kallisto and Salmon a try. We've generally gone with Salmon, which is not to say that Kallisto is bad. Since tximport will be in the next R release, I expect we'll switch the majority of our "standard" RNAseq analyses to either Salmon or Kallisto in the next year or so.

written 4.2 years ago by Devon Ryan95k

I would like to add that two benchmarking papers have already been out for some time this year: one on quantification, and the other on quantification and differential expression. Both come with a website where new methods can be added to the benchmark. Take a look; they might be quite useful for selecting a quantification pipeline and a downstream DE tool for inferring differences in transcriptional programs.

modified 4.1 years ago • written 4.1 years ago by ivivek_ngs4.9k

Any specific reason that you tend to favor Salmon? I'll be starting RNA-Seq data analysis soon, so I'm curious.

modified 4.2 years ago • written 4.2 years ago by Sinji3.0k

It happened to give more reliable results on a dataset of interest that we tested it on. There's the added benefit that Rob Patro is super responsive (and active on this site), though that wasn't the deciding factor.

modified 4.2 years ago • written 4.2 years ago by Devon Ryan95k

And maybe the RAM requirement of STAR? It needs a large amount of RAM.

written 4.2 years ago by morovatunc460

You don't need STAR to use Salmon. You can certainly give Salmon a BAM file, but you can also just give it fastq files (as is the case with Kallisto).
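
As a rough sketch of what "just give it fastq files" looks like, the Python snippet below only assembles the command lines for paired-end reads; it does not run anything. The file and directory names are hypothetical, and both tools are assumed to have had their index built beforehand (`kallisto index` / `salmon index`). The `-l A` option asks Salmon to infer the library type, which assumes a reasonably recent Salmon version.

```python
import shlex


def kallisto_quant_cmd(index, out_dir, read1, read2, threads=4):
    """Build a paired-end `kallisto quant` command as an argument list."""
    return ["kallisto", "quant", "-i", index, "-o", out_dir,
            "-t", str(threads), read1, read2]


def salmon_quant_cmd(index, out_dir, read1, read2, threads=4):
    """Build a paired-end `salmon quant` command as an argument list."""
    return ["salmon", "quant", "-i", index, "-l", "A",
            "-1", read1, "-2", read2, "-p", str(threads), "-o", out_dir]


# Hypothetical inputs, for illustration only.
reads = ("sample_R1.fastq.gz", "sample_R2.fastq.gz")
commands = [
    kallisto_quant_cmd("transcripts.kallisto.idx", "kallisto_out", *reads),
    salmon_quant_cmd("transcripts_salmon_index", "salmon_out", *reads),
]
for cmd in commands:
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the tools are installed. No genome aligner is involved in either case.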

written 4.2 years ago by Devon Ryan95k

There is an active user group for kallisto-sleuth where questions are quickly answered: kallisto-sleuth-users

written 4.2 years ago by Lior Pachter520

I still need time to get to the point where I can start using kallisto. Their indexing method, and especially k-compatibility, is hard to comprehend. Thank you for your answer!

written 4.2 years ago by morovatunc460

While the details of how pseudoalignment is performed are slightly technical, what it means is explained in this blog post:

written 4.2 years ago by Lior Pachter520

For differential analysis it is strongly recommended to use Sleuth; see the thread "Can Kallisto be followed by DESeq, EdgeR or Cuffdiff?"

written 4.2 years ago by Lior Pachter520
4.2 years ago by
United States
lkmklsmn930 wrote:

I find the pseudo alignment approach (kallisto, salmon, sailfish) very innovative. However, I would like to point out that RNA-seq data carries a lot more information than just gene expression levels. In my opinion the gene-level output of RNA-seq data is an alignment and not just an expression estimate. RNA-seq alignments carry information on allele specific expression, alternative splicing (junction reads) and give you the opportunity to visualize the raw data. Since you never know if you may want to look at some of these aspects at a later point in time, I have been hesitant to use these pseudo aligners in my "standard" workflow.

written 4.2 years ago by lkmklsmn930

Just to clarify: kallisto, salmon, etc. work at the transcript level, not the gene level!

While it is true that this may be a limitation in some situations more than others, it is not nearly as worrisome as it would be if it were indeed gene-level pseudoalignment.
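
To make the transcript-versus-gene distinction concrete, here is a minimal Python sketch of collapsing transcript-level estimates to gene level by summing. The abundances and the transcript-to-gene mapping are invented for illustration; packages such as tximport (mentioned earlier in this thread) do more than this simple sum.

```python
from collections import defaultdict


def summarize_to_gene(transcript_tpm, tx2gene):
    """Sum transcript-level TPM values into gene-level totals."""
    gene_tpm = defaultdict(float)
    for tx, value in transcript_tpm.items():
        gene_tpm[tx2gene[tx]] += value
    return dict(gene_tpm)


# Hypothetical transcript abundances and annotation.
transcript_tpm = {"tx1": 10.0, "tx2": 5.0, "tx3": 2.5}
tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}

gene_tpm = summarize_to_gene(transcript_tpm, tx2gene)
print(gene_tpm)
```

The gene-level view is derived from the transcript-level estimates, so the isoform information is still available if it is needed later.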

written 4.2 years ago by Istvan Albert ♦♦ 84k
3.1 years ago by
FatihSarigol160 wrote:

There is a 2017 BMC Bioinformatics paper evaluating 219 combinatorial implementations of the most commonly used analysis tools for their impact on differential gene expression analysis by RNA-Seq, including Kallisto, and to me it looks very good:

"Empirical assessment of analysis workflows for differential expression analysis of human samples using RNA-Seq"

They also share their scripts, which is very nice; for instance, a Perl script for aligning and modeling with Kallisto, with the settings they used in the paper, is below:

modified 3.0 years ago • written 3.1 years ago by FatihSarigol160
4.2 years ago by
WouterDeCoster44k wrote:

I have used it. Very easy to work with, very quick. Works nicely together with sleuth. The reason I haven't dived further into it is that it automatically performs transcript-length-based normalization, which is neither applicable nor desired for my type of data (QuantSeq 3', Lexogen).
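
A small worked example of why length normalization matters here, as a sketch only. It uses the standard TPM definition (counts divided by effective length, rescaled to sum to one million); the counts and lengths are made up, and the assumption that a 3' tag protocol yields roughly one fragment per transcript molecule regardless of length is mine for the illustration.

```python
def tpm(counts, effective_lengths):
    """Compute TPM: length-normalized rates rescaled to sum to 1e6."""
    rates = {t: counts[t] / effective_lengths[t] for t in counts}
    total = sum(rates.values())
    return {t: 1e6 * rate / total for t, rate in rates.items()}


# Hypothetical data: equal read counts, different transcript lengths.
counts = {"short_tx": 100, "long_tx": 100}
effective_lengths = {"short_tx": 500.0, "long_tx": 2000.0}

values = tpm(counts, effective_lengths)
ratio = values["short_tx"] / values["long_tx"]
print(values, ratio)
```

With equal counts, the shorter transcript ends up with about four times the TPM of the longer one. That is appropriate when reads are sampled along the whole transcript, but with 3' tag data equal counts would already suggest equal abundance, so the length correction distorts the estimate.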

written 4.2 years ago by WouterDeCoster44k

Hi! Thanks for the comment. I am currently working with 3' single-end RNA-seq data and I cannot decide on an alignment method. Could you please recommend something? It's a human cancer, and I will do STAR alignment for sure, but I was wondering if Kallisto could work as well; however, it requires fragment length and SD information for single-read mode, which I don't know. I know that Salmon doesn't need it, so I plan to try it. I also have some doubts about trimming adapters and removing poly-A, which leads to A bias in the data. Could you please tell me if you do anything about it? I once read that it's better not to do any. It's also QuantSeq 3', Lexogen. Sorry for the bunch of questions; it's my first time working with RNA data and I just want to make sure my analysis is relevant. Many thanks!
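
For reference, a sketch of how the single-end requirement shows up on the kallisto command line: `--single` has to be accompanied by `-l` (estimated mean fragment length) and `-s` (its standard deviation). The Python helper below only builds the argument list. The values 200 and 20 are placeholders, not recommendations; in practice they would come from the library preparation, and the file names are hypothetical.

```python
def kallisto_single_end_cmd(index, out_dir, reads,
                            fragment_length, fragment_sd, threads=4):
    """Build a single-end `kallisto quant` command as an argument list."""
    if fragment_length <= 0 or fragment_sd <= 0:
        raise ValueError("fragment length and sd must be positive")
    return ["kallisto", "quant", "-i", index, "-o", out_dir,
            "--single", "-l", str(fragment_length), "-s", str(fragment_sd),
            "-t", str(threads), reads]


# Placeholder fragment statistics and hypothetical file names.
cmd = kallisto_single_end_cmd("transcripts.kallisto.idx", "kallisto_out",
                              "sample.fastq.gz",
                              fragment_length=200, fragment_sd=20)
print(" ".join(cmd))
```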

written 8 months ago by dhlsl0
4.2 years ago by
mgalland198310 wrote:

I've used kallisto to compute RNA-Seq expression values (both normalized counts and TPM values). I was particularly interested since I am working on multiple closely related species (tomatoes), where alignment is not perfect because mapping rates fall with genetic distance.

For my work I use a single reference, which causes problems since I'm working on multiple species that are more or less closely related to this reference. I'm also working on the Ion Proton platform, which might cause differences compared with Illumina users (especially due to the homopolymer errors, insertions and deletions that are frequent in Ion Proton reads).

With Kallisto + reference transcriptome, the fraction of reads mapped ranged from 54 to 76%, which is pretty good in my opinion. With STAR + reference genome (not allowing multimapping reads, 2 mismatches allowed), mapping rates ranged from 34 to 70%, due to too many mismatches of both technical origin (Ion Proton) and genetic origin (distance between my species). Since I've mapped to the genome and not to the transcriptome, and also because of the less mainstream Ion Proton reads, I guess it is hard to compare. I'm currently working on other methods (TMAP aligner) to compare results.

So far, looking at the expression of genes encoding specific enzymes, most of them behave as expected (enzymes linked to metabolite production are expressed accordingly across genotypes).

I'm still in the process of comparing mainstream aligners to Kallisto pseudoalignment.

Looking forward to additional insights from this forum.

written 4.2 years ago by mgalland198310