Forum: Kallisto New RNA-seq quantification method discussion
9
3.4 years ago by morovatunc (400), Turkey
morovatunc wrote:

Dear all,

I would like to get your opinion about this interesting tool, which might affect the RNA-seq era. Its biggest advantage seems to be its tremendous speed. It also performs quite well compared with its competitors. Has any of you had a chance to use it? If so, could you share some feedback with us? The article seems very interesting. If it is okay with the website policy, I would like to get this going as a discussion.

Thanks,

Tunc.

Problem: I cannot get to its Nature page because the link redirects me to the Nature home page, so I am sharing this post about it instead:

https://liorpachter.wordpress.com/2015/05/10/near-optimal-rna-seq-quantification-with-kallisto/

rna-seq forum • 6.6k views
modified 2.2 years ago by FatihSarigol (120) • written 3.4 years ago by morovatunc (400)
2

There is salmon (and sailfish), if you are interested in this class of tools.

written 3.4 years ago by genomax (70k)
5

For those who don't know, sailfish has recently been heavily updated and a lot of the cooler aspects of Salmon integrated into it. If someone reads the Kallisto paper they should note that the sailfish comparisons are largely meaningless for determining results with a current version (details here).

modified 3.4 years ago • written 3.4 years ago by Devon Ryan (91k)
1

It is true that Sailfish was updated after the kallisto paper was submitted. The current version (0.9.2) incorporates many of the key elements of kallisto (pseudoalignment, the kallisto bias correction and the kallisto effective length correction) so that it is now practically identical to kallisto in the underlying algorithm (and therefore in the results produced).

The comparisons to Sailfish in the kallisto paper are meaningful insofar as they show definitively that the Sailfish algorithm based on k-mer matching (published in Nature Biotechnology) is inferior to the read pseudoalignment that underlies kallisto (and now Sailfish).

written 3.4 years ago by Lior Pachter (330)
4
3.4 years ago by Devon Ryan (91k), Freiburg, Germany
Devon Ryan wrote:

We've had a couple projects that have given both Kallisto and Salmon a try. We've generally gone with Salmon, which is not to say that Kallisto is bad. Since tximport will be in the next R release, I expect we'll switch the majority of our "standard" RNAseq analyses to either Salmon or Kallisto in the next year or so.

written 3.4 years ago by Devon Ryan (91k)
1

I would like to add that two papers have already been out for some time this year, one on quantification and the other on quantification plus differential expression, and both can be used for benchmarking. Each also comes with a website where you can add new methods to the benchmark. Take a look; they might be quite useful for selecting a quantification pipeline and a downstream DE tool for inferring differences in transcriptional programs.

rnaseqcomp

RNAontheBENCH

modified 3.2 years ago • written 3.2 years ago by ivivek_ngs (4.8k)

Any specific reason that you tend to favor Salmon? I'll be starting to work with RNA-Seq data soon, so I'm curious.

modified 3.4 years ago • written 3.4 years ago by Sinji (2.8k)
4

It happened to give more reliable results on a dataset of interest that we tested it on. There's the added benefit that Rob Patro is super responsive (and active on this site), though that wasn't the deciding factor.

modified 3.4 years ago • written 3.4 years ago by Devon Ryan (91k)

And maybe the RAM requirement of STAR? It needs a large amount of RAM.

written 3.4 years ago by morovatunc (400)
1

You don't need STAR to use Salmon. You can certainly give Salmon a BAM file, but you can also just give it fastq files (as is the case with Kallisto).

written 3.4 years ago by Devon Ryan (91k)

There is an active user group for kallisto-sleuth where questions are quickly answered: https://groups.google.com/forum/#!forum/kallisto-sleuth-users

written 3.4 years ago by Lior Pachter (330)

I still need time to get to the point where I can start using kallisto. Their indexing method, and especially k-compatibility, is hard to comprehend. Thank you for your answer!

written 3.4 years ago by morovatunc (400)

While the details of how pseudoalignment is performed are slightly technical, what it means is explained in this blog post: https://liorpachter.wordpress.com/tag/pseudoalignment/
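As a rough illustration of the idea, here is a toy sketch of k-compatibility: a read is assigned to the set of transcripts that contain all of its k-mers. This is not kallisto's implementation (which, as described in the paper, works on a de Bruijn graph of the transcriptome and is considerably more involved). The function names, the toy sequences, and the choice to treat an unseen k-mer as "no compatible transcript" are all assumptions made for the example.

```python
from collections import defaultdict


def build_kmer_index(transcripts, k):
    """Map every k-mer to the set of transcript names that contain it."""
    index = defaultdict(set)
    for name, seq in transcripts.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return index


def pseudoalign(read, index, k):
    """Return the transcripts that contain every k-mer of the read.

    Toy simplification: a k-mer missing from the index makes the read
    compatible with nothing.
    """
    compatible = None
    for i in range(len(read) - k + 1):
        hits = index.get(read[i:i + k], set())
        compatible = set(hits) if compatible is None else compatible & hits
        if not compatible:
            return set()
    return compatible or set()


# Hypothetical toy transcripts that share a prefix and then diverge.
transcripts = {
    "tx1": "ACGTACGGTCA",
    "tx2": "ACGTACGTTGA",
}
index = build_kmer_index(transcripts, k=5)

shared = pseudoalign("ACGTACG", index, k=5)   # read from the shared prefix
unique = pseudoalign("TACGGTC", index, k=5)   # read spanning the tx1-specific part
print(shared, unique)
```

The point of the sketch is that no base-level alignment is ever computed: the output is only the set of transcripts a read could have come from, which is what the quantification step needs.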

written 3.4 years ago by Lior Pachter (330)

For differential analysis it is strongly recommended to use Sleuth; see this thread: "Can Kallisto be followed by DESeq, EdgeR or Cuffdiff?"

written 3.4 years ago by Lior Pachter (330)
4
3.4 years ago by lkmklsmn (890), United States
lkmklsmn wrote:

I find the pseudo alignment approach (kallisto, salmon, sailfish) very innovative. However, I would like to point out that RNA-seq data carries a lot more information than just gene expression levels. In my opinion the gene-level output of RNA-seq data is an alignment and not just an expression estimate. RNA-seq alignments carry information on allele specific expression, alternative splicing (junction reads) and give you the opportunity to visualize the raw data. Since you never know if you may want to look at some of these aspects at a later point in time, I have been hesitant to use these pseudo aligners in my "standard" workflow.

written 3.4 years ago by lkmklsmn (890)

Just to clarify: kallisto, salmon etc. work at the transcript level, not the gene level!

While it is true that this may be a limitation in some situations more than others, it is not nearly as worrisome as it would be if it were indeed gene-level pseudoalignment.
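If gene-level numbers are needed, transcript-level estimates can be summed per gene afterwards. A minimal sketch, assuming a transcript-to-gene mapping is available; the names `summarize_to_gene` and `tx2gene` and the toy values are made up for illustration. Tools such as tximport, mentioned elsewhere in this thread, do this more carefully for counts.

```python
from collections import defaultdict


def summarize_to_gene(transcript_tpm, tx2gene):
    """Sum transcript-level TPM values into gene-level TPM values."""
    gene_tpm = defaultdict(float)
    for tx, tpm in transcript_tpm.items():
        gene_tpm[tx2gene[tx]] += tpm
    return dict(gene_tpm)


# Hypothetical transcript-level estimates and transcript-to-gene mapping.
transcript_tpm = {"geneA.1": 10.0, "geneA.2": 5.0, "geneB.1": 2.5}
tx2gene = {"geneA.1": "geneA", "geneA.2": "geneA", "geneB.1": "geneB"}

gene_tpm = summarize_to_gene(transcript_tpm, tx2gene)
print(gene_tpm)
```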

written 3.4 years ago by Istvan Albert ♦♦ (81k)
3
2.2 years ago by FatihSarigol (120), Durham
FatihSarigol wrote:

There is a 2017 BMC Bioinformatics paper evaluating 219 combinatorial implementations of the most commonly used analysis tools for their impact on differential gene expression analysis by RNA-Seq, including Kallisto, and to me it looks very good:

"Empirical assessment of analysis workflows for differential expression analysis of human samples using RNA-Seq"

https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-016-1457-z

They also share their scripts, which is very nice. For instance, here is a Perl script for aligning and modeling with Kallisto, using the settings from the paper:

https://github.com/cckim47/kimlab/blob/master/rnaseq/alignAndModel/alignAndModel_KaKa.pl

modified 2.2 years ago • written 2.2 years ago by FatihSarigol (120)
2
3.4 years ago by WouterDeCoster (40k), Belgium
WouterDeCoster wrote:

I have used it. Very easy to work with, very quick. Works nicely together with sleuth. The reason I haven't dived further into it is that it automatically performs transcript-length-based normalization, which is neither applicable nor desired for my type of data (QuantSeq 3', Lexogen).
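To make the length normalization concrete, here is a minimal sketch of the textbook TPM calculation: each count is divided by the transcript's effective length, and the resulting rates are rescaled to sum to one million. This is not kallisto's code; the function name and the toy numbers are assumptions for illustration.

```python
def tpm(counts, eff_lengths):
    """Convert estimated counts to TPM using effective transcript lengths."""
    # Reads per base of effective length for each transcript.
    rates = [c / l for c, l in zip(counts, eff_lengths)]
    total = sum(rates)
    # Rescale so that the values sum to one million.
    return [r / total * 1e6 for r in rates]


# Hypothetical example: equal counts, but the second transcript is twice as long.
counts = [100, 100]
eff_lengths = [1000, 2000]
values = tpm(counts, eff_lengths)
print(values)
```

With equal counts, the longer transcript ends up with half the TPM of the shorter one. That is sensible when reads are sampled along the whole transcript, but with a 3' tag protocol the number of reads is not expected to scale with transcript length, so this division would penalize long transcripts.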

written 3.4 years ago by WouterDeCoster (40k)
1
3.4 years ago by mgalland1983 (10), Netherlands
mgalland1983 wrote:

I've used kallisto to compute RNA-Seq expression values (both normalized counts and TPM values). I was particularly interested since I am working on multiple closely related species (tomatoes) where alignment is not perfect since mapping rates fall depending on the genetic distance.
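For readers who want to pull both values out of a run, here is a small sketch that reads a kallisto-style `abundance.tsv` table, whose columns are `target_id`, `length`, `eff_length`, `est_counts` and `tpm`. The helper name `read_abundance` is hypothetical and the rows below are made-up numbers used only to keep the example self-contained; in practice you would open the file written by `kallisto quant` instead.

```python
import csv
import io


def read_abundance(handle):
    """Return {target_id: (est_counts, tpm)} from a tab-separated abundance table."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {
        row["target_id"]: (float(row["est_counts"]), float(row["tpm"]))
        for row in reader
    }


# Made-up rows standing in for a real abundance.tsv file.
example = io.StringIO(
    "target_id\tlength\teff_length\test_counts\ttpm\n"
    "tx1\t1500\t1321.0\t120.0\t35.2\n"
    "tx2\t800\t621.0\t0.0\t0.0\n"
)
table = read_abundance(example)
print(table)
```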

For my work I use a single reference, which causes problems since I'm working on multiple species more or less closely related to this reference. I'm also working on the Ion Proton platform, which might cause differences compared with Illumina users (especially due to the homopolymers/insertions/deletions that are frequent in Ion Proton reads).

With Kallisto + reference transcriptome, the fraction of reads mapped ranged from 54 to 76%, which is pretty good in my opinion. With STAR + reference genome (not allowing multimapping reads, 2 mismatches allowed), mapping rates ranged from 34 to 70% due to too many mismatches arising from both technical causes (Ion Proton) and the genetic distance between my species. Since I've mapped to the genome and not to the transcriptome, and also because of the less mainstream Ion Proton reads, I guess it is hard to compare... I'm currently working on other methods (TMAP aligner) to compare results.

So far, looking at gene expression from specific enzymes, most of them behave as expected (enzymes linked to metabolite production are expressed accordingly across genotypes).

I'm still in the process of comparing mainstream aligners to Kallisto pseudoalignment.

Looking forward to additional insights from this forum.

written 3.4 years ago by mgalland1983 (10)