4.2 years ago by
I've used kallisto to compute RNA-Seq expression values (both normalized counts and TPM values). I was particularly interested because I am working on multiple closely related species (tomatoes), where alignment is imperfect: mapping rates fall as the genetic distance from the reference increases.
For my work I use a single reference, which causes problems because the species I work on are more or less closely related to it. I'm also on the Ion Proton platform, which may lead to differences from what Illumina users see (especially because homopolymer errors and insertions/deletions are frequent in Ion Proton reads).
With kallisto + reference transcriptome, the fraction of reads mapped ranged from 54 to 76%, which is pretty good in my opinion.
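For anyone who wants to check the same number on their own runs, here is a minimal sketch of how the mapped fraction can be computed from kallisto's `run_info.json`. It assumes a kallisto version that writes the `n_processed` and `n_pseudoaligned` fields (older releases may not), and the example values are made up for illustration, not my actual results.

```python
import json


def pseudoaligned_fraction(run_info_text):
    """Return the fraction of processed reads that were pseudoaligned.

    run_info_text is the content of kallisto's run_info.json as a string.
    Assumes the "n_processed" and "n_pseudoaligned" fields are present.
    """
    info = json.loads(run_info_text)
    n_processed = info["n_processed"]
    if n_processed == 0:
        return 0.0
    return info["n_pseudoaligned"] / n_processed


# Hypothetical run_info.json content, for illustration only
example = '{"n_processed": 1000000, "n_pseudoaligned": 650000}'
print(f"{pseudoaligned_fraction(example):.1%}")
```

In practice you would read the file from each sample's kallisto output directory and collect the fractions per sample.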
With STAR + reference genome (no multimapping reads, 2 mismatches allowed), mapping rates ranged from 34 to 70%, held down by too many mismatches of both technical origin (Ion Proton) and genetic origin (distance between my species). Since I mapped to the genome rather than the transcriptome, and since Ion Proton reads are less mainstream, I guess the two results are hard to compare. I'm currently working on other methods (TMAP aligner) to compare results.
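To pull the STAR rate per sample, something like the sketch below can parse the `Log.final.out` summary. It assumes the usual `key | value` layout of that file and the "Uniquely mapped reads %" line; the example text and numbers are hypothetical.

```python
def star_unique_rate(log_text):
    """Return the 'Uniquely mapped reads %' value from STAR Log.final.out text.

    Returns a float percentage, or None if the line is not found.
    """
    for line in log_text.splitlines():
        key, sep, value = line.partition("|")
        if sep and key.strip() == "Uniquely mapped reads %":
            return float(value.strip().rstrip("%"))
    return None


# Hypothetical excerpt of a Log.final.out file, for illustration only
example_log = (
    "          Number of input reads |\t1000000\n"
    "        Uniquely mapped reads % |\t52.30%\n"
)
print(star_unique_rate(example_log))
```

Keep in mind that this is a genome-level unique mapping rate, so it is not directly equivalent to a transcriptome pseudoalignment rate.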
So far, looking at the expression of genes encoding specific enzymes, most of them behave as expected (enzymes linked to metabolite production are expressed accordingly across genotypes).
I'm still in the process of comparing mainstream aligners to Kallisto pseudoalignment.
Looking forward to additional insights from this forum.