Question

Recommendations for miRNA-Seq quantification with Salmon

7.1 years ago

labadorf ▴ 10

Hi,

A bit of background: I have started exploring Salmon as an RNA quantification tool for miRNA-Seq datasets.

My previous experience with Salmon quantification of mRNA-Seq suggests it is more accurate than more traditional align-and-count (AaC) strategies, but I have noticed rare instances where Salmon estimates many orders of magnitude more reads mapping to a gene than my AaC method (STAR+VERSE) does.

When I quantify miRNA-Seq datasets against references built from miRBase, using either mature miRNA or miRNA primary transcript sequences, I notice the same trend: many (usually less abundant) transcripts have much higher estimated read counts from Salmon than from AaC. More abundant miRNA species are much more consistent between methods.
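To make the comparison above concrete, here is a minimal Python sketch of one way to list the transcripts where the two methods disagree. The function name, the pseudocount of 1, and the 4-fold threshold are illustrative assumptions rather than anything Salmon or VERSE prescribes; the inputs are assumed to be plain dictionaries mapping a transcript name to Salmon's estimated read count and to the AaC count.

```python
import math

def flag_discordant(salmon_reads, aac_counts, min_log2_fc=2.0, pseudocount=1.0):
    """Return (name, salmon, aac, log2_ratio) for transcripts where the two
    methods differ by at least min_log2_fc on the log2 scale.

    The threshold and pseudocount are arbitrary choices for illustration.
    """
    rows = []
    for name in sorted(set(salmon_reads) | set(aac_counts)):
        s = salmon_reads.get(name, 0.0)
        a = aac_counts.get(name, 0.0)
        # The pseudocount keeps the ratio defined when either count is zero.
        log2_ratio = math.log2((s + pseudocount) / (a + pseudocount))
        if abs(log2_ratio) >= min_log2_fc:
            rows.append((name, s, a, round(log2_ratio, 2)))
    return rows
```

Sorting the flagged rows by AaC count is a quick way to check whether the disagreement is concentrated among the less abundant species, as described above.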

Since this is "live" data, I really have no idea how to assess which method is more accurate, so the best I can do is optimize my Salmon parameters for miRNA-Seq data and then follow up later in wet lab experiments.

I think I understand the Salmon method at a high level, but not deeply enough to know how the various parts of the inference algorithm may influence the results specifically for miRNA-Seq data.

My questions are:

Can anyone recommend which parameters to use for both index creation and quasi-mapping, other than setting the k-mer size sufficiently low (I used 11) when creating the index?
Are there any considerations I should take into account when interpreting miRNA-Seq read estimates vis-à-vis mRNA-Seq read estimates?
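For reference, the setup I am asking about can be sketched roughly as below. This only builds the two command lines; it does not run them. The file and directory names are placeholders, `-l A` (automatic library type detection) is an assumption about the library, and the odd-k check reflects my understanding that Salmon rejects even k-mer sizes.

```python
def salmon_commands(fasta, index_dir, reads, out_dir, k=11):
    """Build (index_cmd, quant_cmd) argument lists for a single-end run.

    All paths are placeholders chosen by the caller; k=11 is the value
    mentioned in the question, not a recommendation.
    """
    if k % 2 == 0:
        # Assumption: Salmon requires an odd k-mer size for its index.
        raise ValueError("k should be odd")
    index_cmd = ["salmon", "index", "-t", fasta, "-i", index_dir, "-k", str(k)]
    quant_cmd = ["salmon", "quant", "-i", index_dir, "-l", "A",
                 "-r", reads, "-o", out_dir]
    return index_cmd, quant_cmd
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the paths point at real files.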

Thanks.

miRNA-Seq salmon transcript quantification • 4.0k views

Hi Labadorf, I am working on the same process. I think the relatively high miRNA abundance you see may be caused by the mapping reference. Salmon recommends mapping reads to the transcriptome, not the genome. Given how small RNA-seq libraries are constructed, I first align the reads with bowtie1, then extract the reads in miRNA regions from the BAM files, convert them to FASTQ files, and use those as input to Salmon for downstream analysis. I hope this helps, and I look forward to hearing your solution.
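A rough Python sketch of the extract-and-convert step described above, for illustration only. It assumes uncompressed SAM text lines, a hypothetical list of miRNA regions given as 1-based inclusive `(chrom, start, end)` tuples, and ungapped alignments (which is what bowtie1 reports), so the aligned span is taken to be the read length. In practice a tool such as samtools would normally do this work.

```python
def revcomp(seq):
    """Reverse-complement a DNA sequence."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(table)[::-1]

def sam_to_mirna_fastq(sam_lines, regions):
    """Return FASTQ records for mapped reads overlapping any region.

    regions: list of (chrom, start, end), 1-based inclusive (hypothetical input).
    Assumes ungapped alignments, so end = pos + len(seq) - 1.
    """
    records = []
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        name, flag, chrom, pos = fields[0], int(fields[1]), fields[2], int(fields[3])
        seq, qual = fields[9], fields[10]
        if flag & 4:
            continue  # read is unmapped
        end = pos + len(seq) - 1
        if not any(c == chrom and pos <= e and end >= s for c, s, e in regions):
            continue
        if flag & 16:
            # SAM stores reverse-strand reads reverse-complemented;
            # restore the original read orientation for FASTQ.
            seq, qual = revcomp(seq), qual[::-1]
        records.append(f"@{name}\n{seq}\n+\n{qual}\n")
    return records
```

Writing the returned records to a file gives a FASTQ that can be passed to Salmon as single-end reads.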

Thanks.

ADD REPLY • link 4.9 years ago by chengrui7 • 0