What are the best quantification tools for miRNA seq?
7 weeks ago
cecilomar6 • 0

First time poster, sorry in advance for any mistakes.

I have some experience handling RNA-seq data for differential expression analysis. I usually follow this pipeline: pseudoalignment with salmon --> import counts into R with tximport --> differential expression analysis with DESeq2.

However, I've been asked to do a similar analysis with miRNA-seq data and I'm having some trouble with it. The miRNAs were extracted from human peripheral blood and sequenced with Illumina technology. I built a salmon index from the miRBase mature database with a k-mer size of 7, since the sequences I'm aligning are very short, and I'm getting mapping rates that vary from 1% to 40% across samples. I don't know why the mapping rates differ so much. I've tried rebuilding the index with different k-mer values, using the hairpin database instead, and changing some salmon parameters, but the results haven't improved.

I'm very new to the miRNA world and I haven't been able to find much information about proper miRNA-seq analysis. I would be very thankful for any explanation of this problem, recommendations for different tools, or advice on how to use salmon for this specific case.

miRNA RNA-Seq Salmon miRNA-seq • 192 views

Can you describe the whole workflow you followed, starting from trimming and adapter removal? We had the same problem, but it was one of the parameters in UMI-tools that gave us low read counts.

The samples were sequenced by BGI Genomics, and the FASTQ files they send are supposedly already cleaned of any UMI or adapter sequences, so my workflow started at the alignment step. What program can I use to check whether they were correctly removed? Thanks!
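One quick way to check for leftover adapters is to count how many reads still contain the start of the adapter sequence (FastQC's "Adapter Content" module reports something similar). The sketch below is a rough check under stated assumptions, not a replacement for a proper QC tool. The function names are hypothetical, and the example adapter is the Illumina TruSeq small RNA 3' adapter; the sequencing provider may have used a different one, so substitute whatever they report.

```python
import gzip
from itertools import islice


def iter_fastq_sequences(path):
    """Yield the sequence line of each record in a plain or gzipped FASTQ file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:  # the second line of every 4-line record is the sequence
                yield line.strip()


def adapter_fraction(sequences, adapter, min_match=10, max_reads=100_000):
    """Fraction of the first `max_reads` reads containing the first
    `min_match` bases of `adapter` anywhere in the read."""
    probe = adapter[:min_match]
    total = hits = 0
    for seq in islice(sequences, max_reads):
        total += 1
        if probe in seq:
            hits += 1
    return hits / total if total else 0.0


# Assumption: Illumina TruSeq small RNA 3' adapter; replace with the real one.
EXAMPLE_ADAPTER = "TGGAATTCTCGGGTGCCAAGG"
```

Usage would look like `adapter_fraction(iter_fastq_sequences("sample.fastq.gz"), EXAMPLE_ADAPTER)`, where the file name is a placeholder. A value close to zero is consistent with adapters having been removed; a large value suggests the files were not trimmed as described.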

Okay, let's assume that the trimming was perfect. Can you try this workflow?

  1. Build the index using Bowtie2
  2. Map with Bowtie2
  3. Index (using Bowtie2)
  4. Remove duplicates using UMI-tools dedup
  5. Count using HTSeq

Try it with 1 or 2 FASTQ files, and if the problem still persists, please ask them for the command they used for adapter removal.
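The steps above could be sketched roughly as the command lines below. This is only one reading of the list: step 3 is interpreted here as sorting and indexing the alignments with samtools, the file names and the `prefix` argument are hypothetical, and `-t miRNA -i Name` assumes a miRBase-style GFF3 annotation on the same reference the reads were aligned to. Note also that `umi_tools dedup` expects the UMI to be present in the read names, which may not hold for files delivered already cleaned.

```python
import subprocess


def build_commands(reference_fasta, fastq, annotation_gff, prefix="sample"):
    """Return the command lines for the five steps as lists of arguments."""
    index_prefix = f"{prefix}_index"
    sam = f"{prefix}.sam"
    bam = f"{prefix}.sorted.bam"
    dedup_bam = f"{prefix}.dedup.bam"
    return [
        # 1. Build the Bowtie2 index
        ["bowtie2-build", reference_fasta, index_prefix],
        # 2. Map single-end reads
        ["bowtie2", "-x", index_prefix, "-U", fastq, "-S", sam],
        # 3. Sort and index the alignments (assumed meaning of "Index")
        ["samtools", "sort", "-o", bam, sam],
        ["samtools", "index", bam],
        # 4. Remove duplicates based on UMIs
        ["umi_tools", "dedup", "-I", bam, "-S", dedup_bam],
        # 5. Count reads per annotated miRNA (htseq-count writes to stdout)
        ["htseq-count", "-f", "bam", "-t", "miRNA", "-i", "Name",
         dedup_bam, annotation_gff],
    ]


def run_pipeline(commands, counts_path):
    """Run every command in order; the last one's stdout goes to counts_path."""
    for cmd in commands[:-1]:
        subprocess.run(cmd, check=True)
    with open(counts_path, "w") as out:
        subprocess.run(commands[-1], check=True, stdout=out)
```

Building the commands as lists first makes it easy to print and inspect them on one or two FASTQ files before running anything.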

Thank you very much for your response! I've tried the pipeline you suggested. Aligning and counting against an index created from miRBase gives me results very similar to my original pipeline. Aligning to the human genome dramatically improves my mapping rate (from 5% to 40%), but after counting with miRBase the results are again very similar to the original ones. Maybe these samples are of very low quality, or have been contaminated. I don't think it is a matter of adapter removal, as the peak in sequence length is at 22-24 bp.
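The read-length check mentioned above can be reproduced with a few lines of Python. This is a minimal sketch with hypothetical function names; the 20-24 default window is an assumption based on mature miRNAs being roughly 22 nt long.

```python
import gzip
from collections import Counter


def read_lengths(path):
    """Yield the length of each read in a plain or gzipped FASTQ file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:  # sequence line of each 4-line record
                yield len(line.strip())


def length_histogram(lengths):
    """Count how many reads have each length."""
    return Counter(lengths)


def fraction_in_range(hist, low=20, high=24):
    """Fraction of reads whose length lies in [low, high]."""
    total = sum(hist.values())
    inside = sum(n for length, n in hist.items() if low <= length <= high)
    return inside / total if total else 0.0
```

For example, `fraction_in_range(length_histogram(read_lengths("sample.fastq.gz")))` gives the share of reads in the miRNA size range, with the file name as a placeholder. A peak around 22 nt is consistent with trimmed small RNA reads, but it does not by itself show that those reads are miRNAs, since fragments of other RNAs can have similar lengths.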

Did you check for contamination with FastQ Screen?

This is the result I obtained for a representative sample with FastQ Screen:

https://ibb.co/KmkkFGf

There seems to be a bit of contamination, but overall 95% of reads still have no hit.
