Question: miRNA/small RNA adapter trimming: any recommendations for adapter error rate and Phred quality cutoff?
21 months ago by
ti&te20 wrote:

I am looking for recommendations on how to trim miRNA/small RNA sequencing data, because the trimming parameters can affect the final results. The differences are not enormous, but some miRNAs are more sensitive than others to the parameters used in the trimming step.

To trim the reads I use cutadapt with a minimum read length of 15, read-end quality trimming at Phred 20 (applied before adapter removal), and an error rate of 0.1 for adapter detection (cutadapt -m 15 -q 20 -e 0.1). More stringent parameters (-q 30, -e 0.01) yield fewer mature miRNA reads, as expected, but larger differences in differential expression (DE) in the downstream analysis.
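To make the two parameters concrete, here is a minimal Python sketch of what -q and -e control. It follows the 3'-end quality trimming procedure described in the cutadapt documentation (subtract the cutoff from each quality, take partial sums from the read end, cut where the sum is most favourable) and the rule that the number of tolerated adapter errors is the matched length times the error rate, rounded down. The function names are illustrative and are not part of cutadapt.

```python
import math


def quality_trim_3prime(qualities, cutoff):
    """Return the cut index for 3'-end quality trimming.

    The read is kept as qualities[:index]. Walks from the 3' end and
    accumulates (cutoff - quality); the cut is placed where that running
    sum is largest, and the scan stops once the sum turns negative.
    """
    running = 0
    best = 0
    cut = len(qualities)
    for i in range(len(qualities) - 1, -1, -1):
        running += cutoff - qualities[i]
        if running < 0:
            break
        if running > best:
            best = running
            cut = i
    return cut


def max_adapter_errors(match_length, error_rate):
    """Errors tolerated for an adapter match of the given length."""
    return math.floor(match_length * error_rate)


if __name__ == "__main__":
    # Example qualities with a low-quality tail, trimmed at cutoff 10.
    quals = [42, 40, 26, 27, 8, 7, 11, 4, 2, 3]
    print("keep bases:", quality_trim_3prime(quals, 10))

    # A full 22 nt adapter match at -e 0.1 versus -e 0.01.
    print("errors at 0.1:", max_adapter_errors(22, 0.1))
    print("errors at 0.01:", max_adapter_errors(22, 0.01))
```

With a 22 nt adapter, -e 0.1 tolerates two errors in a full-length match while -e 0.01 tolerates none, and short partial matches at the read end must be exact in both cases. This is one reason the stricter setting leaves more reads with untrimmed adapter remnants.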

It is known that miRNA data need less stringent parameters because they contain more sequencing noise than other RNA and DNA data, so I would be grateful if you could share your experience.

sequencing rna-seq next-gen • 855 views
modified 21 months ago • written 21 months ago by ti&te20

I work with small RNA data. I think your value of -m is too short, especially if you are only looking at miRNAs. We typically use a minimum length cutoff of 18 and a maximum cutoff of 34. Otherwise, the -q and -e values you used are reasonable. Can you share some references that discuss the need for less stringent parameters when processing miRNA data?
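As a rough illustration of the 18 to 34 nt window, the sketch below filters already-trimmed reads by length in Python. In cutadapt itself the same effect should be achievable with -m 18 -M 34; the helper names and the example reads here are made up for illustration.

```python
def within_length_window(seq, min_len=18, max_len=34):
    """True if the trimmed read length falls inside the window (inclusive)."""
    return min_len <= len(seq) <= max_len


def filter_by_length(reads, min_len=18, max_len=34):
    """Keep only reads whose length lies within [min_len, max_len]."""
    return [r for r in reads if within_length_window(r, min_len, max_len)]


if __name__ == "__main__":
    # Hypothetical trimmed reads: 15 nt, 22 nt and 40 nt long.
    reads = ["A" * 15, "ACGT" * 5 + "AC", "G" * 40]
    kept = filter_by_length(reads)
    print([len(r) for r in kept])
```

Only the 22 nt read survives the default window; lowering min_len to 15 would also keep the shortest one, which is the trade-off being discussed here.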

written 21 months ago by S90

Thank you for your reply. Reported and recommended miRNA lengths differ, and some miRNA analysis tools sort aligned reads into mature miRNAs, isomiRs and miRNA hairpins, so the shorter minimum length should not be a problem for downstream analysis (see Prof. Hackenberg's presentation).

As you have probably seen, FastQC reports for small RNA data differ from those of experiments with longer reads. This results from the varying lengths of the small RNA library products, remaining adapter dimers, and low library diversity caused by highly expressed miRNAs in the samples.

Please find some publications that apply Q20 trimming before further processing.

written 21 months ago by ti&te20
21 months ago by
ahmad mousavi480
Royan Institute, Tehran, Iran
ahmad mousavi480 wrote:


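I have used miRDeep2 for pre- and post-processing. You could use the following command (mapper.pl is the miRDeep2 preprocessing script) to remove adapters:

mapper.pl reads_qseq.txt -b -h -i -j -k TCGTATGCCGTCTTCTGCTTGT -l 18 -m -s reads_collapsed.fa

Here -b assumes qseq.txt input; for FASTQ input use -e instead. The processed reads are written to the file given with -s. The relevant options are: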

-a              input file is seq.txt format
-b              input file is qseq.txt format
-c              input file is fasta format
-e              input file is fastq format
-d              input file is a config file (see miRDeep2 documentation).
                options -a, -b or -c must be given with option -d.
-g              three-letter prefix for reads (by default 'seq')
-h              parse to fasta format
-i              convert rna to dna alphabet (to map against genome)
-j              remove all entries that have a sequence that contains letters
                other than a,c,g,t,u,n,A,C,G,T,U,N
-k seq          clip 3' adapter sequence
-l int          discard reads shorter than int nts
-m              collapse reads
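For readers unfamiliar with the collapsing step (-m), a minimal Python sketch of the idea follows: identical sequences are merged into one record that carries the read count. The ID layout prefix_index_xCOUNT is an assumption modelled on how collapsed miRDeep2 reads typically look, and the function is illustrative, not the mapper.pl implementation.

```python
from collections import Counter


def collapse_reads(sequences, prefix="seq"):
    """Merge identical reads into (read_id, sequence) records.

    The read_id encodes a running index and the number of identical
    reads, e.g. seq_0_x2 (assumed layout, see note above).
    Records are ordered from most to least abundant.
    """
    counts = Counter(sequences)
    records = []
    for index, (seq, count) in enumerate(counts.most_common()):
        records.append((f"{prefix}_{index}_x{count}", seq))
    return records


if __name__ == "__main__":
    # Hypothetical clipped reads, two of them identical.
    reads = ["TGAGGTAGTAGGTTGTATAGTT", "ACGTACGTACGTACGTAC", "TGAGGTAGTAGGTTGTATAGTT"]
    for read_id, seq in collapse_reads(reads):
        print(f">{read_id}\n{seq}")
```

Because collapsing works on exact sequence identity, any change in -q or -e that shifts where reads are cut also changes which reads collapse together, so trimming parameters feed directly into the counts used downstream.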
written 21 months ago by ahmad mousavi480
21 months ago by
ti&te20 wrote:

Thank you for the suggestion, but I still can't find any data on how Phred-based end trimming and the error rate used in adapter recognition affect the results.

written 21 months ago by ti&te20