Very low alignment rate for miRNA reads against known miRNAs from miRBase

3.0 years ago

LynxLynx • 0

Hi all! I know that there are some similar topics, and I read them, but they didn't solve my issue.

I'm working with smRNA-seq data (plants). My aim is to find novel and known miRNAs, then perform differential expression (DE) analysis and look for correlations between miRNA and mRNA expression profiles.

I preprocessed the smRNA reads: 1) trimmed adapters. Going by the Illumina documentation, I found only one adapter sequence, and I removed it with cutadapt. Command:

~/.local/bin/cutadapt -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCA -o trimmed.fq.gz in.fq.gz

98.5% of reads with adapters, 41.2% total written

2) then I filtered tRNA and rRNA contamination using bbduk.sh

3) kept reads that are from 19 to 26 nt using cutadapt

Original reads: 17 483 425 sequences

Clean reads: 6 592 972 sequences

[Figure: clean reads length distribution]
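For reference, the read counts and the clean-read length distribution mentioned above can be tabulated with a short script. This is only a minimal sketch: it assumes plain 4-line FASTQ records (no wrapped sequences), and the function names and the toy reads are hypothetical, not taken from the dataset.

```python
from collections import Counter
import gzip


def length_distribution(fastq_lines):
    """Count read lengths from an iterable of FASTQ lines.

    Assumes strict 4-line records: the sequence is the 2nd line of each record.
    """
    counts = Counter()
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:
            counts[len(line.strip())] += 1
    return counts


def length_distribution_from_file(path):
    """Same as above, reading a plain or gzipped FASTQ file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return length_distribution(handle)


def retained_fraction(original_reads, clean_reads):
    """Fraction of the original reads that survived preprocessing."""
    return clean_reads / original_reads


# Toy input (hypothetical reads, 20 nt and 21 nt long)
example = [
    "@r1", "TGACAGAAGAGAGTGAGCAC", "+", "I" * 20,
    "@r2", "TCGGACCAGGCTTCATTCCCC", "+", "I" * 21,
]

for length, n in sorted(length_distribution(example).items()):
    print(length, n)

# Counts quoted in the post: 17 483 425 original, 6 592 972 clean reads
print(f"{retained_fraction(17_483_425, 6_592_972):.1%} of reads retained")
```

For plant smRNA libraries a peak around 21 nt and another around 24 nt is commonly expected, so the table gives a quick sanity check before alignment.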

And here is the issue. To identify known miRNAs, I aligned my clean reads against the precursor miRNA sequences downloaded from miRBase. I used both bowtie and bbduk.sh, but the alignment rate is very low, about 5%. What could cause this?

updated 3.0 years ago by Jeremy Leipzig 22k • written 3.0 years ago by LynxLynx • 0

Did you replace the U bases with T (e.g. sed 's/U/T/g') in miRBase sequences that you downloaded before doing the alignments?
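A minimal sketch of that conversion in Python, assuming a standard FASTA file where header lines start with `>`. Unlike a plain `sed 's/U/T/g'` over the whole file, it leaves header lines untouched, so sequence names containing a `U` are not altered. The function name and the toy record are hypothetical.

```python
def rna_to_dna_fasta(lines):
    """Yield FASTA lines with U/u replaced by T/t in sequence lines only."""
    table = str.maketrans("Uu", "Tt")
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            yield line  # header line: keep exactly as it is
        else:
            yield line.translate(table)


# Toy record (hypothetical header, RNA alphabet in the sequence)
example = [
    ">toy-MIR1 Unknown species stem-loop",
    "UGACAGAAGAGAGUGAGCAC",
]

converted = list(rna_to_dna_fasta(example))
for line in converted:
    print(line)
```

If the output is identical to the input, the file was already in the DNA alphabet and this step can be ruled out as the cause.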

3.0 years ago by GenoMax 141k

My hairpin file doesn't contain U bases at all

Thank you

3.0 years ago by LynxLynx • 0

Great. I am not sure which kit was used for your dataset, but some kits ligate a special adapter directly to the miRNA. So unless that adapter is present, the read likely does not represent a miRNA. It is worth checking whether this applies in your case.
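One rough way to check this on the raw (untrimmed) reads is to count how many contain the start of the expected adapter. The sketch below is a simplification: it looks for an exact match of the first `min_overlap` bases only, so it will undercount compared with cutadapt, which tolerates mismatches and partial adapters at the read end. The function name, the `min_overlap` default, and the toy reads are assumptions; the adapter is the one from the cutadapt command in the question and should be swapped for the kit-specific sequence if the kit documents a different one.

```python
def adapter_fraction(fastq_lines, adapter, min_overlap=12):
    """Fraction of reads containing the first `min_overlap` bases of `adapter`.

    Assumes strict 4-line FASTQ records and exact matching only.
    """
    seed = adapter[:min_overlap]
    total = 0
    with_adapter = 0
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:  # sequence line of each record
            total += 1
            if seed in line:
                with_adapter += 1
    return with_adapter / total if total else 0.0


# Adapter sequence taken from the cutadapt command in the question
ADAPTER = "AGATCGGAAGAGCACACGTCTGAACTCCAGTCA"

# Toy input (hypothetical): one read with the adapter, one without
example = [
    "@r1", "TGACAGAAGAGAGTGAGCAC" + ADAPTER[:20], "+", "I" * 40,
    "@r2", "ACGT" * 10, "+", "I" * 40,
]

print(adapter_fraction(example, ADAPTER))
```

A fraction far below what cutadapt reports would mainly reflect the exact-match simplification, so the numbers are only comparable in a rough sense.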