Question: microRNA-seq sequence length distribution after trimming
3 months ago by
szabo.marton0 wrote:

Hi! I'm doing a microRNA-seq analysis. After trimming, the sequence length distribution panel in FastQC shows two peaks, at 24 nt and 36 nt. Here is my trimming command:

cutadapt -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCA -e 0.1 -O 5 -o H28_trim_test3.fastq H28.fastq

After alignment with miRDeep2 (with the help of the + modules), I got a mapped/unmapped ratio of 0.62/0.38.

I'm just wondering whether this alignment ratio is acceptable for mammalian microRNAs, or whether I should try to improve it. Also, if I can somehow eliminate the 36 nt peak, will that improve the mapping ratio? I think so, but I haven't figured out how yet.
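
One way to double-check the two peaks outside FastQC is to tabulate read lengths directly. This is a minimal sketch, assuming a plain 4-line-per-record FASTQ; the names `read_length_histogram` and `make_record` and the toy reads are made up for illustration:

```python
from collections import Counter
import io


def read_length_histogram(fastq_handle):
    """Count reads per sequence length in a 4-line-per-record FASTQ stream."""
    counts = Counter()
    for line_number, line in enumerate(fastq_handle):
        # In 4-line FASTQ, the sequence is the second line of each record.
        if line_number % 4 == 1:
            counts[len(line.strip())] += 1
    return counts


def make_record(name, sequence):
    """Build one toy FASTQ record with a dummy quality string (illustration only)."""
    return f"@{name}\n{sequence}\n+\n{'I' * len(sequence)}\n"


# Made-up input: two 24 nt reads and one 36 nt read.
example = io.StringIO(
    make_record("r1", "A" * 24)
    + make_record("r2", "C" * 24)
    + make_record("r3", "G" * 36)
)
hist = read_length_histogram(example)
for length in sorted(hist):
    print(length, hist[length])
```

With 36 nt reads, a peak at exactly 36 nt after trimming would be consistent with reads in which no adapter was found, so comparing the histogram before and after trimming can help tell those apart.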


This alignment rate is common. From what I remember, yes, the mapping rate of the 24 bp reads is higher than that of the 36 bp reads, but the composition also differs: the 24 bp reads correspond mainly to miRNA, while the 36 bp reads correspond to piRNA.

You could filter the longer reads out of the SAM/BAM file (possibly using samtools view and then awk; there are solutions to similar problems here on Biostars), or you could filter the FASTQ and map all over again. But I wouldn't worry about them.

P.S.: a tool from the BBMap/BBTools package can probably filter both SAM/BAM and FASTQ by length.
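
The FASTQ-side filtering can also be sketched in a few lines of Python. This is only an illustration, assuming a plain 4-line-per-record FASTQ; the function names and the 18 to 30 nt window are hypothetical choices, not values from this thread:

```python
import io


def parse_fastq(handle):
    """Yield (header, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break  # end of input (or a blank line) stops the parser
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        quality = handle.readline().rstrip("\n")
        yield header, sequence, quality


def filter_by_length(in_handle, out_handle, min_len=18, max_len=30):
    """Write only reads whose length lies in [min_len, max_len]; return how many were kept."""
    kept = 0
    for header, sequence, quality in parse_fastq(in_handle):
        if min_len <= len(sequence) <= max_len:
            out_handle.write(f"{header}\n{sequence}\n+\n{quality}\n")
            kept += 1
    return kept


# Made-up input: one 24 nt read and one 36 nt read.
source = io.StringIO(
    "@short\n" + "A" * 24 + "\n+\n" + "I" * 24 + "\n"
    + "@long\n" + "C" * 36 + "\n+\n" + "I" * 36 + "\n"
)
destination = io.StringIO()
kept = filter_by_length(source, destination)
print(kept)
```

For real files, the two `io.StringIO` objects would be replaced by opened input and output files. If you re-run the trimming anyway, cutadapt's `-m` and `-M` options apply minimum and maximum length limits in the same step.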

written 3 months ago by h.mon24k

Thanks for your advice! I'll look into the details. It would be nice if we had sequenced piRNAs as well, but in my case the read lengths are only 36 nt, so I guess the chance of getting full, or at least reliable, piRNA sequences is very low. Also, the lab guys performed a size selection for 24 nt before sequencing.

written 3 months ago by szabo.marton0

What is the read length, which kit was used, and are you sure this adapter is correct? This is the standard TruSeq adapter, not the standard smallRNA adapter, if I am not mistaken. Please give some details.

modified 3 months ago • written 3 months ago by ATpoint15k

Thanks for your answer! The read length was 36 nt. I was told to use Trimmomatic to discard the adapter sequences, because it ships with the adapter sequences, but it did nothing (maybe I did something wrong). So I searched for the TruSeq adapters and found that adapter sequence. As for the kit, I don't know exactly what was used, but the sequencing was performed on an Illumina platform with a MiSeq reagent kit, and NEBNext products were used for the RNA preparation.

written 3 months ago by szabo.marton0

smallRNA preps generally require special handling of the data (mainly trimming of the adapter). Check the documentation for the exact kit used for instructions on how to handle the data.

written 3 months ago by genomax65k

I checked the kit: we used NEBNext Adaptors and Primers for Illumina (NEB #E7300). Here is the documentation with the adapter sequences: Anyway, I'm still confused about which sequence I should trim.

written 3 months ago by szabo.marton0

Depending on how this kit works (e.g. direct adapter ligation to RNA), you may want to retain only those reads that contain the adapter noted by @ATpoint and then trim the adapter off to get your RNA. Those would be the reads of interest. You can do this easily with a tool from BBMap.

modified 3 months ago • written 3 months ago by genomax65k

As this is smallRNA, I would go for the smallRNA adapter sequence. Use TGGAATTCTCGGGTGCCAAGG. Then check the adapter content with FastQC.
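
To illustrate the idea of keeping only reads that contain the adapter and then cutting it off, here is a minimal sketch. It uses exact matching only (no mismatches, unlike real trimmers), the function name `trim_3prime_adapter` and the `min_overlap` default are hypothetical, and the adapter string is simply the one quoted in this thread, so confirm it against the kit documentation before relying on it:

```python
def trim_3prime_adapter(sequence, quality, adapter, min_overlap=5):
    """Return (sequence, quality) cut at the adapter, or None if no adapter is found.

    Exact matching only: first a full-length adapter anywhere in the read,
    then an adapter prefix of at least min_overlap bases at the read's 3' end.
    """
    position = sequence.find(adapter)
    if position == -1:
        # Look for a partial adapter running off the 3' end, longest prefix first.
        longest = min(len(adapter) - 1, len(sequence))
        for overlap in range(longest, min_overlap - 1, -1):
            if sequence.endswith(adapter[:overlap]):
                position = len(sequence) - overlap
                break
    if position == -1:
        return None  # no adapter: the caller can discard this read
    return sequence[:position], quality[:position]


# Adapter as suggested in this thread (an assumption until checked against the kit).
ADAPTER = "TGGAATTCTCGGGTGCCAAGG"

# Made-up 36 nt read: a 22 nt insert followed by the first 14 nt of the adapter.
insert = "TGAGGTAGTAGGTTGTATAGTT"
read = insert + ADAPTER[:14]
result = trim_3prime_adapter(read, "I" * len(read), ADAPTER)
no_adapter = trim_3prime_adapter("A" * 36, "I" * 36, ADAPTER)
print(result, no_adapter)
```

Reads for which the function returns None are the ones that would be dropped. cutadapt offers the same behaviour through its `--discard-untrimmed` option, with mismatch tolerance on top.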

written 3 months ago by ATpoint15k