Length of miRNA reads after preprocessing steps
7.1 years ago
Maguelonne ▴ 20

Hi all,

miRNA analysis is new to me. I'm working on reads from a 55-cycle single-read sequencing run, and I think I have a problem. After the preprocessing steps (removing low-quality reads and trimming the 5' and 3' adapters), 50% of the reads are still 55 nucleotides long (that is, around 1,000,000 out of a total of 2,000,000).

I understand that these reads should be removed, as they can't be miRNAs, but is it normal for length filtering to remove so many reads? And what could these reads correspond to?

Thanks in advance for your help!

MR

miRNA seq RNA-Seq preprocessing length filtering • 5.1k views

Which adapter trimming tool did you use?


cutadapt (and the adapter sequences that appeared as overrepresented sequences in the FastQC results disappeared after adapter trimming, which means it went well, right?)

7.1 years ago
Manvendra Singh ★ 2.2k

I think a similar question was posted here:

C: miRNA seq trimming

where Ryan gave the reason.

As I suggested there, in your case I think it should be ...

##### Remove the adapter (reads without an adapter are discarded, and only trimmed reads of 20-30 nt are kept)

cutadapt --discard-untrimmed --minimum-length=20 --maximum-length=30 -a <adapter_sequence> In_seq.fastq > your_trimmed_file.fastq

##### Download ribosomal RNA and tRNA sequences and build a bowtie index from them

##### Then also remove the reads that map to the ribosomal RNA and tRNA sequences

bowtie --seedlen=23 --un output_file.fastq /path_to/bowtieindex/r_tRNA your_trimmed_file.fastq > /dev/null

Your output_file.fastq should then be cleaner to align.

HTH


Hi Manvendra, why should we allow up to 30 bp as the maximum length when a mature miRNA is at most about 24 bp long? Thanks, Robert

7.1 years ago
seta ★ 1.5k

Hi, if you did miRNA sequencing, most of your sequences should be about 20-23 bp long after trimming, but you may have sequences of up to 35 bp if you did small RNA sequencing. The reads with an unusual length (55 bp) in your data can result from adapter dimers and have to be removed. During trimming, you can set a length threshold that keeps only sequences of 15-40 bp to get rid of the unwanted sequences.
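A minimal sketch of the length check and the 15-40 bp filter described above, using only awk. The function names and the file name trimmed.fastq are placeholders chosen for illustration, not part of any tool, and the sketch assumes a standard FASTQ file with exactly four lines per record.

```shell
# Print "length<TAB>count" for every read length found in a FASTQ file.
# Assumes 4 lines per record, so the sequence is line 2 of each record.
read_length_hist() {
    awk 'NR % 4 == 2 { n[length($0)]++ }
         END { for (len in n) printf "%d\t%d\n", len, n[len] }' "$1" | sort -n
}

# Write to stdout only the records whose read length lies in [min, max].
# Arguments: <fastq> <min> <max>
filter_by_length() {
    awk -v min="$2" -v max="$3" '
        NR % 4 == 1 { header = $0 }
        NR % 4 == 2 { seq = $0 }
        NR % 4 == 3 { plus = $0 }
        NR % 4 == 0 { if (length(seq) >= min && length(seq) <= max)
                          print header "\n" seq "\n" plus "\n" $0 }
    ' "$1"
}
```

For example, `read_length_hist trimmed.fastq` would show whether a peak at 55 is present, and `filter_by_length trimmed.fastq 15 40 > filtered.fastq` would keep only the 15-40 bp reads. cutadapt's own --minimum-length and --maximum-length options do the same filtering during trimming.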


Dear Seta, I have some human non-coding RNA-seq data. To get the differential expression of miRNAs, should I keep only reads between 18 and 30 nt long before starting? Why is the average sequence length 51? Can I also use these data to get the differential expression of lncRNAs?

7.1 years ago
Maguelonne ▴ 20

I actually just found the answer: these 50% of reads correspond to phiX contamination! I didn't suspect that because I thought it was "rare" after demultiplexing, but it seems to be a well-known problem.
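For anyone with the same symptom, one possible way to find out what the full-length reads are is to list the most abundant sequences of that length and BLAST a few of them. This is only a sketch: the function name and the file name trimmed.fastq are hypothetical, and it assumes a 4-line-per-record FASTQ file.

```shell
# List the most abundant read sequences of exactly <length> nt
# as "count<TAB>sequence", most frequent first.
# Arguments: <fastq> <length> [how_many, default 10]
top_sequences_of_length() {
    awk -v len="$2" 'NR % 4 == 2 && length($0) == len' "$1" |
        sort | uniq -c | sort -k1,1nr |
        awk '{ print $1 "\t" $2 }' | head -n "${3:-10}"
}
```

For example, `top_sequences_of_length trimmed.fastq 55 5` would print the five most frequent 55-nt reads. If they turn out to be phiX, the reads could be screened against a phiX index in the same way as the rRNA/tRNA filter with bowtie --un shown earlier in this thread.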

 

 
