Question

HELP with NGS miRNA mapping: after clipping the adapter sequence, most reads are removed

8.8 years ago

illinois.ks ▴ 210

Hello,

I previously asked a question about mapping miRNAs (the mapping rate was very low, less than 0.03%).

Thank you, David! :) I was finally able to map my miRNA reads successfully.

This time I have another set of samples with the same design: 3 controls vs. 3 treated.

I followed exactly the same logic, since they were generated on the same machine:

Remove adapter sequence
Remove index sequence

This time, however, after removing the adapter sequence (TGGAATTCTCGGGTGCCAAGGAACTCCAGTCAC), the file sizes are reduced dramatically, which means most of the reads were removed.

For example, here are the original FASTQ file sizes:

c1 (265M), c2 (428M), c3 (248M), a1 (268M), a2 (344M), a3 (443M)

After removing the adapter sequence:

c1 (132M, okay), c2 (15M, weird), c3 (208M, okay), a1 (153M, okay), a2 (15M, weird), a3 (18M, weird)

When I looked at the FASTQ files for the affected samples (e.g. c2, a2, a3), many of the reads consist mostly of adapter sequence. (I am not sure why; maybe the experiments went badly? I have no information about the experiments.) I guess this is why those files shrink so much.
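One rough way to quantify "mostly adapter" is to record, for each read, the position where the adapter first appears. The sketch below is only an illustration: it assumes plain 4-line FASTQ records and exact matching of the first 10 adapter bases (`seed_len` is an arbitrary choice, and sequencing errors in the seed will be missed). The function names are hypothetical, not part of any tool.

```python
from collections import Counter
from itertools import islice

# 3' adapter quoted in the question
ADAPTER = "TGGAATTCTCGGGTGCCAAGGAACTCCAGTCAC"

def read_sequences(fastq_lines):
    """Yield the sequence line of each 4-line FASTQ record."""
    return (line.strip() for line in islice(fastq_lines, 1, None, 4))

def adapter_start_counts(sequences, adapter=ADAPTER, seed_len=10):
    """Count reads by the 0-based position where the adapter seed first
    appears in the read; -1 means the seed was not found."""
    seed = adapter[:seed_len]  # assumption: an exact prefix match is good enough
    return Counter(seq.find(seed) for seq in sequences)

# Usage (file name is hypothetical):
# with open("c2.fastq") as fh:
#     counts = adapter_start_counts(read_sequences(fh))
#     for pos, n in sorted(counts.items()):
#         print(pos, n)
```

If most reads in c2, a2 and a3 show the adapter starting within the first few bases, the library is dominated by inserts too short to be miRNAs, which would be consistent with the drop in file size.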

For the further analysis, I removed the index sequences (the first 4 and last 4 bases in my case) and then ran bowtie2.

Here is the bowtie2 result:

                        c1        c2       c3        a1        a2
Mapping rate            55.88%    7.75%    70.06%    68.14%    27.48%
Total number of reads   1196717   135524   1841729   1367558   134217

I am wondering whether I can proceed with this analysis. I have heard that the mapping rate should usually be around 50-80%; for some of my samples (c2, a2) it is much lower than that, and the total number of reads is also very low.

I would appreciate some comments on this. Is it a problem with the experiments, or something else? Can I analyze this data further?
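To see what these rates mean in absolute terms, the mapped read counts can be estimated from the table. This is a back-of-the-envelope sketch, assuming the "Total number of reads" row is the number of reads given to bowtie2 and the rate is its overall alignment rate; a3 is left out because it is not in the table.

```python
# Values copied from the bowtie2 table in the question
mapping_rate = {"c1": 0.5588, "c2": 0.0775, "c3": 0.7006, "a1": 0.6814, "a2": 0.2748}
total_reads = {"c1": 1196717, "c2": 135524, "c3": 1841729, "a1": 1367558, "a2": 134217}

# Approximate number of mapped reads per sample (rate x total, rounded)
mapped = {s: round(mapping_rate[s] * total_reads[s]) for s in mapping_rate}

for sample, n in mapped.items():
    print(f"{sample}: ~{n} mapped reads")
```

Under these assumptions c2 ends up with only about 10,500 mapped reads, versus several hundred thousand or more for c1, c3 and a1.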

miRNA mapping RNA-seq • 2.6k views

updated 16 months ago by Ram 43k • written 8.8 years ago by illinois.ks ▴ 210

Can you please provide two more things: 1) the number of reads before and after clipping, and 2) the read length distribution?

It should be enough if you just send it for one sample, e.g. c2.
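Both numbers can be obtained with a few lines of Python. This is a minimal sketch, assuming uncompressed 4-line FASTQ records; the function name and the file names in the usage comment are hypothetical.

```python
from collections import Counter

def length_distribution(fastq_lines):
    """Return (number of reads, Counter mapping read length -> count)
    for an iterable of lines holding 4-line FASTQ records."""
    lengths = Counter()
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:  # second line of each record is the sequence
            lengths[len(line.rstrip("\n"))] += 1
    return sum(lengths.values()), lengths

# Usage (file names are hypothetical):
# for name in ("c2.fastq", "c2.clipped.fastq"):
#     with open(name) as fh:
#         n_reads, dist = length_distribution(fh)
#     print(name, n_reads)
#     for length, count in sorted(dist.items()):
#         print(count, length)
```

Running it on the file before and after clipping gives the read counts and the "count length" table for each.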

8.8 years ago by David Langenberger 11k

Thank you, David.

Yes, I checked the read length distribution for c2.

Before clipping the adapter sequence:

3463306 36

After clipping the adapter sequence:

=================================

So originally I had about 3.46 million reads, all 36 bp long.

After clipping, I have only 135524 reads (summed over all lengths).

====================================
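As a quick check of how severe the loss is, the two counts above give the fraction of c2 reads that survive clipping. The snippet only restates the numbers from this post.

```python
reads_before = 3_463_306  # c2 reads before adapter clipping
reads_after = 135_524     # c2 reads after adapter clipping

retained = reads_after / reads_before
print(f"{retained:.1%} of reads survive clipping")
```

That is roughly 3.9% retained, so about 96% of the c2 reads are discarded at the clipping step.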

FYI: since I also need to clip the first 4 and last 4 index bases from these reads, my read length distribution will shift toward shorter lengths accordingly (4 bases off each end).
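The index trimming step described here can be sketched as follows. This is an illustration only: `trim_ends` is a hypothetical helper, and the defaults of 4 bases per end come from this post. Reads left with fewer than `min_len` bases are dropped by returning `None`.

```python
def trim_ends(seq, qual, n5=4, n3=4, min_len=1):
    """Remove n5 bases from the 5' end and n3 bases from the 3' end of a read
    and its quality string. Return None if fewer than min_len bases remain."""
    if len(seq) - n5 - n3 < min_len:
        return None
    end = len(seq) - n3
    return seq[n5:end], qual[n5:end]

# Example: a 16-base clipped read keeps its middle 8 bases.
print(trim_ends("NNNNACGTACGTNNNN", "I" * 16))
```

Note that trimming 4 bases from each end shortens every surviving read by 8 bases in total, so very short clipped reads disappear entirely at this step.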

updated 17 months ago by Ram 43k • written 8.8 years ago by illinois.ks ▴ 210

I concluded that something went wrong with this experiment, though I don't know what, since there is nothing more to fix on the analysis side. ;(

We decided to do the experiments again!

updated 17 months ago by Ram 43k • written 8.8 years ago by illinois.ks ▴ 210

As a sanity check, you can run FastQC and look at the adapter content plot; it should match the reduction in sequences.

8.8 years ago by Asaf 10k

I will check with FastQC too. Thanks!

updated 17 months ago by Ram 43k • written 8.8 years ago by illinois.ks ▴ 210