Help with NGS miRNA mapping: after clipping the adapter sequence, most of the reads are removed
7.6 years ago
illinois.ks ▴ 200

Hello,

I previously asked a question about mapping miRNAs ("miRNA mapping rate is very low.. (less than 0.03%)").

Thank you David! :) Finally I could successfully map my miRNA reads.

This time I have another set of samples with the same design: 3 controls vs. 3 treated.

I followed exactly the same procedure, since they were generated on the same machine.

2. Remove index sequence

But this time, after removing the adapter sequence (TGGAATTCTCGGGTGCCAAGGAACTCCAGTCAC), the file sizes are reduced dramatically, which means most of the reads are removed.

For example, here are the FASTQ file sizes of the original files:

c1 (265M), c2 (428M), c3 (248M), a1 (268M), a2 (344M), a3 (443M)

And after clipping the adapter:

c1 (132M, okay), c2 (15M, weird), c3 (208M, okay), a1 (153M, okay), a2 (15M, weird), a3 (18M, weird)

When I looked at the FASTQ files for the affected samples (c2, a2, a3), many of the reads consist mostly of adapter sequence. I am not sure why; maybe the experiments went badly, but I have no information about them. I guess this is the reason most of the reads in those files are discarded after clipping.
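To illustrate why reads made up mostly of adapter vanish after clipping, here is a minimal sketch of exact-match 3' adapter clipping. It is not the tool used in this thread: `clip_adapter`, `keep_read`, the `min_overlap` of 5 and the minimum length of 15 are all assumptions chosen for illustration (real clippers also tolerate mismatches).

```python
ADAPTER = "TGGAATTCTCGGGTGCCAAGGAACTCCAGTCAC"  # adapter quoted in the post

def clip_adapter(seq, adapter=ADAPTER, min_overlap=5):
    """Remove the 3' adapter, or a prefix of it at the read end (exact match only)."""
    for i in range(len(seq) - min_overlap + 1):
        tail = seq[i:]
        # Either the full adapter starts at position i, or the read ends
        # partway through the adapter.
        if tail.startswith(adapter) or adapter.startswith(tail):
            return seq[:i]
    return seq  # no adapter found, read is left untouched

def keep_read(seq, min_len=15):
    """Return the clipped read, or None if it is too short (min_len is an assumption)."""
    clipped = clip_adapter(seq)
    return clipped if len(clipped) >= min_len else None

insert = "TGAGGTAGTAGGTTGTATAGTT"           # a 22-nt insert, for illustration
good_read = (insert + ADAPTER)[:36]          # insert followed by 14 nt of adapter
dimer_read = (ADAPTER + "AAA")[:36]          # read that starts with the adapter
```

Under these assumptions `keep_read(good_read)` returns the 22-nt insert, while `keep_read(dimer_read)` returns `None` because nothing is left once the adapter is removed. A library dominated by reads like `dimer_read` therefore shrinks drastically after clipping.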

I then continued the analysis (removing the index sequences, i.e. the first 4 and last 4 bases in my case) and ran bowtie2.
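The index removal step described above could be sketched as follows. This is only an assumed implementation: `trim_ends` is a hypothetical helper, and dropping reads that are too short to contain anything besides the two 4-base indexes is my own choice.

```python
def trim_ends(seq, qual, n5=4, n3=4):
    """Drop n5 bases from the 5' end and n3 bases from the 3' end of a read.

    The quality string is trimmed the same way so both stay the same length.
    Returns None when nothing would remain after trimming.
    """
    if len(seq) <= n5 + n3:
        return None
    end = len(seq) - n3
    return seq[n5:end], qual[n5:end]
```

For example, `trim_ends("NNNNACGTACGTNNNN", "IIII########IIII")` gives `("ACGTACGT", "########")`.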

Here is the result of bowtie2.

                        c1        c2       c3        a1        a2
Mapping rate            55.88%    7.75%    70.06%    68.14%    27.48%
Total number of reads   1196717   135524   1841729   1367558   134217


I am wondering whether I can proceed with this analysis. I have heard that the mapping rate should usually be around 50-80%. For c2 and a2 it is much lower than that, and the total number of reads is also very small.

I would appreciate some comments on this. Is it a problem with the experiments, or something else? Can I analyze these samples further?

miRNA mapping RNA-seq • 2.2k views

Can you please provide two more things: 1) the number of reads before and after clipping, and 2) the read length distribution?

It should be enough to send them for one sample, e.g. c2.
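Both numbers requested above can be obtained from a FASTQ file in one pass. The following is a minimal sketch, assuming plain four-line FASTQ records with no wrapped sequence lines; `read_length_histogram` is a hypothetical helper name.

```python
from collections import Counter

def read_length_histogram(handle):
    """Count reads per length from an open FASTQ handle (4 lines per record)."""
    counts = Counter()
    for i, line in enumerate(handle):
        if i % 4 == 1:  # the second line of each record is the sequence
            counts[len(line.strip())] += 1
    return counts
```

Usage, with `c2.fastq` as an assumed file name: open the file with `open("c2.fastq")`, pass the handle to `read_length_histogram`, then print `sorted(counts.items())` for the distribution and `sum(counts.values())` for the total number of reads. Running it on the file before and after clipping gives the two counts.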


Thank you, David.

Yes, I checked the read length distribution for c2 (number of reads, then read length).

Before clipping the adapter sequence:

3463306 36


After clipping the adapter sequence:

1049 15
1613 16
2055 17
2051 18
6626 19
5612 20
9208 21
4478 22
8396 23
5625 24
6632 25
8194 26
4760 27
7259 28
6845 29
6794 30
8149 31
40178 36


=================================

After clipping, I only have 135524 reads (the sum of all the counts above).

====================================

FYI: since I also need to clip the first 4 and last 4 bases (the index sequences) from these reads, each read will end up 8 nt shorter (4 from each end), so the read length distribution will shift down accordingly.
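As a quick check of the c2 numbers above, this short sketch recomputes the total after clipping and the fraction of reads that survived. The counts are copied from the post; the variable names are mine.

```python
reads_before = 3463306  # all 36 nt long before clipping

# read length -> number of reads after clipping (c2)
after = {
    15: 1049, 16: 1613, 17: 2055, 18: 2051, 19: 6626, 20: 5612,
    21: 9208, 22: 4478, 23: 8396, 24: 5625, 25: 6632, 26: 8194,
    27: 4760, 28: 7259, 29: 6845, 30: 6794, 31: 8149, 36: 40178,
}

reads_after = sum(after.values())
retained = reads_after / reads_before       # fraction of reads kept
full_length = after[36] / reads_after       # kept reads still 36 nt long

print(f"reads after clipping: {reads_after}")
print(f"retained: {retained:.1%}")
print(f"still 36 nt (no adapter removed): {full_length:.1%}")
```

The total comes out to 135524, matching the figure above. That is roughly 3.9% of the original reads, and about 30% of the surviving reads are still 36 nt long, meaning no adapter was removed from them.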


I concluded that something went wrong with this experiment. I don't know why, but on the analysis side there is nothing wrong and nothing more I can do. ;(

We decided to repeat the experiments!


As a sanity check, you can run FastQC and look at the adapter content plot; it should match the reduction in the number of sequences.


I will check with FastQC too. Thanks!