How to count the number of reads aligned to rRNA
1
1
Entering edit mode
3.7 years ago
Javad ▴ 150

Dear all,

I am working on an organism that has 10 copies of rRNA genes on its genome. I would like to see the percentage of reads that are aligned to ribosomal RNA.

After alignment, rRNA reads align to every and each of these genes and if I perform the counting for reads that are not uniquely aligned to the genome, each one of these reads are counted 10 times. How can I count each read only once?

Any idea is highly appreciated.

Thanks

RNA-Seq • 2.2k views
ADD COMMENT
0
Entering edit mode

How can I count each read only once?

Extract reads mapped to the rRNA. Do a grep "^@SEQUENCER_ID" | sort | unique to get the unique Fastq ID's. Find the % comparing to total number of reads.

Not sure what aligner you used but if you want to align a multi-mapping read only once then you can use bbmap.sh option ambig=random. This will select one top scoring site randomly allowing you to count a read only one time.

ambiguous=best          (ambig) Set behavior on ambiguously-mapped reads (with
                        multiple top-scoring mapping locations).
                        best    (use the first best site)
                        toss    (consider unmapped)
                        random  (select one top-scoring site randomly)
                        all     (retain all top-scoring sites)
ADD REPLY
2
Entering edit mode
3.7 years ago
jkkbuddika ▴ 190

Hi, I am not sure whether this is the option that works best for you. But, I suppose you can get what you want using Tagdust2. Basically you provide the sequences of rRNA genes (I usually generate this using BioMart). I have used this in the past to remove rRNA reads before genome alignment. Run Tagdust2 and then take a look at the TagDust2 output summary file. If you do this following math you will get the number you want.

rRNA reads = total input reads - successfully extracted

When you do this, these rRNA reads will be saved on to a separate .fq file which you can use for any other downstream analyses you want to perform. May be an advantage for you.

To run Tagdust2:

tagdust -t 8 -1 R:N -o /path/to/output/ -ref /path/to/rRNA.txt -fe 2 /path/to/input.fastq

I hope this will help!

ADD COMMENT

Login before adding your answer.

Traffic: 2710 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6