Question: How to count the number of reads aligned to rRNA
1
gravatar for Javad
7 months ago by
Javad130
Javad130 wrote:

Dear all,

I am working on an organism that has 10 copies of rRNA genes on its genome. I would like to see the percentage of reads that are aligned to ribosomal RNA.

After alignment, rRNA reads align to every and each of these genes and if I perform the counting for reads that are not uniquely aligned to the genome, each one of these reads are counted 10 times. How can I count each read only once?

Any idea is highly appreciated.

Thanks

rna-seq • 349 views
ADD COMMENTlink written 7 months ago by Javad130

How can I count each read only once?

Extract reads mapped to the rRNA. Do a grep "^@SEQUENCER_ID" | sort | unique to get the unique Fastq ID's. Find the % comparing to total number of reads.

Not sure what aligner you used but if you want to align a multi-mapping read only once then you can use bbmap.sh option ambig=random. This will select one top scoring site randomly allowing you to count a read only one time.

ambiguous=best          (ambig) Set behavior on ambiguously-mapped reads (with
                        multiple top-scoring mapping locations).
                        best    (use the first best site)
                        toss    (consider unmapped)
                        random  (select one top-scoring site randomly)
                        all     (retain all top-scoring sites)
ADD REPLYlink modified 7 months ago • written 7 months ago by GenoMax96k
1
gravatar for jkkbuddika
7 months ago by
jkkbuddika160
Bloomington, USA
jkkbuddika160 wrote:

Hi, I am not sure whether this is the option that works best for you. But, I suppose you can get what you want using Tagdust2. Basically you provide the sequences of rRNA genes (I usually generate this using BioMart). I have used this in the past to remove rRNA reads before genome alignment. Run Tagdust2 and then take a look at the TagDust2 output summary file. If you do this following math you will get the number you want.

rRNA reads = total input reads - successfully extracted

When you do this, these rRNA reads will be saved on to a separate .fq file which you can use for any other downstream analyses you want to perform. May be an advantage for you.

To run Tagdust2:

tagdust -t 8 -1 R:N -o /path/to/output/ -ref /path/to/rRNA.txt -fe 2 /path/to/input.fastq

I hope this will help!

ADD COMMENTlink modified 7 months ago • written 7 months ago by jkkbuddika160
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1777 users visited in the last hour
_