Question: To know % of reads matching rRNA genes in my sample
0
gravatar for frank1987lee1987
3.7 years ago by
United States
frank1987lee198710 wrote:

Dear All,

I want to know % of reads matching to rRNA genes. So I have been suggested to download rRNA genes from ensembl. Then I have created the index files for rRNA fasta file. Finally using bowtie2, I have aligned my reads in fastq format against these rRNA index files. Below is the bowtie2 output I received after running the following command.

Frank$ bowtie2 -N 0 -L 15 -x rRNA_genes -1 Project/Sample/DH558-1_GTGGCC_L005_R1.all.fastq.gz -2 Project/Sample/DH558-1_GTGGCC_L005_R2.all.fastq.gz -S Project/DH558-1.sam

66113117 reads; of these:

  66113117 (100.00%) were paired; of these:

    66061268 (99.92%) aligned concordantly 0 times

    58 (0.00%) aligned concordantly exactly 1 time

    51791 (0.08%) aligned concordantly >1 times

    ----

    66061268 pairs aligned concordantly 0 times; of these:

      13 (0.00%) aligned discordantly 1 time

    ----

    66061255 pairs aligned 0 times concordantly or discordantly; of these:

      132122510 mates make up the pairs; of these:

        132068318 (99.96%) aligned 0 times

        13629 (0.01%) aligned exactly 1 time

        40563 (0.03%) aligned >1 times

0.12% overall alignment rate

-------------------------------------------------------------------------------------------------------------------------------------------------------

What does it mean - aligned concordantly 0,1, >1 times?

0.12% overall alignment rate - Does this 0.12% refer to percentage of reads mapping to rRNA genes?

rrna genes rna-seq bowtie2 • 1.6k views
ADD COMMENTlink modified 3.7 years ago by michael.ante3.3k • written 3.7 years ago by frank1987lee198710
0
gravatar for Devon Ryan
3.7 years ago by
Devon Ryan90k
Freiburg, Germany
Devon Ryan90k wrote:

Yes, 0.12% is the percent that aligned to rRNA.

For your other question just search the site for "align concordant discordant" and you'll get plenty of posts talking about what that means.

ADD COMMENTlink written 3.7 years ago by Devon Ryan90k

Thanks Devon for confirming it. I believe, the same approach can be applied to know percentage of reads associated to any X gene. Instead of rRNA gene fasta sequence, I need to take fasta sequence of my gene of interest, first to build index and then align to reads. 

ADD REPLYlink written 3.7 years ago by frank1987lee198710

It would be better to align against the whole genome and then count alignments to your gene. By restricting what you align to you end up increasing the false-positive alignment rate (though the increase may be insignificant in many cases).

ADD REPLYlink written 3.7 years ago by Devon Ryan90k

Thanks Davon, yeah that need to be considered.
 

ADD REPLYlink written 3.7 years ago by frank1987lee198710
0
gravatar for michael.ante
3.7 years ago by
michael.ante3.3k
Austria/Vienna
michael.ante3.3k wrote:

Hi Frank,

I would additionally map against the NCBI annotation; I see sometimes a bit more there.
And don't forget about the mitochondrial rRNA; it is separately annotated in ENSEMBL and it has a short polyA-tail.

Cheers,

Michael

ADD COMMENTlink written 3.7 years ago by michael.ante3.3k

Thanks Michael. Do you mean (NCBI annotation) comparing with GTF file?
 

ADD REPLYlink written 3.7 years ago by frank1987lee198710

I mean to download the twenty-something rRNA sequences from Nuccore (don't forget the two mt-rRNAs) as FASTA files and repeat the mapping you've done.

Regarding Devon's remark, please denote that reads from rRNAs are very often multi-mapping reads (they are one class in the repeat-masker model) and are under-represented while analysing the genome-mapped reads. On the one hand, they are discarded by the aligner due to too many mappings or they are not counted in e.g. htseq-count.

ADD REPLYlink written 3.7 years ago by michael.ante3.3k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1683 users visited in the last hour