Am I removing rRNA contamination the right way?
3
0
6.4 years ago
A ★ 4.0k

Hello all,

I downloaded a FASTA file containing the rRNA genes of my organism of interest, built an index with the rRNA genes as the reference, and mapped the trimmed FASTQ reads to that index. Then I built a second index from the coding-gene sequences and mapped the reads left unmapped by the previous step to this new index.

Do you think I am wasting time, and is there another way to get rid of rRNA contamination?

Thank you.

myposts ribo-seq rRNA • 2.7k views
5
6.4 years ago
seta ★ 1.5k

Try the SortMeRNA tool; it is definitely easier than your approach.

0
6.4 years ago

If you want to know the % of rRNA contamination, one approach would be to include everything in the reference and later count and remove the reads mapping to rRNA from the SAM/BAM file.

If you just want to get rid of rRNA reads, just don't include the rRNA in the reference genome. Those reads will remain unmapped.
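
As a rough illustration of the first approach (map to a combined reference, then count and remove the rRNA hits), here is a minimal Python sketch. It assumes the rRNA sequences in the reference have names you can list; the names `rRNA_18S` and `rRNA_28S` and the demo records are made up, and it reads plain SAM text rather than BAM.

```python
def split_rrna(sam_lines, rrna_names):
    """Return (kept_lines, n_rrna, n_total) for an iterable of SAM lines.

    Header lines (starting with '@') are kept untouched. A record counts
    as rRNA when its RNAME (column 3) is in rrna_names. Secondary and
    supplementary records (FLAG bits 0x100, 0x800) are dropped and not
    counted, so each single-end read contributes once to the totals.
    """
    kept, n_rrna, n_total = [], 0, 0
    for line in sam_lines:
        if line.startswith("@"):
            kept.append(line)
            continue
        fields = line.rstrip("\n").split("\t")
        flag, rname = int(fields[1]), fields[2]
        if flag & 0x900:
            continue
        n_total += 1
        if rname in rrna_names:
            n_rrna += 1
        else:
            kept.append(line)
    return kept, n_rrna, n_total


# Made-up records: two rRNA hits, one gene hit, one unmapped read.
demo = [
    "@SQ\tSN:rRNA_18S\tLN:1869",
    "r1\t0\trRNA_18S\t10\t42\t30M\t*\t0\t0\t*\t*",
    "r2\t0\tgeneA\t5\t42\t30M\t*\t0\t0\t*\t*",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    "r4\t16\trRNA_28S\t7\t1\t30M\t*\t0\t0\t*\t*",
]
kept, n_rrna, n_total = split_rrna(demo, {"rRNA_18S", "rRNA_28S"})
print(f"{n_rrna}/{n_total} reads ({100 * n_rrna / n_total:.1f}%) map to rRNA")
```

For real data you would pass an open SAM file handle instead of the `demo` list and write `kept` back out.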

0

thank you...

0

Can you please elaborate more on how to calculate contamination? I have a file that shows the intersect between TSS CAGE data and rRNA overlaps. I am not sure how to measure rRNA contamination from this. I would like to do so in R.

0

Show the first few lines of the file.

0
      V1        V2        V3                         V4 V5 V6
1  chr1 108113121 108113122 chr1:108113121-108113122,-  3  -
2  chr1 108113470 108113471 chr1:108113470-108113471,-  1  -
3  chr1 237766677 237766678 chr1:237766677-237766678,+  1  +
4  chr1  91853110  91853111   chr1:91853110-91853111,-  1  -
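
One way to turn a table like this into a contamination figure is to sum the tag counts of the CAGE positions that overlap rRNA and divide by the total tag count over all CAGE positions. The sketch below assumes that column V5 holds the tag count for each position and that the total comes from the full, un-intersected CAGE file; neither is confirmed by the excerpt, and the total of 1000 is invented. It is written in Python rather than R, but the same arithmetic carries over.

```python
def rrna_tag_fraction(overlap_rows, total_tags):
    """Fraction of CAGE tags that fall on rRNA.

    overlap_rows: whitespace-separated BED-like rows (chrom, start, end,
    name, score, strand); the score column is ASSUMED to be a tag count.
    total_tags: tag count summed over the whole CAGE file (assumed known).
    """
    rrna_tags = sum(int(row.split()[4]) for row in overlap_rows)
    return rrna_tags / total_tags


# The four rows shown above, without the R row index.
rows = [
    "chr1 108113121 108113122 chr1:108113121-108113122,- 3 -",
    "chr1 108113470 108113471 chr1:108113470-108113471,- 1 -",
    "chr1 237766677 237766678 chr1:237766677-237766678,+ 1 +",
    "chr1 91853110 91853111 chr1:91853110-91853111,- 1 -",
]
fraction = rrna_tag_fraction(rows, total_tags=1000)  # 1000 is a made-up total
print(f"{100 * fraction:.2f}% of tags overlap rRNA")
```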

0
Using the command below, I first mapped the reads to the rRNA index:

BT2/bowtie2 -N 0 -L 15 -x rRNA --un SRR1211041_trimmed_unmapped.fastq -U SRR1211041_trimmed.fastq -S mapped_and_unmapped.sam

This leaves the unmapped reads in SRR1211041_trimmed_unmapped.fastq, which I then aligned to the index built from the coding-gene sequences using this syntax (here -x should name the coding-gene index, not rRNA):

BT2/bowtie2 -x rRNA -U SRR1211041_trimmed_unmapped.fastq -S my.sam
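
The two-step idea in this post (map to rRNA first, then map the leftover reads to the real target) can be sketched as a small Python helper that only builds the bowtie2 command lines. The index name `coding_genes` and the output SAM names are hypothetical; the flags are the ones used in the post.

```python
def depletion_commands(reads, rrna_index, target_index, prefix):
    """Build argv lists for the two bowtie2 steps; nothing is executed."""
    unmapped = f"{prefix}_unmapped.fastq"
    # Step 1: align to rRNA and write the reads that fail to align to `unmapped`.
    step1 = ["bowtie2", "-N", "0", "-L", "15", "-x", rrna_index,
             "--un", unmapped, "-U", reads, "-S", f"{prefix}_rRNA.sam"]
    # Step 2: align only the rRNA-depleted reads to the target index.
    step2 = ["bowtie2", "-x", target_index, "-U", unmapped,
             "-S", f"{prefix}_target.sam"]
    return step1, step2


step1, step2 = depletion_commands(
    reads="SRR1211041_trimmed.fastq",
    rrna_index="rRNA",
    target_index="coding_genes",  # hypothetical index name
    prefix="SRR1211041_trimmed",
)
for cmd in (step1, step2):
    print(" ".join(cmd))
```

To actually run the steps, each list could be passed to `subprocess.run(cmd, check=True)`, assuming bowtie2 is on the PATH.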

0

Hello Goutham, I am new to the bioinformatics field. I want to know the % of reads matching rRNA genes. Can you please give me the steps involved and explain how to do it?

Also, can I get the % of reads from any specific gene?

0

For the % of reads matching rRNA genes, you first index the whole rRNA.fasta (you can get this FASTA file from Ensembl), then map your reads against it; from the result you can find the percentage of reads mapped to the rRNA genes.
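
Once the reads are mapped, the percentage per reference sequence can be read off the SAM output. The following is a minimal sketch, assuming plain SAM text and made-up reference names (`RNA18S`, `RNA28S`); the same breakdown gives the % of reads for any specific gene, provided that gene is a sequence in the index.

```python
from collections import Counter


def percent_per_reference(sam_lines):
    """Percent of reads per reference name; unmapped reads are pooled under '*'."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):        # skip header lines
            continue
        fields = line.split("\t")
        flag = int(fields[1])
        if flag & 0x900:                # skip secondary/supplementary records
            continue
        counts["*" if flag & 0x4 else fields[2]] += 1
    total = sum(counts.values())
    return {name: 100 * n / total for name, n in counts.items()}


# Made-up records: three mapped reads and one unmapped read.
demo = [
    "r1\t0\tRNA18S\t10\t42\t30M\t*\t0\t0\t*\t*",
    "r2\t0\tRNA18S\t12\t42\t30M\t*\t0\t0\t*\t*",
    "r3\t0\tRNA28S\t5\t42\t30M\t*\t0\t0\t*\t*",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]
pct = percent_per_reference(demo)
print(pct)
```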

0

Dear Fereshteh, below is the bowtie2 output I received after running the following command.

What does "aligned concordantly 0, 1, >1 times" mean?

0.12% overall alignment rate: does this 0.12% refer to the percentage of reads mapping to rRNA genes?

Frank$ bowtie2 -N 0 -L 15 -x rRNA_genes -1 Project/Sample/DH558-1_GTGGCC_L005_R1.all.fastq.gz -2 Project/Sample/DH558-1_GTGGCC_L005_R2.all.fastq.gz -S Project/DH558-1.sam

66113117 (100.00%) were paired; of these:
66061268 (99.92%) aligned concordantly 0 times
58 (0.00%) aligned concordantly exactly 1 time
51791 (0.08%) aligned concordantly >1 times
----
66061268 pairs aligned concordantly 0 times; of these:
13 (0.00%) aligned discordantly 1 time
----
66061255 pairs aligned 0 times concordantly or discordantly; of these:
132122510 mates make up the pairs; of these:
132068318 (99.96%) aligned 0 times
13629 (0.01%) aligned exactly 1 time
40563 (0.03%) aligned >1 times
0.12% overall alignment rate
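
The 0.12% figure can be reproduced from the counts in this summary. The sketch below assumes the usual reading of bowtie2's paired-end report: every concordantly or discordantly aligned pair contributes two aligned mates, and mates from the remaining pairs that aligned on their own contribute one each. Because the index here contains only rRNA genes, the result is the share of mates that aligned to rRNA.

```python
def overall_alignment_rate(pairs, conc_1, conc_multi, disc_1, mate_1, mate_multi):
    """Overall alignment rate (percent) from a bowtie2 paired-end summary."""
    # Two mates per aligned pair, plus singly aligned mates from unaligned pairs.
    aligned_mates = 2 * (conc_1 + conc_multi + disc_1) + mate_1 + mate_multi
    return 100 * aligned_mates / (2 * pairs)


# Counts copied from the summary above.
rate = overall_alignment_rate(
    pairs=66113117, conc_1=58, conc_multi=51791,
    disc_1=13, mate_1=13629, mate_multi=40563,
)
print(f"{rate:.2f}%")  # 0.12%
```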

0

Frank, I am also new to NGS, but I think you are right: in total, 0.12% of the reads mapped to the rRNA genes. As for "concordantly", in bowtie2 it means that both mates of a pair aligned with the expected relative orientation and distance. The 0, 1, and >1 times categories say how many such alignments a pair has: none, exactly one, or several (multimapping). I think multimapping is more common in eukaryotes because of repeats in the genome, so short reads tend to align to more than one place.

1

Thanks Fereshteh for your suggestions. I have created a new post based on this to confirm our understanding.

Hopefully somebody confirms it :)

0
6.3 years ago

Thanks Fereshteh for your help. Currently, I am running the bowtie2 alignment of my FASTQ reads against the rRNA gene index. Once it is completed, I will go through the output and get back to you.

0

Great job, Frank.