Question: rRNA in human
20 months ago by
archie70
India
archie70 wrote:

I am working on human RNA-seq data and am trying to build an rRNA database in order to remove rRNA contamination. I explored various resources, such as the SILVA rRNA database, the UCSC browser (to get the rRNA gene_type), and Ensembl BioMart. In the SILVA database, I found the following counts of rRNA sequences:

LSU128 dataset

SILVA - 3198
SILVA Ref - none
EMBL - 104
RDP - none

SSU128 dataset

SILVA - 2662 (human + other organisms)
SILVA Ref - 1999 (human + other)
SILVA Ref NR - 353 (human _ref) (NR denotes the non-redundant dataset)
Greengenes - none
RDP - none

I downloaded all these datasets and ended up with approximately 1500 sequences after removing duplicates.
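The duplicate-removal step described above could be sketched roughly as follows. This is only a minimal illustration, not the procedure actually used in the post: the function names are hypothetical, and treating U and T as equivalent (SILVA distributes sequences in RNA notation) is an assumption about what should count as a duplicate.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def parse_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def deduplicate(records: Iterable[Record]) -> List[Record]:
    """Keep the first record seen for each distinct sequence.

    Assumption: sequences are compared case-insensitively and with U
    treated as T, so RNA and DNA spellings of one sequence collapse.
    """
    seen = set()
    unique = []
    for header, seq in records:
        key = seq.upper().replace("U", "T")
        if key not in seen:
            seen.add(key)
            unique.append((header, seq))
    return unique
```

With a real download, something like `deduplicate(parse_fasta(open("silva_human.fasta")))` would apply it, where the file name is a placeholder. Note that this only removes exact duplicates; near-identical sequences would need clustering instead.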

On the other hand, from the UCSC browser I obtained a list of approximately 560 rRNA sequences.

Can anybody suggest which set I should use for the next step, i.e. constructing a SortMeRNA database to remove rRNA contamination from human RNA-seq reads?

I will appreciate all suggestions.

rna-seq • 1.2k views
modified 20 months ago by genomax65k • written 20 months ago by archie70

Also see this thread http://seqanswers.com/forums/showthread.php?t=41868

You may be able to get rRNA from GENCODE too (biotype = rRNA): https://www.gencodegenes.org/gencode_biotypes.html
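As a rough sketch of what selecting by biotype might look like, the snippet below scans GTF lines and keeps gene records whose `gene_type` attribute equals `rRNA`. The function name is hypothetical, and it assumes the GENCODE attribute name `gene_type` (Ensembl GTFs use `gene_biotype` instead).

```python
import re
from typing import Iterable, Iterator, Tuple

# GTF attributes look like: key "value"; key "value"; ...
_ATTR = re.compile(r'(\S+) "([^"]*)"')

Gene = Tuple[str, int, int, str, str]  # (chrom, start, end, strand, gene_id)


def genes_by_biotype(gtf_lines: Iterable[str], biotype: str = "rRNA") -> Iterator[Gene]:
    """Yield gene-level GTF records whose gene_type matches `biotype`.

    Coordinates are returned as in the GTF (1-based, inclusive).
    """
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # header / comment line
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9 or fields[2] != "gene":
            continue  # skip malformed lines and non-gene features
        attrs = dict(_ATTR.findall(fields[8]))
        if attrs.get("gene_type") == biotype:
            yield (fields[0], int(fields[3]), int(fields[4]),
                   fields[6], attrs.get("gene_id", ""))
```

GENCODE lists mitochondrial rRNAs under a separate biotype, `Mt_rRNA`, so a second pass with `biotype="Mt_rRNA"` would be needed to include them. The coordinates still have to be turned into sequences (for example by extracting them from the genome FASTA) before they can serve as a filtering reference.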

written 20 months ago by Santosh Anand4.7k

Hello Santosh Anand,

I worked with the Rfam database and found the following rRNA and tRNA entries:

5S - 615 (human filtering using RF00001 expert_db:"Rfam" AND TAXONOMY:"9606" AND rna_type:"rRNA")
LSU and 5.8S - 707 (human filtering using RF02543 expert_db:"Rfam" AND TAXONOMY:"9606" AND rna_type:"rRNA")
SSU - 558 (human filtering using RF01960 expert_db:"Rfam" AND TAXONOMY:"9606" AND rna_type:"rRNA")
tRNA - 994 (human filtering using RF00005 expert_db:"Rfam" AND TAXONOMY:"9606" AND rna_type:"tRNA")

I will merge all these FASTA files and create a database to remove rRNA and tRNA contamination from the RNA-seq reads. But I have one more question: in GtRNAdb (http://gtrnadb.ucsc.edu/), the tRNA count for the human dataset is 610. Why do the tRNA counts differ? And which would be the better choice:

1. using the Rfam rRNA dataset, or
2. using the rRNA + tRNA entries present in the GTF file?

Many studies have reported using the Rfam database for this kind of analysis.
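The merge step mentioned above might look something like the sketch below, which combines records from several sources into one list and prefixes each identifier with a source label so that every entry stays unique and traceable. The function names and labels (for example `rfam_5S`) are hypothetical, and the `label|id` header scheme is just one possible convention.

```python
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def merge_sources(sources: Dict[str, Iterable[Record]]) -> List[Record]:
    """Concatenate records from all sources under "label|id" headers.

    Only the first whitespace-delimited token of each header is kept as
    the ID; repeated IDs within a label get a numeric suffix (_2, _3, ...).
    """
    merged = []
    counts: Dict[str, int] = {}
    for label, records in sources.items():
        for header, seq in records:
            tokens = header.split()
            new_id = f"{label}|{tokens[0] if tokens else 'unnamed'}"
            n = counts.get(new_id, 0)
            counts[new_id] = n + 1
            if n:
                new_id = f"{new_id}_{n + 1}"
            merged.append((new_id, seq))
    return merged


def to_fasta(records: Iterable[Record], width: int = 60) -> str:
    """Render records as FASTA text, wrapping sequences at `width` columns."""
    out = []
    for header, seq in records:
        out.append(f">{header}")
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "".join(line + "\n" for line in out)
```

Keeping the source label in the header makes it possible to see afterwards which collection (5S, LSU, SSU, tRNA) a filtered read matched.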

Thanks in advance

written 20 months ago by archie70

We don't know what the aim of your analysis is, but chances are you don't have to remove the rRNA at all.

written 20 months ago by WouterDeCoster38k
20 months ago by
genomax65k
United States
genomax65k wrote:

See "C: Removing rRNA and tRNA sequences using GTF files" for the human rDNA repeat. You can use it with BBSplit from the BBMap suite (see "A: Tool to separate human and mouse ran seq reads") to bin/remove ribosomal RNA reads.

modified 20 months ago • written 20 months ago by genomax65k