I need to filter rRNA and tRNA reads out of mouse ribosome profiling and RNA-seq datasets. Am I right to assume that, since a "complete repeating unit of Mus musculus ribosomal DNA" exists for mouse (https://www.ncbi.nlm.nih.gov/nuccore/BK000964), I can simply download the FASTA file, build a bowtie index from it, and align my reads to it? I also found the SILVA rRNA database (https://www.arb-silva.de/download/arb-files/). Does it include the same rRNA sequences as the repeating unit file from NCBI? Should I prefer one over the other?
I would construct a bowtie index with:
$ bowtie-build species_rRNA.fa species_rRNA
and then exclude the mapping reads with:
$ bowtie -p 4 -v 3 species_rRNA my_reads.fastq --un my_reads_rRNA_unalign.fq > FileLocation
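To check how much the filter actually removes, the read counts before and after can be compared. This is only a minimal sketch: the function names `count_fastq_reads` and `report_removed` are made up for illustration, and it assumes uncompressed FASTQ files with exactly four lines per record.

```shell
#!/usr/bin/env bash

# Count records in an uncompressed FASTQ file.
# Assumption: four lines per record (no wrapped sequence lines).
count_fastq_reads() {
    local lines
    lines=$(wc -l < "$1")
    echo $(( lines / 4 ))
}

# Report how many reads the filter removed.
# $1 = FASTQ before filtering, $2 = FASTQ written by --un
report_removed() {
    local before after
    before=$(count_fastq_reads "$1")
    after=$(count_fastq_reads "$2")
    awk -v b="$before" -v a="$after" 'BEGIN {
        if (b == 0) { print "no reads in input"; exit }
        printf "%d of %d reads removed (%.1f%%)\n", b - a, b, 100 * (b - a) / b
    }'
}

# Run only if both files from the bowtie command exist.
if [ -f my_reads.fastq ] && [ -f my_reads_rRNA_unalign.fq ]; then
    report_removed my_reads.fastq my_reads_rRNA_unalign.fq
fi
```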
Then I would repeat the same for tRNA. I would get tRNAs from http://gtrnadb.ucsc.edu/genomes/eukaryota/Mmusc10/ under the FASTA link on the left side.
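Chained together, the two filtering steps might look like the sketch below. The helper `filter_against` and the file names `species_tRNA` and `my_reads_no_rRNA*.fq` are hypothetical. It assumes both indexes have already been built with `bowtie-build`, and that the reads left unaligned by the rRNA step are used as input for the tRNA step.

```shell
#!/usr/bin/env bash

# Align reads to one index and keep only the reads that do NOT align.
# $1 = bowtie index basename
# $2 = input FASTQ
# $3 = output FASTQ for unaligned reads (written by --un)
# The alignments themselves are discarded, since only the leftovers matter here.
filter_against() {
    bowtie -p 4 -v 3 --un "$3" "$1" "$2" > /dev/null
}

# Run only if bowtie is installed and the input file exists.
if command -v bowtie > /dev/null 2>&1 && [ -f my_reads.fastq ]; then
    # Step 1: remove reads matching rRNA.
    filter_against species_rRNA my_reads.fastq my_reads_no_rRNA.fq
    # Step 2: remove reads matching tRNA from what is left.
    filter_against species_tRNA my_reads_no_rRNA.fq my_reads_no_rRNA_no_tRNA.fq
fi
```

Alternatively, both FASTA files could be concatenated into a single reference and filtered in one pass. Keeping the steps separate makes it easier to see how many reads each RNA class accounts for.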
Would this be correct? Would I miss any sequences, in particular rRNA sequences that are not annotated in the "complete repeating unit" from NCBI, or mitochondrial rRNA?
What about snoRNA or other RNA species? Most papers I found excluded only rRNA, sometimes tRNA as well, but rarely any other RNA species.
Thank you for your help!