How to choose the correct ncRNA sequence from an RNA sequence database (Rfam, ENA, RNAcentral, etc.)
5.5 years ago
Nick • 0

EDIT: Sorry for asking, I think I found my answer, or at least the beginning of it: for eukaryotes, I should be looking at sequences on the same chromosome.

I want to align the 5S rRNA sequences of many different species, and to do so, I want to obtain the sequences from a database. RNAcentral seems like a good choice since it aggregates ncRNA sequences from many different databases such as Rfam and ENA. My problem is that when I type in a species, I get dozens, sometimes hundreds, of 5S rRNA sequences, and they can differ vastly from one another.

For example, when I search "Pan troglodytes", the results page reports 411 results. The 5S rRNA is usually around 120-130 nucleotides long in most species, so I am pretty sure I can discard anything much longer or shorter than that, but that still leaves me with dozens of sequences that are very different!
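
To make the length filter concrete, here is a minimal sketch in plain Python of what I mean, assuming the search results have been exported as FASTA text. The function names (`parse_fasta`, `filter_by_length`), the record IDs, and the toy sequences are all made up for illustration, and the 120-130 nt window is just my rough expectation, not an authoritative cutoff.

```python
from typing import Iterator, List, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(records, min_len: int = 120, max_len: int = 130):
    """Keep records whose sequence length is within [min_len, max_len]."""
    return [(h, s) for h, s in records if min_len <= len(s) <= max_len]


# Toy input with hypothetical IDs; real data would come from a database export.
example = "\n".join([
    ">hypothetical_id_1 too short",
    "ACGUACGU",
    ">hypothetical_id_2 plausible length",
    "ACGU" * 30,  # 120 nt
    ">hypothetical_id_3 too long",
    "ACGU" * 50,  # 200 nt
])

kept = filter_by_length(parse_fasta(example))
for header, seq in kept:
    print(header, len(seq))
```

This only narrows the candidates by length; it does not say which of the remaining sequences is the right one, which is the part I am stuck on.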

My question is: How can I choose the most 'accurate' sequence? Am I using the wrong database?

rna ncrna rnacentral ribosome rrna • 1.4k views

