Is there a method or HMM set to quickly determine if a sequence has a tRNA sequence for a large number of sequences/reads?
4.4 years ago
O.rka ▴ 720

I am looking at over 10 million reads in FASTA format. I want to determine whether each read contains part of a tRNA sequence. I've been using tRNAscan-SE, but it takes a very long time on this many reads, since it was designed for contigs.

Scanning with HMMs is very quick. Is there a tool that does this, or an HMM set that I can use to say "read_X has or does not have a tRNA hit"?

I looked in the source code of tRNAscan-SE and didn't notice any HMMs.

trna sequence • 851 views
4.4 years ago
Mensur Dlakic ★ 27k

tRNAscan-SE uses covariance models, which one can think of as HMMs that simultaneously model and score both primary sequence and secondary structure. This is why tRNAscan-SE is slow: the extra calculation is the price of maximum sensitivity.
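
To make "secondary structure" concrete, here is a toy sketch (not part of tRNAscan-SE, and not a tRNA detector) that counts how many bases at the 5' end of a sequence can pair with their partners near the 3' end, as in a tRNA acceptor stem. The function name, the 7-base stem, the 4-base 3' tail, and the synthetic sequence are all illustrative assumptions.

```python
# Toy illustration only: it checks one stem and is not a tRNA detector.
# Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def acceptor_stem_pairs(seq, stem_len=7, tail_len=4):
    """Count paired positions between the 5' end and its 3' partner strand.

    Assumes (hypothetically) that seq is a full-length tRNA whose first
    stem_len bases pair with the stem_len bases located just before the
    last tail_len bases.
    """
    rna = seq.upper().replace("T", "U")
    if len(rna) < 2 * stem_len + tail_len:
        return 0  # too short to contain both strands of the stem
    five_prime = rna[:stem_len]
    three_prime = rna[-(tail_len + stem_len):-tail_len]
    # The two strands are antiparallel, so walk the 3' strand backwards.
    return sum((a, b) in PAIRS for a, b in zip(five_prime, reversed(three_prime)))


# Synthetic sequence (not a real tRNA): stem strand, filler, partner strand, tail.
full = "GCGGAUU" + "A" * 20 + "AAUUCGC" + "ACCA"
print(acceptor_stem_pairs(full))       # both strands present
print(acceptor_stem_pairs(full[:25]))  # a short read covering only the 5' part
```

A read that covers only one strand of the stem gives the scan nothing to pair against, and that pairing is exactly the information a covariance model is built to score.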

Because tRNA requires base-pairing between relatively distant portions of the sequence, any tool scanning short sequences will miss some secondary structure interactions. Still, there seems to be one such tool:

https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0077596

It includes a covariance model for tRNA, but be warned that it will not be speedy. I suspect that any increase in speed will be at the expense of specificity.
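
Whichever search tool is used, getting to "read_X has or does not have a tRNA hit" is a small post-processing step. Below is a minimal sketch, assuming the tool writes a whitespace-delimited hit table with the read ID in the first column and comment lines starting with `#`. That layout is an assumption to check against your tool's output, and the function names and example data are hypothetical.

```python
def read_ids(fasta_lines):
    """Yield the ID (first word of each '>' header) of every FASTA record."""
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                yield fields[0]


def hit_ids(table_lines):
    """Collect read IDs from the first column of a hit table (assumed layout)."""
    hits = set()
    for line in table_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        hits.add(line.split()[0])
    return hits


def trna_calls(fasta_lines, table_lines):
    """Map every read ID to True (at least one hit) or False (no hit)."""
    hits = hit_ids(table_lines)
    return {rid: rid in hits for rid in read_ids(fasta_lines)}


# Made-up example data; with real files, pass open file handles instead of lists.
fasta = [">read_1 sample", "ACGTACGT", ">read_2", "GGCCGGCC"]
table = ["# target query score", "read_2 tRNA 35.1"]
for rid, has_hit in trna_calls(fasta, table).items():
    print(rid, "tRNA hit" if has_hit else "no tRNA hit")
```

Both inputs are streamed line by line, so only the set of hit IDs and the final dictionary are held in memory. For 10 million reads it may be preferable to write each call out as it is made instead of building the dictionary.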
