FASTA of amino acid sequences translated in all six reading frames: which frame is the optimal one?
3.4 years ago
Luis999 ▴ 20

Hello, I'm sorry if my question is very basic; the truth is that until starting the PhD I had had no contact with programming languages other than HTML, haha. I have a FASTA file that I created with EMBOSS by translating, in all six reading frames, a nucleotide FASTA file of a transcriptome, and now I need to filter those six frames to find out which is the optimal one. I'm researching how to iterate in Python (yes, seriously, I'm just getting started on this) to see if that can help. Does anyone know whether I'm on the right track, or whether there is a more effective method? I'd appreciate any advice you can give me, no matter what language or packages I need to install; I'm a bit desperate, hahaha. Thank you, and sorry for the inconvenience.

aminoacid protein peptide Reading optimal frames • 2.3k views

What do you mean by "optimal"?

Wow, I didn't think I would get answers so fast, thank you very much!!! By "optimal" I meant the longest sequence.

If by 'optimal' you mean the frame that yields a protein sequence, you will need to run some kind of ORF finder on it (or simply take the frame with the longest ORF, although that may be a bit crude).
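
The crude "take the frame with the longest ORF" option can be sketched in a few lines of plain Python. This is only an illustration under some assumptions that are not stated in the thread: the translated FASTA marks stop codons with `*`, and the six frames of each transcript share an ID that ends in `_1` to `_6` (the naming EMBOSS `transeq` produces with `-frame 6`). The function names are made up for this example, and "ORF" here simply means the longest stop-free stretch, without requiring a start methionine.

```python
def read_fasta(lines):
    """Yield (sequence_id, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            # Keep only the ID (first word after '>').
            header, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def longest_stop_free(protein):
    """Return the longest stretch between stop codons (assumed to be '*')."""
    return max(protein.split("*"), key=len)


def best_frame_per_transcript(records):
    """Map transcript ID -> (frame, longest stop-free stretch).

    Assumes IDs look like '<transcript>_<frame>'; adjust if yours differ.
    """
    best = {}
    for header, protein in records:
        transcript, _, frame = header.rpartition("_")
        segment = longest_stop_free(protein)
        if transcript not in best or len(segment) > len(best[transcript][1]):
            best[transcript] = (frame, segment)
    return best


# Tiny in-memory example; for a real file, pass an open file handle instead.
demo = [">tx1_1", "MKT*AAAA", ">tx1_2", "MAAAAAAAAK*", ">tx1_3", "**A"]
for transcript, (frame, segment) in best_frame_per_transcript(read_fasta(demo)).items():
    print(transcript, frame, len(segment), segment)
```

Because it ignores start codons and coding potential, this will sometimes pick the wrong frame, which is why a dedicated ORF finder is the safer choice.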

For ORF-finding, OP should try Borf if they have stranded data, but one can always use TransDecoder.

Good examples indeed. Many more are around as well: FrameD, est2orf, ORFfinder, ... They all likely perform about equally well, though some have extra features, such as frame-shift correction in FrameD.

Oh, thank you very much lieven, I'm going to check those too.

Oooh, thank you thank you thank you, I'm already reading the documentation for the programs you recommended. Borf looks promising. Another noob question that is still not clear to me, sorry: does this type of program translate all 6 reading frames and return the longest ORF among them, or does it only translate the first frame and that's it?

It depends a bit on the software used.

In most cases they will report the longest ORF (and will thus indeed have evaluated all 6 possible frames), as that one is likely the "correct protein". That assumption does not always hold, however, which is why some programs also use something called 'coding potential': they look for a reading frame that has both a substantial ORF and a high coding potential and report that one, which is not necessarily the longest.

If you want the longest ORF regardless, make sure you use a program that does exactly that (longestORF, for instance).
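
To make the "evaluate all 6 frames, report the longest ORF" idea concrete, here is a minimal pure-Python sketch that starts from the nucleotide sequence. It is not how any of the tools named in this thread are implemented; it assumes the standard genetic code, labels frames `+1` to `+3` and `-1` to `-3` (an arbitrary convention chosen for this example), requires a start `M`, and also accepts ORFs that run off the end of the sequence without a stop. It does not consider coding potential at all.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Reverse-complement an uppercase DNA string (other symbols are kept)."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate from position 0; codons not in the table (e.g. with N) become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Yield (frame_label, protein) for the three forward and three reverse frames."""
    dna = dna.upper()
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            yield f"{strand}{offset + 1}", translate(seq[offset:])


def longest_orf(dna):
    """Return (frame_label, peptide) of the longest M-initiated ORF over all six frames."""
    best = ("", "")
    for frame, protein in six_frames(dna):
        for segment in protein.split("*"):
            start = segment.find("M")
            if start != -1 and len(segment) - start > len(best[1]):
                best = (frame, segment[start:])
    return best


# Toy sequence: the only substantial ORF (ATG AAA TTT GGG TAA) sits in frame +3.
print(longest_orf("CCATGAAATTTGGGTAACC"))
```

On real transcriptome data a dedicated tool is still preferable, since it handles partial ORFs, minimum lengths and, in some cases, coding potential.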

I have to dive deep into that documentation. Thank you so much Lieven, I really appreciate your help :)

Again, I didn't think I would get answers so fast; for real, thank you very much!!! Yes, I was referring to the longest sequence. After searching for a while longer, I also realized that, as you say, there are ORF-finding packages.
