EMBOSS transeq: lines labeled "(REVERSE SENSE)" appear in translations that are not reverse sense(?)
12 days ago
Luis999 ▴ 20

Hi, good day. Sorry, I have a basic question, and I would really appreciate it if someone could help. I'm using EMBOSS's transeq tool (which is new to me) to translate an ORF file. My plan is to translate all 6 reading frames and find the longest ORF, which would be the one to use in later steps. However, the translation files contain header lines with the label "(REVERSE SENSE)" even when I tell transeq to translate reading frame 1 (which, as I understand it, starts from the first nucleotide and is not read in reverse). In the files produced by the commands for the reverse frames (4, 5 and 6), some header lines have "(REVERSE SENSE)" and others do not.

This is what the content of the translation files looks like:

>Something1_1_17_6 [756 - 785]
ILRRTLPTFL
>Something1_1_18_6 [789 - 818]
*FVILLQCTX
>Something1_1_19_6 [737 - 826]
LFTDL*FFCNVHIRF*DGLFQHFFI*IFNX
>Something1_1_20_6 [814 - 740] (REVERSE SENSE)
*RFR*RNVGRVRLKI*CVHCRRITX
>Something1_1_21_6 [732 - 685] (REVERSE SENSE)
HWMRRML*I*AAI*PV
>Something1_1_22_6 [736 - 671] (REVERSE SENSE)
HG*SNIG*GGCCEYELPYDQFX


To generate it I used the command transeq inputfile outputfilename.pep -frame=-3, which is supposed to generate the third reverse translation (frame 6).

Also, if you think this is not the best way to find the longest ORF and want to recommend something else, I would greatly appreciate any input. :D

Thank you very much for taking the time to read and thank you very much in advance to those who can answer my question. Have a nice day!

ORF transeq traduction EMBOSS • 222 views
12 days ago
Mensur Dlakic ★ 13k

I suggest a program called esl-translate from the HMMER package. It searches all six reading frames, and you can specify whether your ORFs must start with Met, the minimum acceptable ORF length, etc. For each identified ORF it prints the length and the translation frame in the header line, so it should be easy to parse the output and identify the longest ORF.
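If it helps to see what such a search does, here is a minimal Python sketch of a six-frame ORF scan with a minimum length and an optional "must start with Met" rule. It is not how esl-translate is implemented. The names `find_orfs`, `longest_orf`, `min_len` and `require_met` are made up for this illustration, it only knows the standard genetic code, and the frame labels (+1..+3, -1..-3 by offset on each strand) are this sketch's own convention and need not match the numbering used by transeq or esl-translate.

```python
from typing import List, NamedTuple, Optional

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


class Orf(NamedTuple):
    frame: int      # +1..+3 forward strand, -1..-3 reverse strand (sketch convention)
    protein: str    # translated ORF without the stop codon


def translate(dna: str) -> str:
    """Translate complete codons only; codons with unknown bases become X."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def find_orfs(dna: str, min_len: int = 20, require_met: bool = False) -> List[Orf]:
    """Return stop-free stretches of at least min_len residues in all six frames."""
    dna = dna.upper()
    reverse = dna.translate(COMPLEMENT)[::-1]
    orfs = []
    for sign, strand in ((1, dna), (-1, reverse)):
        for offset in range(3):
            # Splitting on '*' yields every stretch between stop codons,
            # including stretches that run off either end of the sequence.
            for segment in translate(strand[offset:]).split("*"):
                if require_met:
                    first_met = segment.find("M")
                    if first_met == -1:
                        continue
                    segment = segment[first_met:]
                if len(segment) >= min_len:
                    orfs.append(Orf(sign * (offset + 1), segment))
    return orfs


def longest_orf(dna: str, min_len: int = 20, require_met: bool = False) -> Optional[Orf]:
    """Pick the longest ORF, or None if nothing passes the filters."""
    orfs = find_orfs(dna, min_len, require_met)
    return max(orfs, key=lambda orf: len(orf.protein)) if orfs else None
```

Calling `longest_orf(sequence, min_len=50, require_met=True)` on a sequence string gives one candidate together with its frame label. The example that follows is output from esl-translate itself, not from this sketch.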

For this DNA sequence:

>test_sequence
CGGCGGCCGCGCCAGCCGGCGAGCGCCCCGTCCGCTACCTCGGCCTCTGCGCCGCCGCCGGCGACTCGCTCCGCAAATTA
CTGACCTTCTGGTTCGCCAAGATTGTGGGCCTCTTCCAAGCCGCCGCCGCCAAAGCCATCACCGCCAAGGCCGCGGAGGC
TGACAAGCGCTCCACGTCAAAGGTGCGGGCGGTGGTGCCGGAGGGCGCGGCGGGGAGCGTGCCTCGGCCGGGACAGCAGC
GCGGCGCCCCGCCGCCGCACCGGGCGGTCCCGGACATCGCCGCCTCGATCCCCGGGACGGACGCCGACATCGCAAACTTG
CGTTTGGTGCCCATGCCGCGCCGCTGGGGGTCGCGCGGGGAGGCGGTTGCGGACGTCGGCCGGCGCCGACCAAATCGCCC
CGCCTCCAGCCCTCTCTCCTCGCCCCAGACGCGCCGCACGCGGCCGGCGGGGGGGATCTGACCGTTTCCGACAGGGAGAA
GATTGGCATGGCCCCTGCAAACCCTGAGCACGACCCGGGCCGCGTCGTGCTCGCCATCAAGCGCACCTCCTTCCTCGGCG
CCATGGACTTTACGCCCGAGCTGGCCGCCATGCTGAGCGTGGACGTCCTGGTCACGCTTGACAAAGAGCCGGACGACGGC
CGCGCAGGGGGGCGGCTCGTTTACATTCCGCTCCGCGTGGTCGACGCCGACAGCCCCGACCACCCCGACCGCCGCTGGAT
GTGCACCGTCGTGATCAACGTGGGCATGGATCGCACCTATGGTCTACTTGTGGGCGAATGGAACGCTTGCTGCGTGGCTA
TGGGCGCGCGCGTGGGCGGGAAGGCCGAGTTTGGGCGCGCGGCCGGCGCGGCGCCTGCGGCGACCATCCGGTTCCTGCCG
CCTCCCTAGGCGCGGCGCTGGCGGCTGCTTTGCTTGTGTAGTGCATGATCCTGCATTTACCGGTGGGAGAGACGAGCGTG
GATGGGGGCAGAGGGTGGGGATAGTCGGCTAGGCGTGCGCGCGTGCCGTCGCCCTCAGCCCTTCAGACCACCCACTTCTT
TCCAGTCGCAGGCTGCTCCGCCTGGGTTCGAGCCAATTCGCGCTCTGGCGTGCGCCACTGGCGACATTGCCCCCACACTC
TCCCGCCGCGCCGCTGGCGCTGTGGCGGCCAACGAGACCCACGCCGAGGTGGCGGTCCTACCC


The longest ORF is:

>orf6 source=test_sequence coords=344..886 length=181 frame=2 desc=
MGVARGGGCGRRPAPTKSPRLQPSLLAPDAPHAAGGGDLTVSDREKIGMAPANPEHDPGR
VVLAIKRTSFLGAMDFTPELAAMLSVDVLVTLDKEPDDGRAGGRLVYIPLRVVDADSPDH
PDRRWMCTVVINVGMDRTYGLLVGEWNACCVAMGARVGGKAEFGRAAGAAPAATIRFLPP
P
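
Picking the longest ORF out of output like this can be scripted. Below is a small Python sketch that reads records with headers shaped like the one above (`coords=`, `length=`, `frame=`) and returns the record with the largest `length` value. The header layout is inferred only from this single example, so treat the regular expression as an assumption to check against your own output. The function names `parse_orfs` and `longest_record` are made up for illustration.

```python
import re
from typing import Dict, List, Optional

# Header layout assumed from the one example header shown above.
HEADER_RE = re.compile(
    r"^>(?P<name>\S+)\s+source=(?P<source>\S+)\s+"
    r"coords=(?P<start>\d+)\.\.(?P<end>\d+)\s+"
    r"length=(?P<length>\d+)\s+frame=(?P<frame>-?\d+)"
)


def parse_orfs(fasta_text: str) -> List[Dict]:
    """Parse FASTA records whose headers match HEADER_RE; skip any others."""
    records: List[Dict] = []
    current: Optional[Dict] = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            match = HEADER_RE.match(line)
            if match is None:
                current = None  # unrecognized header: ignore this record
                continue
            current = {
                "name": match["name"],
                "source": match["source"],
                "start": int(match["start"]),
                "end": int(match["end"]),
                "length": int(match["length"]),
                "frame": int(match["frame"]),
                "sequence": "",
            }
            records.append(current)
        elif line and current is not None:
            current["sequence"] += line  # sequence may span several lines
    return records


def longest_record(fasta_text: str) -> Optional[Dict]:
    """Return the record with the largest length= value, or None if empty."""
    records = parse_orfs(fasta_text)
    return max(records, key=lambda rec: rec["length"]) if records else None
```

To use it, save the program's output to a file, read the file into a string, and pass that string to `longest_record`. The returned dictionary carries the ORF name, coordinates, frame and protein sequence.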