Translation Of EST Sequences To Protein
13.0 years ago

Hi all, I have a set of EST sequences and I want to translate them to protein to find their protein domains. First I have to find the correct frame; in other words, I need the correct protein sequences of the EST sequences. Is there any software to do this? I really need such software. Thanks a lot in advance.

MRB, (mbakhtiari@ut.ac.ir)

est translation • 7.1k views
13.0 years ago
Michael 54k

In the EMBOSS tools there is transeq that will easily translate in all 6 reading frames with option -frame=6.
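
For illustration, a six-frame translation of the kind transeq produces can be sketched in plain Python. This is a minimal sketch, not the EMBOSS implementation: it assumes the standard genetic code and translates any codon containing an ambiguous base as X. The function names and the frame labels (1 to 3 on the forward strand, -1 to -3 on the reverse complement) are this example's own convention and may not match transeq's numbering.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (first base slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def six_frame_translation(seq):
    """Map frame label (1, 2, 3, -1, -2, -3) to its protein translation."""
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(seq[offset:])
        frames[-(offset + 1)] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    # Toy sequence, not a real EST.
    for frame, protein in sorted(six_frame_translation("ATGGCCTAA").items()):
        print(frame, protein)
```

For real data sets, the installed EMBOSS tools remain the practical choice; the sketch only shows what "all 6 reading frames" means.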

GetORF will find open reading frames in the sequence and can give you the translation. Both apps can be locally installed.

The amino acid sequences from the six-frame translation can then be searched against, e.g., Pfam for conserved protein domains.

So much for the technical aspect. With respect to determining the correct protein sequence: that can only mean determining the correct reading frame, which is not so easy from the sequence data alone. You will likely find multiple ORFs in the ESTs; some frames might lack stop codons, some ORFs might overlap in different reading frames, and some ORFs might be incomplete. That is why one would normally search translations in all six reading frames. The reading frames that give good hits are likely to be the correct ones.
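
One simple way to rank candidate frames before a database search is to compare the longest stop-free stretch in each frame. The sketch below assumes you already have the six translations as strings with `*` marking stop codons (for example, parsed from transeq output); the function names and the dictionary layout are hypothetical. The longest stretch is only a heuristic, for the reasons given above, and does not replace checking the frames against a domain database.

```python
def longest_stop_free_segment(protein):
    """Return the longest stretch of residues between stop codons ('*')."""
    return max(protein.split("*"), key=len)


def rank_frames(frame_translations):
    """Sort frames by the length of their longest stop-free segment, longest first.

    frame_translations: dict mapping a frame label to its translated protein string.
    Returns a list of (frame, segment) tuples.
    """
    segments = {
        frame: longest_stop_free_segment(protein)
        for frame, protein in frame_translations.items()
    }
    return sorted(segments.items(), key=lambda item: len(item[1]), reverse=True)


if __name__ == "__main__":
    # Toy translations, not real data.
    toy = {1: "MA*", 2: "WP", -1: "LGHKK*A"}
    for frame, segment in rank_frames(toy):
        print(frame, len(segment), segment)
```

Because ESTs are often incomplete, the segment is not required to start with M here; that is a design choice of this sketch.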

Good luck!


Thanks so much for your perfect answer. Really, I am searching for software that analyzes BLASTX output together with the EST sequences and returns the correct protein sequences. In other words, first we run BLASTX, then we submit the BLASTX results and the EST sequences to the software, and at the end it returns the correct protein sequences of the ESTs. I have seen such software before (its name is OrfPredictor), but now I can't download it. Anyway, thanks. Best regards

13.0 years ago
Woa ★ 2.9k

To get the correct protein sequence (eliminating frameshift errors), use ESTScan2 with an appropriate model. There are several others, including Prot4EST. Contact the OrfPredictor author for a local version. Dr. X. (Jack) Min will most likely provide it if it will be used for academic research.


And read this discussion.


Thanks so much for your perfect answer.


Unfortunately, the OrfPredictor website is broken, and I also can't find Dr. X. (Jack) Min's email (the address given in the OrfPredictor paper is closed). I think OrfPredictor can solve my problem, but I can't find it! Can you guide me, please?


OrfPredictor needs BLASTX output to work correctly. I always use the Blast2GO software to run BLASTX (for many sequences), and it doesn't give us the BLASTX output. How can I obtain BLASTX output for my sequences to use with OrfPredictor? In other words, is there any software that produces BLASTX output? Regards
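
One option, assuming a local installation of NCBI BLAST+ and a formatted protein database, is to run blastx directly and keep its output file. The sketch below only assembles and prints the command. The file names and database name are placeholders, and which output format OrfPredictor expects is not stated in this thread, so check its documentation before choosing `-outfmt`.

```python
import shlex


def build_blastx_command(query_fasta, protein_db, out_file, evalue=1e-5, outfmt=0):
    """Assemble a BLAST+ blastx command line as a list of arguments.

    outfmt=0 is the default pairwise text report; the right choice for a
    downstream tool is an assumption to verify.
    """
    return [
        "blastx",
        "-query", query_fasta,
        "-db", protein_db,
        "-out", out_file,
        "-evalue", str(evalue),
        "-outfmt", str(outfmt),
    ]


# Placeholder file and database names, chosen only for illustration.
cmd = build_blastx_command("ests.fasta", "my_protein_db", "ests.blastx.txt")
print(shlex.join(cmd))
# To actually run it (requires BLAST+ on the PATH):
#   import subprocess; subprocess.run(cmd, check=True)
```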

13.0 years ago

Assuming you also want to know what proteins they are, the easiest way is probably to run Blastx, which does a six-frame translation of your sequence and gets you the protein hits for each frame. You can, for instance, run Blastx at NCBI.

But if you really only want to do a translation, you can run simpler tools, like the ExPASy translate tool.

If you need to translate multiple sequences, the EMBOSS translate tool might be what you are looking for. You can use it at EBI; it accepts multiple sequences, either pasted in or uploaded as a file. (Sorry that I left this out initially. I thought the ExPASy tool would also accept multiple sequences, but I haven't actually tried.)


Thanks so much for your answer. I ran BLASTX, but I just need the protein sequences of the ESTs themselves, not the proteins related to them. There are also so many that I can't handle them manually (with the ExPASy translate tool or similar tools), so I am looking for good software to do this. Anyway, thanks, and may the joy be with you.


Sorry, I thought the ExPASy tool would accept multiple sequences, but haven't really tried. In any case the EMBOSS tool does.

13.0 years ago
Jack Min ▴ 10

ORFPredictor website is at http://proteomics.ysu.edu/tools/OrfPredictor.html

Blastx output is optional - without it, the output is as good as having it.

Jack Min

13.0 years ago

Good answers above. Let me add a few points that were not mentioned. One, many EST sequences contain no translatable region because they consist only of UTR (untranslated sequence, typically from a long 3'-UTR). Thus, the tools mentioned will run, but may give you no match to proteins, and that is fine. Two, be aware that many older EST entries may include the cloning site and even some cloning vector in the data entry. I have seen this myself, and it affects your translation such that a BLASTX hit may begin at, say, query position 11. Three, EST sequences are about 97-99.5% accurate, i.e. 0.5 to 3% errors. These take the form of frameshifts (which can lower your BLASTX HSP score), mismatched residues in the HSP (the BLAST result), or even gaps.

Keep this in mind regardless of the tool you use.
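
Two of these points lend themselves to quick arithmetic. The sketch below estimates how many sequencing errors to expect in a read at the quoted error rates, and derives the forward reading frame from the query position where a BLASTX hit starts. The function names and the 500-nt read length are illustrative assumptions, and the frame formula covers only hits on the forward strand.

```python
def expected_errors(read_length, error_rate):
    """Expected number of erroneous bases in a read of the given length."""
    return read_length * error_rate


def forward_frame_from_query_start(query_start):
    """Forward reading frame (1, 2 or 3) for a hit starting at a 1-based query position."""
    if query_start < 1:
        raise ValueError("query_start must be a 1-based position")
    return (query_start - 1) % 3 + 1


if __name__ == "__main__":
    read_length = 500  # assumed length of a typical read, for illustration only
    low = expected_errors(read_length, 0.005)
    high = expected_errors(read_length, 0.03)
    print(f"Expected errors in {read_length} nt: {low:.1f} to {high:.1f}")
    print("Frame for a hit starting at position 11:", forward_frame_from_query_start(11))
```

With these numbers, a hit starting at query position 11 falls in forward frame 2, and even at the low end of the error range a few errors per read are expected, which is why frameshift-tolerant tools are worth considering.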


Thanks a lot for your guidance. With best regards
