Question: EMBOSS transeq translate to protein with 3 letter code
20 months ago by
Germany, Mannheim, UMM
marongiu.luigi510 wrote:

Dear all,

Would it be possible to translate a nucleotide sequence into the three-letter amino acid code using EMBOSS transeq? I can do it with the one-letter code:

$ echo atgtttcaggacccacaggagtaa | transeq -filter -osformat2 text
MFQDPQE*

But I don't see an option for the three-letter code in the manual.

Thank you

ADD COMMENTlink written 20 months ago by marongiu.luigi510

If you don't see that option in the manual, then no. You could do some replacements with sed if you must have the three-letter code.

ADD REPLYlink modified 20 months ago • written 20 months ago by genomax83k

Any particular reason why you want to do that? It will only be very confusing ...

ADD REPLYlink written 20 months ago by lieven.sterck7.8k

Just for graphical reasons: with the three-letter code, it is easier to see the correspondence with the triplets:

atgttt...
MetPhe...
ADD REPLYlink written 20 months ago by marongiu.luigi510

I find this even easier:

atgttt
 M  F
ADD REPLYlink modified 20 months ago • written 20 months ago by h.mon29k

Yes, but you need to add 5 spaces, because the sequence is given as MF, not as _M__F_

ADD REPLYlink written 20 months ago by marongiu.luigi510

Not quite right: the general pattern is to insert one initial space, then two spaces between every pair of amino acids, then a final space. There are several tricks for splitting a string into characters. As I like Perl: split //, $_ splits a string at every character, and join puts the pieces back together with a separator of your choice. The split perldoc page has some examples of using them together.
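
The same spacing pattern can also be sketched in Python rather than Perl. This is only an illustration: the function name align_under_codons is made up for the example, and it assumes the protein string is in frame 1 with one residue per codon, as in the transeq output above.

```python
def align_under_codons(protein: str) -> str:
    """Pad a one-letter protein string so each residue sits under the
    middle base of its codon."""
    # One leading space, two spaces between residues, one trailing space.
    return " " + "  ".join(protein) + " "


dna = "atgtttcaggacccacaggagtaa"
protein = "MFQDPQE*"  # one-letter output from transeq for the sequence above

print(dna)
print(align_under_codons(protein))
```

Iterating over a Python string yields its characters, so str.join here plays the role of both split and join in the Perl version.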

ADD REPLYlink written 20 months ago by h.mon29k

It wouldn't be too hard to write a script that converts between one- and three-letter codes. There must be ample Python examples to get you started.
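
A minimal sketch of such a script, for the one-letter to three-letter direction only. The names THREE_LETTER and to_three_letter are made up for this example, the stop codon is written as *** to match the usual EMBOSS display, and anything outside the 20 standard residues is mapped to Xaa, which is an assumption rather than something transeq defines.

```python
# Standard one-letter to three-letter amino acid codes.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "***",  # stop codon, padded to three characters
}


def to_three_letter(protein: str) -> str:
    """Convert a one-letter protein string to concatenated three-letter codes.

    Unknown or ambiguous symbols become 'Xaa' (an assumption for this sketch).
    """
    return "".join(THREE_LETTER.get(residue, "Xaa") for residue in protein.upper())


print("atgtttcaggacccacaggagtaa")
print(to_three_letter("MFQDPQE*"))
```

Because every code is exactly three characters wide, the output lines up with the nucleotide sequence without any extra padding.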

ADD REPLYlink written 20 months ago by WouterDeCoster43k

Sure, that is not the problem. I just wanted to know whether transeq does it directly, to save the effort.

ADD REPLYlink written 20 months ago by marongiu.luigi510
20 months ago by
cpad011213k
India
cpad011213k wrote:

From the transeq manual: "The output peptide sequence is always in the standard one-letter IUPAC code."

http://structure.usc.edu/emboss/transeq.html

So transeq cannot do it, but showseq can. Try this:

$ echo atgtttcaggacccacaggagtaa | showseq -filter -threeletter y -format 4

           10        20        
  ----:----|----:----|----
  atgtttcaggacccacaggagtaa

  MetPheGlnAspProGlnGlu***
ADD COMMENTlink modified 20 months ago • written 20 months ago by cpad011213k

That is exactly what I have been looking for! Thank you.

ADD REPLYlink written 20 months ago by marongiu.luigi510

I have moved the comment of cpad0112 to an answer so it can be accepted.

ADD REPLYlink written 20 months ago by WouterDeCoster43k