Question: EMBOSS transeq translate to protein with 3 letter code
10 months ago by
Germany, Mannheim, UMM
marongiu.luigi wrote:

Dear all,

Would it be possible to translate a nucleotide sequence into the three-letter amino acid code using EMBOSS transeq? I can do it with the one-letter code:

$ echo atgtttcaggacccacaggagtaa | transeq -filter -osformat2 text
MFQDPQE*

But I don't see an option for the three-letter code in the manual.

Thank you

written 10 months ago by marongiu.luigi

If you don't see that option in the manual, then no. You could do some replacements with sed if you must have the three-letter code.

modified 10 months ago • written 10 months ago by genomax
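
The replacement idea above could look roughly like the sketch below. It uses an awk lookup table rather than a chain of sed substitutions, because successive global sed replacements can re-match the capital letter they have just written (R becomes Arg, and a later A rule would then hit the A in Arg). The function name one_to_three, the Xaa code for X and the ??? placeholder for unknown characters are assumptions made for illustration; none of this is part of EMBOSS.

```shell
#!/bin/sh
# Hypothetical helper: convert one-letter protein codes on stdin
# to three-letter codes, one input line per output line.
one_to_three() {
  awk '
    BEGIN {
      # Lookup table as "key value" pairs; "*" is the stop symbol.
      t = "A Ala R Arg N Asn D Asp C Cys Q Gln E Glu G Gly H His I Ile L Leu K Lys"
      t = t " M Met F Phe P Pro S Ser T Thr W Trp Y Tyr V Val X Xaa * ***"
      n = split(t, kv, " ")
      for (i = 1; i < n; i += 2) code[kv[i]] = kv[i + 1]
    }
    {
      out = ""
      # Walk the line one residue at a time.
      for (i = 1; i <= length($0); i++) {
        c = toupper(substr($0, i, 1))
        out = out ((c in code) ? code[c] : "???")   # ??? marks unknown symbols
      }
      print out
    }'
}

# The protein string below is what transeq printed in the question.
echo 'MFQDPQE*' | one_to_three   # prints MetPheGlnAspProGlnGlu***
```

In practice the input would come straight from the transeq command in the question, piped into one_to_three.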

Any particular reason why you want to do that? It will only be very confusing ...

written 10 months ago by lieven.sterck

Just for graphical reasons: with the three-letter code, it is easier to see the correspondence with the triplets:

atgttt...
MetPhe...
written 10 months ago by marongiu.luigi

I find this even easier:

atgttt
 M  F
modified 10 months ago • written 10 months ago by h.mon

Yes, but you need to add 5 spaces, because the sequence is given as MF, not as _M__F_

written 10 months ago by marongiu.luigi

Not quite right: the general pattern is to insert one initial space, then two spaces between every pair of amino acids, then a final space. There are several tricks around for splitting a string into characters. As I like Perl, split //, $_ splits a string at every character, and join puts the pieces back together with whatever separator you choose. The split perldoc has some examples of using them together.

written 10 months ago by h.mon
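
The spacing pattern described above (one leading space, two spaces between residues, one trailing space) is the same as wrapping every residue in a space on each side, so a single sed substitution is enough. This is a minimal sketch; the function name pad_residues is made up for illustration, and the protein string is simply the transeq output from the question.

```shell
#!/bin/sh
# Hypothetical helper: pad each one-letter residue to three columns
# so that it sits under the middle base of its codon.
pad_residues() {
  sed 's/./ & /g'
}

dna='atgtttcaggacccacaggagtaa'
protein='MFQDPQE*'   # one-letter translation as printed by transeq

printf '%s\n' "$dna"
printf '%s\n' "$protein" | pad_residues
```

The second line of output is " M  F  Q  D  P  Q  E  * ", which is 24 columns wide like the nucleotide line. The Perl route mentioned above should give the same result with something like perl -lne 'print " ", join("  ", split //), " "'.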

It wouldn't be too hard to write a script that converts between one-letter and three-letter codes. There must be ample Python examples to get you started.

written 10 months ago by WouterDeCoster
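
As a rough illustration of such a converter in the reverse direction (three-letter back to one-letter), here is a small shell and awk sketch, kept in shell to match the commands in this thread. The name three_to_one, the Xaa entry and the ? placeholder are assumptions for the example. The lookup is case-sensitive and expects codes written like Met, with no separators between them.

```shell
#!/bin/sh
# Hypothetical helper: convert concatenated three-letter codes on stdin
# (e.g. MetPheGln) back to the one-letter code.
three_to_one() {
  awk '
    BEGIN {
      # Lookup table as "key value" pairs; "***" is the stop symbol.
      t = "Ala A Arg R Asn N Asp D Cys C Gln Q Glu E Gly G His H Ile I Leu L Lys K"
      t = t " Met M Phe F Pro P Ser S Thr T Trp W Tyr Y Val V Xaa X *** *"
      n = split(t, kv, " ")
      for (i = 1; i < n; i += 2) one[kv[i]] = kv[i + 1]
    }
    {
      out = ""
      # Read the line in chunks of three characters.
      for (i = 1; i <= length($0); i += 3) {
        tri = substr($0, i, 3)
        out = out ((tri in one) ? one[tri] : "?")   # ? marks unknown chunks
      }
      print out
    }'
}

echo 'MetPheGlnAspProGlnGlu***' | three_to_one   # prints MFQDPQE*
```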

Sure, that is not the problem; I just wanted to know whether transeq does it directly, to save the effort...

written 10 months ago by marongiu.luigi
10 months ago by
India
cpad0112 wrote:

According to the transeq manual, the output peptide sequence is always in the standard one-letter IUPAC code:

http://structure.usc.edu/emboss/transeq.html

So transeq cannot do this, but showseq can. Try this:

$ echo atgtttcaggacccacaggagtaa | showseq -filter -threeletter y -format 4

           10        20        
  ----:----|----:----|----
  atgtttcaggacccacaggagtaa

  MetPheGlnAspProGlnGlu***
modified 10 months ago • written 10 months ago by cpad0112

That is exactly what I have been looking for! Thank you.

written 10 months ago by marongiu.luigi

I have moved the comment of cpad0112 to an answer so it can be accepted.

written 10 months ago by WouterDeCoster