Question: Nucleotide to amino acid conversion of multiple sequences
lokraj2003 wrote (12 months ago):

I have 220 nucleotide sequences. Each is around 7000 bp and should translate into one protein of around 2100 amino acids. There are functions already available in R that can convert a nucleotide sequence to an amino acid sequence, but functions like translate in seqinr or trans in the ape package require a start and end position, and my sequences don't all have the same start position. For example, one sequence starts at position 650, another at 720, and so on. When I do this manually I use the Expasy online tool, which works pretty well. Is there any way to translate all these sequences programmatically in R? Is it possible to send my sequences to the Expasy web page from RStudio and retrieve the amino acid sequences? I am comfortable using R/Bioconductor, but I can use Biopython too if there is a way to do it with Biopython.

Thanks !


So, you have multiple sequences with different start points relative to one another? Do you need to perform a multiple sequence alignment before translation, or has this already been done?

Reply written 12 months ago by Brice Sarver

I am going to do a selection pressure analysis, so I will have to do a multiple sequence alignment first.

Reply written 12 months ago by lokraj2003
swbarnes2 wrote (12 months ago):

If you know what they should translate to, I'd use blastx; it will handle getting the frame right for you. You can give it a multi-FASTA file as input, but then you have to parse the output.


It actually works, but parsing the output became tedious. I am thinking of a way to parse it.

Reply written 12 months ago by lokraj2003
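
One possible way to make the parsing less tedious, as a minimal sketch: it assumes blastx was run with tabular output (-outfmt 6) and the default twelve columns. The function name best_blastx_hits and the returned field names are hypothetical, chosen for illustration. It keeps the highest-bitscore hit per query and reports where on the nucleotide sequence the alignment starts and ends, which approximates the region to translate (the alignment may not reach the true start or stop codon).

```python
def best_blastx_hits(lines):
    """Keep the highest-bitscore hit per query from BLAST tabular lines.

    Assumes the default -outfmt 6 column order:
    qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore
    """
    best = {}
    for line in lines:
        # Skip blank lines and comment lines (present with -outfmt 7).
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        qseqid, sseqid = fields[0], fields[1]
        qstart, qend = int(fields[6]), int(fields[7])
        bitscore = float(fields[11])
        if qseqid in best and bitscore <= best[qseqid]["bitscore"]:
            continue
        # Assumption: for translated searches, qstart > qend marks a hit
        # on the reverse strand of the query.
        forward = qstart <= qend
        best[qseqid] = {
            "sseqid": sseqid,
            "qstart": qstart,
            "qend": qend,
            "strand": "+" if forward else "-",
            # Forward reading frame (1, 2 or 3) from the 1-based start;
            # left as None for reverse-strand hits in this sketch.
            "frame": (qstart - 1) % 3 + 1 if forward else None,
            "bitscore": bitscore,
        }
    return best
```

With a results file, something like `best_blastx_hits(open("hits.tsv"))` (file name hypothetical) would return one entry per query sequence, and `qstart`/`qend` could then be used to slice each nucleotide sequence before translating it.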
Mensur Dlakic wrote (12 months ago):

There is a nice set of biosequence conversion tools in Easel. For your purpose, this command should do the trick:

esl-translate -l 2000 sequence.fna > sequence.faa

It specifically asks for ORFs larger than 2000 residues, which is presumably what you need. However, it can find ORFs in all 6 reading frames if needed.
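
For anyone who prefers to stay in a script, the same idea can be sketched in plain Python. This is not how esl-translate is implemented; it is a simple illustration that scans the three forward reading frames and keeps the longest ORF that begins with ATG, so the start position does not need to be known in advance. The names translate, longest_orf and parse_fasta are hypothetical, the standard genetic code is assumed, and the reverse strand is ignored.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq):
    """Translate codon by codon; codons with ambiguous bases become 'X'."""
    seq = seq.upper().replace("U", "T")
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def longest_orf(seq, min_len=0):
    """Return (1-based nucleotide start, protein) of the longest forward ORF.

    An ORF here runs from the first ATG after a stop codon to the next
    stop codon, or to the end of the sequence if no stop follows.
    Returns (None, "") when no ORF of at least min_len residues exists.
    """
    best = (None, "")
    for frame in range(3):
        protein = translate(seq[frame:])
        pos = 0  # index in `protein` where the current segment begins
        for segment in protein.split("*"):
            m = segment.find("M")
            if m != -1:
                orf = segment[m:]
                if len(orf) >= min_len and len(orf) > len(best[1]):
                    best = (frame + 3 * (pos + m) + 1, orf)
            pos += len(segment) + 1  # +1 accounts for the stop codon
    return best


def parse_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)
```

Looping over `parse_fasta(open("sequence.fna"))` and calling `longest_orf(seq, min_len=2000)` on each record would mirror the length cutoff used in the command above. Taking the first ATG in the open stretch is a simplification, so the reported start may sit upstream of the real start codon.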
