Question: Nucleotide to amino acid conversion of multiple sequences
lokraj2003 wrote (4 weeks ago):

I have 220 nucleotide sequences. They are around 7000 bp each, and each should translate into one protein of around 2100 amino acids. There are functions already available in R that convert a nucleotide sequence to an amino acid sequence, but the ones I have found, such as translate in seqinr or trans in the ape package, need to be told where translation starts. My sequences don't all have the same start position. For example, one sequence starts at position 650, another at 720, and so on. When I do this manually I use the ExPASy online tool, which works pretty well. Is there any way to translate all these sequences programmatically in R? Is it possible to send my sequences to the ExPASy web page from RStudio and retrieve the amino acid sequences? I am comfortable using R/Bioconductor, but I can use Biopython too if there is a way to do it with Biopython.

Thanks !


So, you have multiple sequences with different start points relative to one another? Do you need to perform a multiple sequence alignment before translation, or has this already been done?

Reply written 4 weeks ago by Brice Sarver

I am going to do a selection pressure analysis, so I will have to do a multiple sequence alignment first. It has not been done yet.

Reply written 4 weeks ago by lokraj2003
swbarnes2 (United States) wrote (4 weeks ago):

If you know what they should translate to, I'd use blastx. Then blastx will handle getting the frame right for you. You can give it a multi-fasta of input, then you have to parse the output.


It does work, but parsing the output became tedious. I am still thinking of a good way to parse it.

Reply written 28 days ago by lokraj2003
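
On the parsing problem, one option is to ask blastx for tabular output (-outfmt 6) instead of the default report, which gives one tab-separated line per hit. Below is a minimal sketch, assuming the default 12-column layout of -outfmt 6 and plain Python with no extra packages. The function name best_hits and the example rows are made up for illustration. It keeps the highest-bitscore hit per query and derives the forward reading frame from qstart; reverse-strand hits (qstart greater than qend) are only flagged, because working out their frame needs the query length.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(handle):
    """Return {query id: best-scoring hit} from tab-separated blastx output."""
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(COLUMNS, row))
        for key in ("qstart", "qend", "sstart", "send"):
            hit[key] = int(hit[key])
        hit["bitscore"] = float(hit["bitscore"])
        # qstart > qend is taken to mean a reverse-strand hit
        hit["strand"] = "+" if hit["qstart"] <= hit["qend"] else "-"
        # forward frame (1, 2 or 3) follows from the first aligned base;
        # reverse frames need the query length, so they are left as None
        hit["frame"] = (hit["qstart"] - 1) % 3 + 1 if hit["strand"] == "+" else None
        query = hit["qseqid"]
        if query not in best or hit["bitscore"] > best[query]["bitscore"]:
            best[query] = hit
    return best


# Made-up example rows, not real BLAST results.
example = (
    "seq1\tprotA\t98.5\t2100\t30\t0\t650\t6949\t1\t2100\t0.0\t4000\n"
    "seq1\tprotB\t60.0\t500\t200\t3\t700\t2199\t1\t500\t1e-50\t300\n"
    "seq2\tprotA\t97.0\t2100\t60\t0\t720\t7019\t1\t2100\t0.0\t3900\n"
)
hits = best_hits(io.StringIO(example))
for query, hit in hits.items():
    print(query, hit["sseqid"], hit["qstart"], hit["qend"], hit["frame"])
```

With a real file you would pass an open file handle instead of the io.StringIO object. The qstart and qend values of the best hit then tell you which stretch of each nucleotide sequence to translate.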
Mensur Dlakic (USA) wrote (4 weeks ago):

There is a nice set of biosequence conversion tools in Easel, the sequence analysis library distributed with HMMER. For your purpose, this command should do the trick:

esl-translate -l 2000 sequence.fna > sequence.faa

It specifically asks for ORFs larger than 2000 residues, which is presumably what you need. However, it can find ORFs in all 6 reading frames if needed.
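
If you would rather stay in a script, the same idea (translate every frame, keep the longest ORF, discard anything below a length cutoff) can be sketched in plain Python. This is only an illustration and not a reimplementation of esl-translate. The function names are made up, the standard genetic code is assumed, and as a simplification of my own an ORF here runs from the first M to the next stop codon or to the end of the sequence.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI table 1), codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate_frame(seq, offset):
    """Translate seq starting at offset; codons not in the table become X."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(offset, len(seq) - 2, 3))


def longest_orf(seq, min_len=0, both_strands=True):
    """Longest M-initiated ORF over all frames, or None if below min_len."""
    seq = seq.upper().replace("U", "T")
    strands = [seq, reverse_complement(seq)] if both_strands else [seq]
    best = ""
    for strand in strands:
        for offset in range(3):
            # stop codons are "*", so splitting on them yields candidate ORFs;
            # the last piece has no stop codon and runs to the sequence end
            for segment in translate_frame(strand, offset).split("*"):
                start = segment.find("M")
                if start != -1 and len(segment) - start > len(best):
                    best = segment[start:]
    return best if len(best) >= min_len else None


# Toy sequence: ATG GCT AAA TGA sits in the third forward frame.
print(longest_orf("CCATGGCTAAATGAGG"))
```

For the sequences in the question you would call longest_orf(sequence, min_len=2000) on each FASTA record, so that anything without a long enough ORF comes back as None and can be checked by hand.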
