Question: ORF translation of a huge number of DNA sequences
l.souza (Brasilia, Brazil) wrote 2.0 years ago:

I'm looking for software that can translate the open reading frames of a lot of sequences at the same time. I've tried to use EMBOSS Sixpack, but even in the local software there's a limit of sequences to input. What can I use?

Tags: dna, translation, sequence, orf
modified 2.0 years ago by h.mon • written 2.0 years ago by l.souza

Do you have to extract the ORFs, or do your DNA sequences already run from ATG to a stop codon? Either way, Biopython can make quick work of it. Also, see this post.

written 2.0 years ago by st.ph.n

I need to extract the ORF! Can I do this with Biopython?

written 2.0 years ago by l.souza

You will need to know a little more about your sequences. Are the ORFs in the forward or reverse direction? If reverse, you'll need to take the reverse complement first. Then find 'ATG' by scanning in windows of 3 nucleotides, and search in the same frame for a stop codon. Once you have found 'ATG', you can translate in the forward direction from its first position. Do you happen to know the approximate length of the ORFs?

written 2.0 years ago by st.ph.n

They are in the forward direction, and they are about 7000 nt long.

written 2.0 years ago by l.souza
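For forward-strand ORFs like these, the search-then-translate steps described above could be sketched in plain Python roughly as follows. This is a minimal sketch under stated assumptions, not a tested pipeline: the function names (read_fasta, longest_forward_orf, translate) are made up for illustration, the standard genetic code is assumed, and the "ORF" is taken to be the longest ATG-to-stop stretch in any of the three forward frames. Records are read one at a time, so nothing here caps the number of sequences. Biopython's SeqIO.parse and Seq.translate could replace the hand-written reader and codon table.

```python
import io

# Standard genetic code (NCBI translation table 1), listed in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))
STOP_CODONS = {"TAA", "TAG", "TGA"}


def read_fasta(handle):
    """Yield (header, sequence) pairs one record at a time."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def longest_forward_orf(seq):
    """Return the longest ATG..stop stretch (stop included) in the three
    forward frames, or an empty string if none is found."""
    seq = seq.upper()
    best = ""
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG after the previous stop in this frame
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start > len(best):
                    best = seq[start:i + 3]
                start = None
    return best


def translate(orf):
    """Translate codon by codon up to the first stop; unknown codons become 'X'."""
    protein = []
    for i in range(0, len(orf) - 2, 3):
        amino_acid = CODON_TABLE.get(orf[i:i + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    # Tiny in-memory FASTA for demonstration; swap in open("seqs.fasta") for a real file.
    demo = io.StringIO(">seq1\nCCATGAAATTT\nTAGGG\n")
    for header, seq in read_fasta(demo):
        print(f">{header}\n{translate(longest_forward_orf(seq))}")
```

Sequences without a complete ATG-to-stop ORF come out as an empty protein here; whether to skip them or report them is a choice left to the caller.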

See "Emboss sixpack replacement": download and install the standalone version.

written 2.0 years ago by Pierre Lindenbaum
h.mon (Brazil) wrote 2.0 years ago:

You can split your file into smaller files with GenomeTools (gt splitfasta -numfiles 60 seqs.fasta), faSplit by Jim Kent or fasta-splitter by Kirill Kryukov, then loop over the files or process them in parallel.

written 2.0 years ago by h.mon

Sorry, but I don't understand why I should split my files into smaller ones...

written 2.0 years ago by l.souza

"but even in the local software there's a limit of sequences to input"

written 2.0 years ago by h.mon

I mean a limit on the number of sequences.

written 2.0 years ago by l.souza

fasta-splitter and faSplit can split by number of sequences.

I thought gt could split by number of sequences, but apparently it can't.

written 2.0 years ago by h.mon
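If none of those tools is at hand, splitting by number of sequences can also be sketched in a few lines of plain Python. This is only an illustration of the idea, not how fasta-splitter or faSplit work internally: the function name split_fasta, its parameters and the output naming scheme (prefix.1.fasta, prefix.2.fasta, ...) are assumptions made up for this example.

```python
def split_fasta(path, records_per_file, out_prefix):
    """Write consecutive chunks of `records_per_file` FASTA records to
    numbered files and return the list of paths that were written."""
    if records_per_file < 1:
        raise ValueError("records_per_file must be at least 1")
    out_paths = []
    out = None
    count = 0  # number of records seen so far
    try:
        with open(path) as src:
            for line in src:
                if line.startswith(">"):
                    # Start a new output file every `records_per_file` records.
                    if count % records_per_file == 0:
                        if out is not None:
                            out.close()
                        out_path = f"{out_prefix}.{len(out_paths) + 1}.fasta"
                        out = open(out_path, "w")
                        out_paths.append(out_path)
                    count += 1
                # Lines before the first header are ignored.
                if out is not None:
                    out.write(line)
    finally:
        if out is not None:
            out.close()
    return out_paths
```

For example, split_fasta("seqs.fasta", 1000, "chunk") would write files of at most 1000 records each; the returned paths can then be looped over or handed to whatever runs the translation.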

I'm going to try it... But I would be glad to find something that runs everything at once.

written 2.0 years ago by l.souza