Question: Need Help: How can I run EMBOSS sixpack on a multi-sequence fasta file?
20 months ago by
shiv10
shiv10 wrote:

Hello everyone,

I have ~10 fasta files, each containing more than 50 sequences, and I want the 6-frame translation and ORFs for each sequence. I found EMBOSS sixpack for this, but when I went through the manual I learned that it takes only a single sequence as input. Can you please suggest other options (maybe I've missed some)? Is there any way to do this without splitting the files?

Thanks in advance

assembly gene • 679 views
modified 20 months ago by Michael Dondrup48k • written 20 months ago by shiv10
20 months ago by
Bergen, Norway
Michael Dondrup48k wrote:

This could be a workflow combining your favorite solution of:

  1. How To Split A Multiple Fasta or this aw(k)esome code by Pierre: A: Is there a way to split single .txt file with multiple fasta sequences into indi

  2. Bash Loop For Job Submission and here: A: Bash Loop For Job Submission (the sixpack call needs to be fully parameterized, otherwise sixpack prompts for user input, and it doesn't seem to read from STDIN or write to STDOUT)

This should work on Linux and Mac without installing any additional software:

awk '/^>/ {if(x>0) close(outname); x++; outname=sprintf("_%d.fa",x); print > outname;next;} {if(x>0) print >> outname;}' *.fasta

for f in _*.fa
do
    sixpack -sequence "$f" -outfile "$f.sixpack.out" -outseq "$f.sixpack.fa"
done

If you want to have the output in a single file, use cat to combine them.

modified 20 months ago • written 20 months ago by Michael Dondrup48k
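
For readers who would rather stay in one language, the split-then-loop workflow in the answer above can be sketched in Python. This is only a sketch, not the answer's own method: the function names `split_fasta` and `sixpack_command` are made up for illustration, the `_N.fa` naming mirrors the awk one-liner, and the sixpack options are the same three used in the bash loop. It assumes `sixpack` is on the PATH when the commands are actually run.

```python
import subprocess
from pathlib import Path


def split_fasta(fasta_paths, out_dir="."):
    """Write each record of the given FASTA files to its own _N.fa file."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    count = 0
    out = None
    try:
        for path in fasta_paths:
            with open(path) as handle:
                for line in handle:
                    if line.startswith(">"):
                        # A header line starts a new record: switch output file.
                        if out is not None:
                            out.close()
                        count += 1
                        target = out_dir / f"_{count}.fa"
                        out = open(target, "w")
                        written.append(target)
                    if out is not None:
                        # Lines before the first header are skipped,
                        # as in the awk version.
                        out.write(line)
    finally:
        if out is not None:
            out.close()
    return written


def sixpack_command(fasta_file):
    """Build the sixpack call from the bash loop as an argument list."""
    f = str(fasta_file)
    return ["sixpack", "-sequence", f,
            "-outfile", f + ".sixpack.out",
            "-outseq", f + ".sixpack.fa"]


if __name__ == "__main__":
    inputs = sorted(Path(".").glob("*.fasta"))
    for record_file in split_fasta(inputs):
        # check=True stops at the first failing sixpack run.
        subprocess.run(sixpack_command(record_file), check=True)
```

Passing the command as a list avoids any shell quoting of file names, which is the same concern the quoted `"$f"` addresses in the bash loop.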

I think the last line of this answer might be what you want to do. Here's an example to concatenate files:

cat *.fasta > bigfasta.fasta

Also, I'm pretty sure the EMBOSS suite is available on a public Galaxy server somewhere for easy access.

modified 20 months ago • written 20 months ago by colindaven2.6k

Hi,

Thanks, Michael. I am trying to run your suggestion (the awk command) from Python, but I get a syntax error every time. Can you please help me with this? The code is below:

import subprocess

cmd = "awk '/^>/ {if(x>0) close(outname); x++; outname=sprintf("_%d.faa",x); print > outname;next;} {if(x>0) print >> outname;}' {}".format(FASTA_FILE_PATH)

subprocess.call(cmd, shell=True)

  • FASTA_FILE_PATH : full path of input fasta file
modified 20 months ago • written 20 months ago by shiv10

Why call awk from Python? Anyway, you have to adapt the quotes, e.g. escape them or use alternative quotes, like qq in Perl. I don't know if Python has similar functionality.

Of course it has something similar... https://stackoverflow.com/questions/29559905/does-python-have-an-equivalent-of-perls-qq

but simply not as powerful 💪

modified 20 months ago • written 20 months ago by Michael Dondrup48k
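
One possible way to sidestep the quoting problem discussed above, sketched under a couple of assumptions: the awk program is kept in a triple-quoted string so its inner double quotes need no escaping, and the command is handed to `subprocess` as an argument list without `shell=True`, so no shell quoting is involved at all. Keeping `str.format` away from the program text also matters, because awk's own `{` and `}` would be read as format fields. The helper names `awk_split_command` and `split_with_awk` are made up for this example.

```python
import subprocess

# The awk program from the answer, with the .faa suffix used in the question.
# Triple quotes let the inner double quotes stand as they are.
AWK_PROGRAM = r'''/^>/ {if(x>0) close(outname); x++; outname=sprintf("_%d.faa",x); print > outname;next;} {if(x>0) print >> outname;}'''


def awk_split_command(fasta_path):
    """Return the argument list for awk; no shell, so no quote escaping."""
    return ["awk", AWK_PROGRAM, str(fasta_path)]


def split_with_awk(fasta_path):
    """Run awk on one FASTA file and return its exit status."""
    return subprocess.call(awk_split_command(fasta_path))
```

With the variable from the question, the call would be `split_with_awk(FASTA_FILE_PATH)`; the split files are written to the current working directory, as with the original one-liner.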

Hi,

As I said, I have more than one fasta file, so I just tried to write a Python script to get the path of each input fasta file. My problem is solved now. Thank you so much for your help!

written 20 months ago by shiv10