Question: How to specify output format in standalone EMBOSS skipredundant?
Dineshkumar K (Kasaragod, Kerala, India) wrote, 25 days ago:

Hi all, I have been building a non-redundant sequence database for BLAST analysis, so I used the standalone EMBOSS skipredundant tool. It works, but the FASTA headers in the output file have been renamed to EMBOSS1, EMBOSS2, etc. I need to keep the original FASTA headers in the output file. Could anyone help me do that? Thanks in advance.

genomax (United States) wrote, 25 days ago:

Check this option:

 Advanced (Unprompted) qualifiers:
   -feature            toggle     Sequence feature information will be
                                  retained if this option is set.

Thank you, genomax. I used the following command line, but it fails with this error:

ga@pls:~/Desktop/test$ skipredundant -threshold 95.0 -feature toggle 
Remove redundant sequences from an input set
Error: Failed to open filename 'toggle'
Error: Unable to read sequence 'toggle'
Died: skipredundant terminated: Bad value for '-sequences' and no prompt
written 25 days ago (modified 25 days ago) by Dineshkumar K

You just need to use -feature on its own, without a value. Including the flag switches the option on, so it acts as a toggle.

BTW: How did your fasta headers get renamed? I tested a small sample and mine did not.

written 25 days ago (modified 25 days ago) by genomax

I used the following command and got this response in the terminal:

skipredundant -feature

ga@pls:~/Desktop/test$ skipredundant -feature
Remove redundant sequences from an input set
Input sequence set: test1.fasta
Redundancy removal options
         1 : Single threshold percentage sequence similarity
         2 : Outside a range of acceptable threshold percentage similarities
Select number [1]: 1
The percentage sequence identity redundancy threshold. [95.0]:
Gap opening penalty [10.0]:
Gap extension penalty [0.5]:
output sequence(s) [test1.keep]: dineshtest1.fasta
Redundant sequences (optional): dineshtest2.fasta
Warning: No features written to output file 'dineshtest1.gff'
Warning: No features written to output file 'dineshtest2.gff'

The FASTA headers have been changed like this:

>EMBOSS_001
>EMBOSS_002
written 24 days ago (modified 24 days ago) by Dineshkumar K

The online EMBOSS skipredundant server also shows an error:

ERROR   application terminated
Error: Failed to open filename 'test1.fasta'
Error: Unable to read sequence 'test1.fasta'
Died: skipredundant terminated: Bad value for '-sequences' with -auto defined
written 24 days ago (modified 24 days ago) by Dineshkumar K

What do your original headers look like? Please post the output of grep "^>" your.fasta | head -5. It is possible that you have spaces in your FASTA headers and that the part before the first space is not unique across your sequences. That may be why the headers are being changed.
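
One way to check this hypothesis is to list every identifier (the header text between ">" and the first whitespace) that occurs more than once. The following is a minimal sketch using standard Unix tools; the function name dup_ids is made up for illustration, and your.fasta is a placeholder for the input file.

```shell
# Hypothetical helper: list FASTA identifiers that occur more than once.
# The identifier is assumed to be the header text between '>' and the
# first whitespace character.
dup_ids() {
    grep '^>' "$1" | sed 's/^>//' | awk '{print $1}' | sort | uniq -d
}

# Usage (your.fasta is a placeholder):
#   dup_ids your.fasta
```

If this prints nothing, the identifiers are already unique. Any identifier it does print is shared by at least two sequences.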

In my test, the headers were left alone.

written 24 days ago by genomax

Thank you, genomax. As you said, the headers were changed because of the spaces in my input FASTA headers. Once I removed the spaces, the original headers were retained.
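
For anyone hitting the same problem, one possible way to remove the spaces is to rewrite only the header lines, replacing each run of whitespace with an underscore so that the whole description becomes part of the identifier. This is a sketch rather than the exact method used in this thread: the function name fix_headers and the file names are placeholders, and the underscore is just one possible replacement character.

```shell
# Hypothetical helper: replace whitespace in FASTA header lines with underscores.
# Trailing whitespace on a header is dropped first; sequence lines are
# printed unchanged.
fix_headers() {
    awk '/^>/ { sub(/[ \t]+$/, ""); gsub(/[ \t]+/, "_") } { print }' "$1"
}

# Usage (file names are placeholders):
#   fix_headers input.fasta > input.nospace.fasta
```

This only helps if the full header lines differ from one another. Two sequences with identical headers will still share an identifier after the rewrite.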

written 24 days ago by Dineshkumar K