How to specify output format in standalone emboss-skipredundant?
2.6 years ago
Kumar ▴ 50

Hi all, I have been building a non-redundant sequence database for BLAST analysis, using the standalone emboss-skipredundant tool. It works, but the FASTA headers in the output have been renamed to EMBOSS1, EMBOSS2, etc. I need to keep the original FASTA headers in the output file. Could anyone help me do that? Thanks in advance.

emboss skipredundant fasta bioinformatics • 759 views
2.6 years ago
GenoMax 107k

Check this option

 Advanced (Unprompted) qualifiers:
   -feature            toggle     Sequence feature information will be
                                  retained if this option is set.


Thank you genomax. I used the following command line, but it fails with this error:

ga@pls:~/Desktop/test$ skipredundant -threshold 95.0 -feature toggle
Remove redundant sequences from an input set
Error: Failed to open filename 'toggle'
Error: Unable to read sequence 'toggle'
Died: skipredundant terminated: Bad value for '-sequences' and no prompt

You just need to use -feature. Including that should set the feature to on. So it acts as a toggle.

BTW: How did your fasta headers get renamed? I tested a small sample and mine did not.

I used the following command and got the response like this on the terminal,

ga@pls:~/Desktop/test$ skipredundant -feature
Remove redundant sequences from an input set
Input sequence set: test1.fasta
Redundancy removal options
         1 : Single threshold percentage sequence similarity
         2 : Outside a range of acceptable threshold percentage similarities
Select number [1]: 1
The percentage sequence identity redundancy threshold. [95.0]:
Gap opening penalty [10.0]:
Gap extension penalty [0.5]:
output sequence(s) [test1.keep]: dineshtest1.fasta
Redundant sequences (optional): dineshtest2.fasta
Warning: No features written to output file 'dineshtest1.gff'
Warning: No features written to output file 'dineshtest2.gff'

The fasta headers have been changed like this,

>EMBOSS_001
>EMBOSS_002


The online EMBOSS skipredundant server also shows an error:

ERROR   application terminated
Error: Failed to open filename 'test1.fasta'
Error: Unable to read sequence 'test1.fasta'
Died: skipredundant terminated: Bad value for '-sequences' with -auto defined


What do your original headers look like? (Post the output of `grep "^>" your.fasta | head -5`.) It is possible that you have spaces in your fasta headers and the part before the first space is not unique across your sequences. That may be the reason why the headers are being changed.

In my test, the headers were left alone.
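
The check suggested above can also be scripted. This is only a rough sketch, assuming the sequence ID is whatever precedes the first whitespace in a header line, as described in this comment. The function name `duplicate_fasta_ids` and the example headers are made up for illustration and are not part of EMBOSS.

```python
from collections import Counter


def duplicate_fasta_ids(lines):
    """Return IDs (header text before the first whitespace) that occur more than once."""
    ids = []
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            # An empty header (">" alone) is recorded as an empty ID.
            ids.append(fields[0] if fields else "")
    counts = Counter(ids)
    return sorted(seq_id for seq_id, n in counts.items() if n > 1)


# Hypothetical headers: the first two share the ID "sample" once split at the space.
example = [
    ">sample 1 protein A",
    "MKV",
    ">sample 2 protein B",
    "MKL",
    ">other_1 protein C",
    "MKT",
]
print(duplicate_fasta_ids(example))
```

To run it on a real file, pass the open file handle instead of the list, for example `duplicate_fasta_ids(open("your.fasta"))`. A non-empty result would be consistent with the explanation given here, though it does not prove that this is why the headers were renamed.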


Thank you genomax. As you said, the header changes happened because of the spaces in my input FASTA headers. Once I removed the spaces, the original headers were retained.
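
For anyone hitting the same problem, one possible way to remove the spaces before running skipredundant is sketched below. It assumes that replacing each run of whitespace with an underscore is acceptable for your headers. The helper name `sanitize_header` and the sample records are hypothetical.

```python
import re


def sanitize_header(line, sep="_"):
    """Replace whitespace runs in a FASTA header with `sep`; return other lines unchanged.

    Expects lines without a trailing newline (e.g. from str.splitlines()).
    """
    if not line.startswith(">"):
        return line
    return ">" + re.sub(r"\s+", sep, line[1:].strip())


# Hypothetical records with spaces in the headers.
records = [">sample 1 protein A", "MKV", ">sample 2 protein B", "MKL"]
cleaned = [sanitize_header(line) for line in records]
print("\n".join(cleaned))
```

Joining on underscores keeps the whole original header as a single token, so each ID stays unique as long as the full headers were unique to begin with.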