Question: How to specify output format in stand alone emboss-SKIPREDUNDANT?
Dineshkumar K wrote:

Hi all, I am building a non-redundant sequence database for BLAST analysis using the standalone EMBOSS skipredundant tool. It works, but the FASTA headers in the output have been renamed to EMBOSS1, EMBOSS2, etc. I need to keep the original FASTA headers in the output file. Could anyone help me do this? Thanks in advance.

written 10 months ago by Dineshkumar K
genomax wrote:

Check this option

 Advanced (Unprompted) qualifiers:
   -feature            toggle     Sequence feature information will be
                                  retained if this option is set.
modified 10 months ago • written 10 months ago by genomax

Thank you, genomax. I used the following command line, but it fails with this error:

ga@pls:~/Desktop/test$ skipredundant -threshold 95.0 -feature toggle 
Remove redundant sequences from an input set
Error: Failed to open filename 'toggle'
Error: Unable to read sequence 'toggle'
Died: skipredundant terminated: Bad value for '-sequences' and no prompt
modified 10 months ago • written 10 months ago by Dineshkumar K

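You just need to use -feature on its own, without a value. "toggle" in the help text is the option's type, not an argument: including -feature switches the option on, and anything you type after it is read as the input sequence file, which is why the program tried to open a file named 'toggle'.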

BTW: How did your fasta headers get renamed? I tested a small sample and mine did not.

modified 10 months ago • written 10 months ago by genomax

I used the following command and got this response on the terminal:

ga@pls:~/Desktop/test$ skipredundant -feature
Remove redundant sequences from an input set
Input sequence set: test1.fasta
Redundancy removal options
    1 : Single threshold percentage sequence similarity
    2 : Outside a range of acceptable threshold percentage similarities
Select number [1]: 1
The percentage sequence identity redundancy threshold. [95.0]:
Gap opening penalty [10.0]:
Gap extension penalty [0.5]:
output sequence(s) [test1.keep]: dineshtest1.fasta
Redundant sequences (optional): dineshtest2.fasta
Warning: No features written to output file 'dineshtest1.gff'
Warning: No features written to output file 'dineshtest2.gff'

The fasta headers have been changed like this,

>EMBOSS_001
>EMBOSS_002
modified 10 months ago • written 10 months ago by Dineshkumar K

The online EMBOSS skipredundant server also reports an error:

ERROR   application terminated
Error: Failed to open filename 'test1.fasta'
Error: Unable to read sequence 'test1.fasta'
Died: skipredundant terminated: Bad value for '-sequences' with -auto defined
modified 10 months ago • written 10 months ago by Dineshkumar K

What do your original headers look like? Please post the output of grep "^>" your.fasta | head -5. It is possible that you have spaces in your FASTA headers and that the part before the first space is not unique across your sequences. That may be why the headers are being changed.

In my test, the headers were left alone.

written 10 months ago by genomax
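
The check suggested above can be sketched as a short shell snippet. This is only an illustration: the sample headers and the temporary file are made up, and it assumes the sequence ID is the text between '>' and the first whitespace, as described in the reply.

```shell
# Build a small example FASTA file (hypothetical headers, for illustration only).
fasta=$(mktemp)
cat > "$fasta" <<'EOF'
>sp P12345 protein A
MKTAYIAKQR
>sp P67890 protein B
MKTAYIAKQL
>tr|Q11111 protein C
MSSHEGGKKK
EOF

# Take the ID (text after '>' up to the first whitespace) from each header
# and keep only the IDs that occur more than once.
dup_ids=$(grep '^>' "$fasta" | sed 's/^>//' | awk '{print $1}' | sort | uniq -d)

if [ -n "$dup_ids" ]; then
    echo "Non-unique IDs before the first space:"
    echo "$dup_ids"
else
    echo "All IDs are unique."
fi
```

To run it on real data, point the fasta variable at your own file instead of the generated sample. Any ID it lists is shared by more than one sequence.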

Thank you, genomax. As you said, the headers were changed because of the spaces in my input FASTA headers. Once I removed the spaces, the original headers were retained.

written 10 months ago by Dineshkumar K
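
One possible way to remove the spaces from the headers is sketched below. The thread does not say how it was actually done, so replacing each space with an underscore is an assumption, and the sample file and headers are hypothetical.

```shell
# Build a small example FASTA file (hypothetical headers, for illustration only).
in=$(mktemp)
out=$(mktemp)
cat > "$in" <<'EOF'
>sp P12345 protein A
MKTAYIAKQR
>sp P67890 protein B
MKTAYIAKQL
EOF

# On header lines only (those starting with '>'), replace every space with '_'.
# Sequence lines are written through unchanged.
sed '/^>/ s/ /_/g' "$in" > "$out"

cat "$out"
```

Because the whole header then becomes a single token, two headers that previously shared the same first word stay distinct as long as the full header lines differ.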