Question: Error extracting faa sequences from multifasta using faidx
4.1 years ago by
dago2.6k
Germany
dago2.6k wrote:

I have a multifasta .faa file from which I want to extract some sequences by ID.

Multifasta example:

>NITMOv2_RS22300
MAKTVAVVIREDPRRTHRPVEALRIALGLVAGNHATTVVLLNEAARLLSEDTDDVVDVEI
LEKYLPSIQQLEVPFVLPEFIDRSGVRTDFAVRYESDDTIRRLLQSMDRTLVF
>NITMOv2_RS22305
MSLSSSVYLIRKSAAALSPTLYVSGDSDWVVVEIGEDKRSSDYRELLELVLHAEKVITL

Ids example:

NITINOP_v2_3300
NITINOP_v2_3307

I usually do this with the following command:

xargs faidx -d "" MULTIFASTA < Ids

This has always worked fine, but with some new files it started giving the following error, which I cannot understand:

Traceback (most recent call last):
  File "/usr/local/bin/faidx", line 9, in <module>
    load_entry_point('pyfaidx==0.3.4', 'console_scripts', 'faidx')()
  File "/usr/local/lib/python2.7/dist-packages/pyfaidx/cli.py", line 132, in main
    write_sequence(args)
  File "/usr/local/lib/python2.7/dist-packages/pyfaidx/cli.py", line 33, in write_sequence
    fasta = Fasta(args.fasta, default_seq=args.default_seq, strict_bounds=not args.lazy, split_char=args.delimiter)
  File "/usr/local/lib/python2.7/dist-packages/pyfaidx/__init__.py", line 527, in __init__
    read_ahead=read_ahead, mutable=mutable, split_char=split_char)
  File "/usr/local/lib/python2.7/dist-packages/pyfaidx/__init__.py", line 218, in __init__
    raise FastaIndexingError(e)

Any suggestions? What am I missing here?

software error genome • 1.3k views
modified 4.1 years ago by geek_y10k • written 4.1 years ago by dago2.6k

Alternatively, you can use faSomeRecords:

./faSomeRecords input.faa ids.txt output.faa

Here is how: "A: perl code to extract sequences from multi-line fasta works on all test files but"
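If neither tool is at hand, the same idea (keep only the records whose ID is in a list) can be sketched in plain Python. This is only an illustration, not how faSomeRecords or faidx work internally. The function name `extract_records` is made up here, and it assumes the record ID is the header text after `>` up to the first whitespace.

```python
def extract_records(fasta_lines, wanted_ids):
    """Yield the lines of every FASTA record whose ID is in wanted_ids.

    Assumption: the ID is the header text after '>' up to the first
    whitespace. Adjust the parsing if your headers differ.
    """
    wanted = set(wanted_ids)
    keep = False  # True while we are inside a wanted record
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            keep = bool(fields) and fields[0] in wanted
        if keep:
            yield line


# Small in-memory demo built from the example in the question.
example = [
    ">NITMOv2_RS22300\n",
    "MAKTVAVVIREDPRRTHRPVEALRIALGLVAGNHATTVVLLNEAARLLSEDTDDVVDVEI\n",
    "LEKYLPSIQQLEVPFVLPEFIDRSGVRTDFAVRYESDDTIRRLLQSMDRTLVF\n",
    ">NITMOv2_RS22305\n",
    "MSLSSSVYLIRKSAAALSPTLYVSGDSDWVVVEIGEDKRSSDYRELLELVLHAEKVITL\n",
]
print("".join(extract_records(example, ["NITMOv2_RS22305"])), end="")
```

For real files, pass an open file handle as `fasta_lines` and the stripped lines of the ID file as `wanted_ids`.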

modified 4 months ago by RamRS25k • written 4.1 years ago by venu6.3k

Thanks very much. This looks like another good option.

However, whenever I try to run it I get the following error:

Unrecognized character \x7F; marked by <-- HERE after <-- HERE near column 1 at faSomeRecords line 1.

I tried to look at the code, but I cannot even open it. Any suggestions?

EDIT

I had not made it executable, sorry! Thanks, it worked!

modified 4 months ago by RamRS25k • written 4.1 years ago by dago2.6k
4.1 years ago by
geek_y10k
Barcelona
geek_y10k wrote:

Try with proper IDs.

NITINOP_v2_3300
NITINOP_v2_3307

Neither of them is present in your example FASTA, and it works fine if I use NITMOv2_RS22300. So the problem might be a mismatch between the IDs in the FASTA file and those in Ids.txt.
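A quick way to test this is to list the requested IDs that never appear as a FASTA header. The sketch below uses only the standard library. The function names `fasta_ids` and `missing_ids` are made up for illustration, and it assumes the ID is the header text up to the first whitespace. It also strips whitespace from each requested ID, since a stray carriage return or trailing space would be enough to break an exact match.

```python
def fasta_ids(fasta_lines):
    """Collect record IDs: header text after '>' up to the first whitespace."""
    ids = set()
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids


def missing_ids(fasta_lines, wanted_ids):
    """Return the requested IDs (whitespace-stripped) absent from the FASTA."""
    present = fasta_ids(fasta_lines)
    cleaned = (raw.strip() for raw in wanted_ids)
    return [name for name in cleaned if name and name not in present]


# Demo with the headers and IDs quoted in the question.
fasta = [
    ">NITMOv2_RS22300\n",
    "MAKTVAVVIREDPRRTHRPVEALRIALGLVAGNHATTVVLLNEAARLLSEDTDDVVDVEI\n",
    ">NITMOv2_RS22305\n",
    "MSLSSSVYLIRKSAAALSPTLYVSGDSDWVVVEIGEDKRSSDYRELLELVLHAEKVITL\n",
]
wanted = ["NITINOP_v2_3300\n", "NITINOP_v2_3307\n", "NITMOv2_RS22300\r\n"]
print(missing_ids(fasta, wanted))
# prints ['NITINOP_v2_3300', 'NITINOP_v2_3307']
```

An empty list means every requested ID has a matching header, so the cause would have to be something else.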

modified 4 months ago by RamRS25k • written 4.1 years ago by geek_y10k

Thanks. What I posted was just an example. I checked, and my IDs are present in the multifasta I am using.
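If the IDs really do match, another thing that may be worth ruling out is the layout of the new files. The traceback shows the failure happens while the index is being built, before any ID is looked up. As far as I know, faidx-style indexes expect every sequence line within a record to have the same length except the last one, and blank lines inside a record also cause trouble. Whether that is what happened here is only a guess. The sketch below (the function name `find_ragged_records` is made up) reports the first offending lines so the file can be inspected.

```python
def find_ragged_records(fasta_lines):
    """Return (record_id, line_number) pairs for lines that break uniform wrapping.

    Rule checked: within a record, all sequence lines share one length,
    except that the final line may be shorter. A blank line is treated as
    the end of the record, so any sequence after it is reported.
    """
    problems = []
    record = None   # ID of the record being read
    width = None    # length of the record's first sequence line
    ended = False   # True once a short or blank line has been seen
    for number, raw in enumerate(fasta_lines, start=1):
        line = raw.rstrip("\r\n")
        if line.startswith(">"):
            fields = line[1:].split()
            record = fields[0] if fields else ""
            width, ended = None, False
            continue
        if record is None:
            continue  # ignore anything before the first header
        if not line:
            ended = True
            continue
        if ended or (width is not None and len(line) > width):
            problems.append((record, number))
            continue
        if width is None:
            width = len(line)
        elif len(line) < width:
            ended = True
    return problems


# Demo: the second record has a blank line in the middle.
demo = [">seq1\n", "MAKT\n", "MA\n", ">seq2\n", "MSLS\n", "\n", "MSLS\n"]
print(find_ragged_records(demo))
# prints [('seq2', 7)]
```

An empty result does not prove the file is fine for faidx; it only rules out this particular kind of problem.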

modified 4 months ago by RamRS25k • written 4.1 years ago by dago2.6k