Using xdformat on FASTA from REPBASE returns an error
8.3 years ago

I am trying to use xdformat to make xdf files from fasta files downloaded from REPBASE - http://www.girinst.org/server/archive/RepBase14.04.

I concatenated all the references:

cat RepBase14.04.fasta/*.ref RepBase14.04.fasta/appendix/*.ref > repbase_all.fasta

and ran the following xdformat command:

./xdformat -n -o db/repbase_all.fasta ~/projects/REPCLASS/1.0.1/repbase/repbase_all.fasta

However, I am getting this error:

XDFORMAT 3.0PE-AB [2009-10-30] [macosx-10.5-x64-I32LPF64 2009-10-30T16:56:42]
Start:  2016-01-30T15:13:46
XDF Output Database:  db/repbase_all.fasta
 Alphabet:  NCBI2na.1
Input: "~/projects/REPCLASS/1.0.1/repbase/repbase_all.fasta"
  In sequence "Mariner-N4_NV" (record no. 11074):
    Invalid letter code encountered:  "l" (0x6c hex)
    Invalid letter code encountered:  "e" (0x65 hex)
XDF database removed

I tried the same steps with the latest version of RepBase but got the same error.

The xdformat program was downloaded from AB-BLAST - http://www.advbiocomp.com/blast.html

Could someone please let me know if they have resolved a similar error? Or is there an alternate version of xdformat which doesn't have this problem?

blast transposons xdformat repbase • 1.8k views

I don't think the problem is with your xdformat program. Your input contains characters that are not valid nucleotide letter codes, as the error message reports ("l" and "e"). Have a look at the sequence entry named "Mariner-N4_NV" (record no. 11074) in your repbase_all.fasta file.
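
If it helps to locate the offending lines, here is a minimal Python sketch that scans a FASTA file and reports every sequence line containing letters outside the IUPAC nucleotide codes. The function name `find_invalid_letters` and the allowed set are my own assumptions for illustration; the exact alphabet that xdformat accepts with -n may differ, so treat the output as a pointer to suspicious records rather than a definitive check.

```python
import sys

# Assumption: the standard IUPAC nucleotide codes. The alphabet xdformat
# actually accepts with -n may differ slightly from this set.
IUPAC_NT = set("ACGTURYSWKMBDHVN")


def find_invalid_letters(lines, allowed=IUPAC_NT):
    """Yield (record_id, line_number, invalid_chars) for each offending sequence line."""
    record_id = None
    for line_no, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: keep the first whitespace-delimited token as the record ID.
            parts = line[1:].split()
            record_id = parts[0] if parts else ""
            continue
        # Collect the distinct characters whose uppercase form is not allowed.
        bad = sorted({c for c in line if c.upper() not in allowed})
        if bad:
            yield record_id, line_no, bad


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python find_invalid.py repbase_all.fasta
    with open(sys.argv[1]) as handle:
        for rec, line_no, bad in find_invalid_letters(handle):
            print(f"{rec}\tline {line_no}\t{''.join(bad)}")
```

Running it on repbase_all.fasta should list the record ID and line number for each line with unexpected characters, so you can inspect the "Mariner-N4_NV" entry and any others before rerunning xdformat.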
