BioNano fasta file formatting issue
4.8 years ago
Ric ▴ 430

Hi, we have received a FASTA file from BioNano. I ran reformat.sh -Xmx=8080m in=HYBRID_SCAFFOLD.fa out=HYBRID_SCAFFOLD-reformat.fa but got the error below:

   ...    
ATATGAAATGAGTCAGAAAGACCTCTAGTTATAGGTGGTAGAGGTTGTATGTCTGTTGCTTAATAGGCAATCAATCATGCCGATTTTATGCCAAGATAATCGATTATTCACTTTACATAATCGATTATCTTCTAAACGTTCataaaaacggaacg'

This can be bypassed with the flag 'tossjunk', 'fixjunk', or 'ignorejunk'
    at shared.KillSwitch.kill(KillSwitch.java:99)
    at stream.Read.validateCommonCase_branchless(Read.java:375)
    at stream.Read.validate(Read.java:113)
    at stream.Read.<init>(Read.java:76)
    at stream.FastaReadInputStream.generateRead(FastaReadInputStream.java:269)
    at stream.FastaReadInputStream.fillList(FastaReadInputStream.java:183)
    at stream.FastaReadInputStream.hasMore(FastaReadInputStream.java:108)
    at stream.ConcurrentGenericReadInputStream$ReadThread.readLists(ConcurrentGenericReadInputStream.java:664)
    at stream.ConcurrentGenericReadInputStream$ReadThread.run(ConcurrentGenericReadInputStream.java:653)

While running BWA mem I got [bns_restore_core] Parse error reading HYBRID_SCAFFOLD.fasta.amb

What did I miss?

Thank you in advance

assembly sequence genome • 1.0k views

reformat.sh is used for converting files from one format to another. What exactly are you trying to do here? If your FASTA file is corrupt, you could use one of the options noted above, but that may not address the underlying problem.

Use validateFiles from Jim Kent's utilities to make sure your FASTA file is not corrupt. After you download it, add execute permission with chmod u+x validateFiles.


Thank you for the tool. It found AGAGATATTGTCATATGAAATGAGTCAGAAAGACCTCTAGTTATAGGTGGTAGAGGTTGTATGTCTGTTGCTTAATAGGCAATCAATCATGCCGATTTTATGCCAAGATAATCGATTATTCACTTTACATAATCGATTATCTTCTAAACGTTCataaaaacggaacg) Aborting .. found 1 error

Unfortunately, it does not say what the error is or how to fix it.
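One way to see exactly which character the validators are objecting to is to scan the sequence lines for anything outside the expected alphabet. The sketch below is only an illustration: the function name find_bad_characters is made up for this example, and the set of allowed characters (IUPAC nucleotide codes in either case, plus the gap character) is an assumption about what should count as valid, not the rule that reformat.sh or validateFiles applies.

```python
# Allowed characters: IUPAC nucleotide codes in upper and lower case, plus "-".
# This alphabet is an assumption for illustration; adjust it to your data.
VALID = set("ACGTUNRYKMSWBDHV")
VALID |= {base.lower() for base in VALID}
VALID.add("-")


def find_bad_characters(lines):
    """Yield (line_number, column, character) for each unexpected character.

    `lines` is any iterable of FASTA lines, e.g. an open file handle.
    Header lines (starting with ">") are skipped. Positions are 1-based.
    """
    for line_number, line in enumerate(lines, start=1):
        text = line.rstrip("\r\n")  # line endings are not reported
        if text.startswith(">"):
            continue
        for column, char in enumerate(text, start=1):
            if char not in VALID:
                yield line_number, column, char
```

Passing an open handle for HYBRID_SCAFFOLD.fa to find_bad_characters and printing each tuple would list the line and column of every stray character, such as a trailing ) or '.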


It looks like it has issues with the lower case bases, or with the ")" at the end. Perhaps try redownloading the file? Also, if you can write Python, Biopython's read-FASTA-then-write-FASTA methods work well for removing nasty FASTA issues. I used to use them all the time.

I can post a script next week if you like.
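In the meantime, here is a rough sketch of the read-then-rewrite idea. It is not the Biopython script mentioned above: it uses only the standard library as a stand-in, the name clean_fasta is hypothetical, and the decision to delete every character that is not an IUPAC nucleotide code or a gap is an assumption about what counts as junk. Lower case bases are kept as they are, since they usually indicate soft-masking.

```python
import re

# Anything that is not an IUPAC nucleotide code (either case) or "-" is
# treated as junk. This definition of junk is an assumption for illustration.
NON_SEQUENCE = re.compile(r"[^ACGTUNRYKMSWBDHVacgtunrykmswbdhv\-]")


def clean_fasta(lines, width=60):
    """Yield cleaned FASTA lines.

    Header lines are passed through unchanged; sequence lines have junk
    characters removed and are rewrapped to `width` characters per line.
    """
    chunks = []

    def flush():
        # Emit the sequence collected so far, wrapped to the requested width.
        sequence = "".join(chunks)
        chunks.clear()
        for start in range(0, len(sequence), width):
            yield sequence[start:start + width]

    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines
        if text.startswith(">"):
            yield from flush()
            yield text
        else:
            chunks.append(NON_SEQUENCE.sub("", text))
    yield from flush()
```

Writing each yielded line plus a newline to a new file gives a cleaned copy. It is worth comparing sequence lengths before and after, because silently dropping characters can hide a truncated or damaged download.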


Thank you, this would be nice.


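You could drop the lines that consist entirely of lower case bases (note that this command leaves lower case bases on mixed-case lines untouched) by doing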

sed -n '/[^[:lower:] ]/p' your.fa > new.fa

The ) can be removed by

sed 's/)//' your.fa > new.fa

You may only need to remove ) so try that first.
