Usearch error "Zero length sequence not allowed"
9.3 years ago
sentausa ▴ 650

Hi all,

I'm trying to dereplicate the sequences in a 15 GB fasta file using usearch -derep_fulllength, with a command like this:

usearch -derep_fulllength input.fasta -output uniques.fasta -sizeout

but I got the error message: "Zero length sequence not allowed". Does anybody know what it means? Does it mean that my file has a fasta record with no sequence (i.e. one that contains only the > header line)?

My next question is: how can I check whether my file contains such zero-length sequences?

Thank you for your input.

dereplication software-error usearch • 3.2k views
9.3 years ago
Ram 43k

Using Heng Li's bioawk, the command

bioawk -c fastx 'length($seq)==0 { print $name }' <inFile.fasta

should print the name of every zero-length sequence in your file, if there are any.
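
If bioawk is not available, the same check can be sketched in plain Python. This is only a minimal sketch, not a replacement for the bioawk command: the function names `read_fasta` and `find_empty_records` are made up for illustration, and it assumes a plain-text (uncompressed) fasta file in which the record name is the first word after the `>`. It reads one line at a time, so a large file is never loaded into memory as a whole.

```python
import io
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Yield (name, sequence_length) for each fasta record, line by line."""
    name = None
    length = 0
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, length
            # Assumption: the record name is the first word of the header.
            fields = line[1:].split()
            name = fields[0] if fields else ""
            length = 0
        elif name is not None:
            # Sequence lines (possibly wrapped) add to the current record.
            length += len(line)
    # Emit the last record in the file.
    if name is not None:
        yield name, length


def find_empty_records(lines: Iterable[str]) -> List[str]:
    """Return the names of all records whose sequence has length zero."""
    return [name for name, length in read_fasta(lines) if length == 0]


# Small in-memory demo: seq2 has no sequence line, seq3 has only a blank line.
demo = io.StringIO(">seq1\nACGT\n>seq2\n>seq3 sample=A\n\n>seq4\nAC\nGT\n")
print(find_empty_records(demo))
```

For a real file, pass an open file handle instead of the demo text, e.g. `with open("inFile.fasta") as handle: print(find_empty_records(handle))`.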

Thanks again! I'll check it out.

Sorry to bother you again. I got this error when I tried the above command:

syntax error at source line 1
 context is
     >>>  len($seq)= <<< 0 { print $name }

Got it. It should have been

length($seq)==0

I apologize for the typo. It should've been ==. I've updated it now.
