Question: Error about Count_fasta.pl
HZZ0036 wrote 2.1 years ago:

Hello Everyone,

I tried to use the Count_fasta.pl script (https://github.com/sr320/eimd/blob/master/scripts/count_fasta.pl) to get the N50 for an assembled FASTA file. However, when I ran perl Count_fasta.pl file.fa, it showed:

0:99    0

Total length of sequence:       0 bp
Total number of sequences:      0
N25 stats:                      25% of total sequence length is contained in the 0 sequences >=  bp
N50 stats:                      50% of total sequence length is contained in the 0 sequences >=  bp
N75 stats:                      75% of total sequence length is contained in the 0 sequences >=  bp
Total GC count:                 0 bp
Illegal division by zero at /home/aubhxz/Count_fasta.pl line 103.

However, Count_fasta.pl works on other FASTA files, which seems very strange. Does anybody know how to solve this problem? Thanks.

Zhang

Tags: sequencing, assembly
written 2.1 years ago by HZZ0036

If it works on other FASTA files, I would guess there is something wrong with the FASTA file you are working with. Did you check if your FASTA file has sequences? Did you check if it is formatted correctly?

The error message, Illegal division by zero at /home/aubhxz/Count_fasta.pl line 103, makes me think something is wrong with your input file.
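One way to act on the checks suggested above is sketched below in Python. It is an illustration under assumptions, not part of Count_fasta.pl: the function name inspect_fasta and the report keys are made up for this example. It counts header lines and residues, and it also reports line-ending styles, because a file saved with carriage-return-only line endings looks like one long line to a tool that splits on newlines. Whether that is what is wrong with this particular file cannot be told from the thread. Judging by the output in the question, where the run stops right after "Total GC count" and before any "GC %" line, the division by zero appears to happen when a percentage is computed from a total length of 0.

```python
def inspect_fasta(handle):
    """Summarise a FASTA stream opened in binary mode.

    Hypothetical helper for illustration; not taken from Count_fasta.pl.
    """
    data = handle.read()
    crlf = data.count(b"\r\n")
    report = {
        "bytes": len(data),
        "crlf": crlf,                          # Windows line endings
        "lone_cr": data.count(b"\r") - crlf,   # CR-only (old Mac) endings
        "lone_lf": data.count(b"\n") - crlf,   # Unix line endings
    }

    # bytes.splitlines() treats \n, \r\n and a lone \r as line breaks,
    # so this view is independent of how the file was saved.
    lines = [line.strip() for line in data.splitlines()]
    report["headers"] = sum(1 for line in lines if line.startswith(b">"))
    report["residues"] = sum(
        len(line) for line in lines if line and not line.startswith(b">")
    )

    # What a reader that splits only on "\n" would see as header lines.
    report["lf_headers"] = sum(
        1 for line in data.split(b"\n") if line.startswith(b">")
    )
    return report
```

To try it on a real file, open the file in binary mode, for example `with open("file.fa", "rb") as fh: print(inspect_fasta(fh))`. If `headers` is large but `lf_headers` is much smaller, or `lone_cr` is nonzero, line endings are worth a closer look; if `headers` or `residues` is 0, the file holds no usable sequence records.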

written 2.1 years ago by sridhar56100

I can't find any error in the FASTA file. Here is the file link: https://filemover.auburn.edu/files/files/1510608526_re69allokay.fa Could you please have a look? How can I check whether it is formatted correctly? Thank you so much.

written 2.1 years ago by HZZ0036

It seems to be working fine for me. I took a subset of the file (your example file is huge) and ran the program you mentioned. You should probably save your file again and see if it works.

300:399     17
400:499     24
500:599     35
600:699     27
700:799     42
800:899     28
900:999     24
1000:1099   21
1100:1199   13
1200:1299   12
1300:1399   7
1400:1499   7
1500:1599   0
1600:1699   1
1700:1799   1
1800:1899   1
1900:1999   1
2000:2099   3
2100:2199   1
2200:2299   0
2300:2399   1
2400:2499   1
2500:2599   0
2600:2699   0
2700:2799   1

Total length of sequence:   226219 bp
Total number of sequences:  268
N25 stats:          25% of total sequence length is contained in the 37 sequences >= 1201 bp
N50 stats:          50% of total sequence length is contained in the 91 sequences >= 914 bp
N75 stats:          75% of total sequence length is contained in the 161 sequences >= 716 bp
Total GC count:         120188 bp
GC %:               53.13 %
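
For readers unsure how to read the N25/N50/N75 and GC % lines above, here is a minimal Python sketch of the usual calculation. It is an assumption about the general technique, not a transcription of Count_fasta.pl, and the function names nx_stats and gc_percent are invented for this example. Both functions return None for empty input instead of dividing by zero.

```python
def nx_stats(lengths, fraction=0.5):
    """Return (count, length): the `count` longest sequences, each at least
    `length` bp long, together hold at least `fraction` of the total length.

    Returns None when there is no sequence data at all.
    """
    total = sum(lengths)
    if total == 0:
        return None
    running = 0
    # Walk from the longest sequence down, accumulating length.
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if running >= fraction * total:
            return count, length


def gc_percent(gc_count, total_length):
    """GC content as a percentage, or None if the total length is 0."""
    if total_length == 0:
        return None
    return 100.0 * gc_count / total_length
```

With the totals printed above, `gc_percent(120188, 226219)` rounds to 53.13, matching the GC % line. The exact tie-breaking and boundary rules of the Perl script may differ from this sketch.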
modified 2.1 years ago • written 2.1 years ago by sridhar56100

Thank you. It worked for a subset of the file, but it still doesn't work on the whole file. I haven't run into this situation before. It's very strange.

written 2.1 years ago by HZZ0036