Question: Possible bug in seqkit?
lakhujanivijay wrote:

Tool used: seqkit

Dummy fasta file (fasta.fa):

>test1
GCATCGATCAGCTACGATCATCACTA
GNNNNNNTACATCAGCACTACATCACTNNNNN
>test2
GTACGCTACGANNNGCTACGACTACGATATATATATATATATATATATATATATATATATATAT
GCTACGATCACNTACATCGACTA
>test3
GTGTGCTACATCATCACTACGTACTACAT
>test4
AA

Command:

./seqkit stat fasta.fa

Output:

file      format  type  num_seqs  sum_len  min_len  avg_len  max_len
fasta.fa  FASTA   DNA          4      176        0       44       87

Problem: min_len = 0, but the minimum length should be 2 (sequence ID "test4").

Validation using seqkit:

Command:

./seqkit fx2tab -l fasta.fa

Output:

test1   GCATCGATCAGCTACGATCATCACTAGNNNNNNTACATCAGCACTACATCACTNNNNN      58
test2   GTACGCTACGANNNGCTACGACTACGATATATATATATATATATATATATATATATATATATATGCTACGATCACNTACATCGACTA     87
test3   GTGTGCTACATCATCACTACGTACTACAT       29
test4   AA      2

Notice: length of sequence test4 is "2"
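
As an independent cross-check of the numbers above, the same summary can be recomputed without seqkit. The following is a minimal sketch, not seqkit's implementation: the function names read_fasta and fasta_stats are made up for illustration, and it assumes a plain FASTA file in which sequences may wrap over several lines.

```python
from typing import Dict, Iterable, Iterator, List, Tuple, Union


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (sequence_id, sequence) pairs, joining wrapped sequence lines."""
    seq_id = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if seq_id is not None:
                yield seq_id, "".join(chunks)
            parts = line[1:].split()
            seq_id = parts[0] if parts else ""
            chunks = []
        elif seq_id is not None:
            chunks.append(line)
    if seq_id is not None:
        yield seq_id, "".join(chunks)


def fasta_stats(lines: Iterable[str]) -> Dict[str, Union[int, float]]:
    """Compute num_seqs, sum_len, min_len, avg_len and max_len for a FASTA input."""
    lengths = [len(seq) for _, seq in read_fasta(lines)]
    if not lengths:
        return {"num_seqs": 0, "sum_len": 0, "min_len": 0, "avg_len": 0.0, "max_len": 0}
    total = sum(lengths)
    return {
        "num_seqs": len(lengths),
        "sum_len": total,
        "min_len": min(lengths),
        "avg_len": total / len(lengths),
        "max_len": max(lengths),
    }
```

Calling fasta_stats(open("fasta.fa")) should give a min_len that agrees with the per-sequence lengths reported by fx2tab, which makes it easy to tell whether a suspicious value comes from the data or from the tool.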

Is this a bug, or have I misunderstood something?

PS: I am loving this tool (all thanks to Wei Shen) and am trying to use its utilities to build a new tool!

modified 2.5 years ago • written 2.5 years ago by lakhujanivijay

You might have better luck posting a bug report on the GitHub repo.

written 2.5 years ago by Devon Ryan

Oh my dear friend, it's shenwei, or Wei Shen. In Chinese, the family name (Shen) comes before the given name (Wei), so my social media ID is shenwei*.

written 2.5 years ago by shenwei356

Oh my dearest friend! Thanks for the information, but I just wanted to highlight your username.

Many thanks for your prompt attention!!

PS: I just edited my post :)

written 2.5 years ago by lakhujanivijay

John wrote:

For the data you posted, I get the correct result using version 0.3.4.1:

file  format  type  num_seqs  sum_len  min_len  avg_len  max_len
demo  FASTA   DNA          4      176        2       44       87
written 2.5 years ago by John

Sorry for that naive bug. It's fixed in the latest version (v0.4.3); please update.

Affected versions: v0.4.0, v0.4.1, v0.4.2

Please run "seqkit version" to check your version, and download from the GitHub page or homepage. Do not install or update using conda (latest there: v0.3.4.1), which is not maintained by me.
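
For anyone who wants to script that check, here is a rough sketch that tests a version string against the affected releases listed above. The helper names parse_version and is_affected are hypothetical, and the exact output format of "seqkit version" is not shown in this thread, so the code only assumes the text contains a dotted version number such as v0.4.2.

```python
import re
from typing import Optional, Tuple

# Releases named in this thread as reporting min_len incorrectly.
AFFECTED = {(0, 4, 0), (0, 4, 1), (0, 4, 2)}


def parse_version(text: str) -> Optional[Tuple[int, ...]]:
    """Extract the first dotted version number (e.g. 'v0.4.3') found in text."""
    match = re.search(r"v?(\d+(?:\.\d+)+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def is_affected(text: str) -> bool:
    """Return True if text names one of the affected releases."""
    return parse_version(text) in AFFECTED
```

Pass it whatever your installed binary prints; a True result means the min_len column should not be trusted until you update.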

modified 2.5 years ago • written 2.5 years ago by shenwei356

I'll update the bioconda recipe for seqkit to use the latest version.

written 2.5 years ago by Devon Ryan

Thanks, I'll learn to use bioconda :)

written 2.5 years ago by shenwei356

You'd be surprised how many people prefer to install stuff via bioconda. Anyway, the recipe has been updated and the new binaries should be available within the next hour or so (there's a queue on TravisCI at the moment).

written 2.5 years ago by Devon Ryan

Yup, I am using version v0.4.3. Results are fine for the same data. OP, which version are you using?

modified 2.5 years ago • written 2.5 years ago by venu

I moved to the latest version, seqkit v0.4.3! Thanks venu :)

written 2.5 years ago by lakhujanivijay