Question: Any packages to validate a FASTA file?
19 months ago, khaeuk80 wrote:

I am trying to write a function that takes a file and checks whether it is a valid FASTA file (for example, no leading tabs or spaces, the first character is '>', no empty lines between sequences, etc.).

I have tried using SeqIO.parse(filename, "fasta"), but it still accepted files that contained only the description line starting with '>' and no sequence.

I started coding this myself, but I was wondering whether there are other packages that check the validity of the FASTA format.

Thanks -

Tags: validation, python, fasta

You could check this with seqkit's seq command:

https://bioinf.shenwei.me/seqkit/usage/#seq

written 19 months ago by lakhujanivijay5.1k

If you need empty records to be treated as invalid, you could submit a pull request to Biopython.

written 19 months ago by mrals8950

You could subclass the relevant SeqIO classes and extend the sequence checks to cover empty sequences and similar cases.

written 19 months ago by Joe17k
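
As an alternative to extending SeqIO, the checks listed in the question can be sketched in plain Python using only the standard library. This is a minimal sketch, not part of Biopython or seqkit: the function name validate_fasta and the wording of the messages are hypothetical, and the rules it enforces (no leading whitespace, no empty lines, data must start with a '>' header, every header needs at least one sequence line) are assumptions taken from the question rather than a formal FASTA specification. It does not check the sequence alphabet.

```python
from typing import List


def validate_fasta(path: str) -> List[str]:
    """Return a list of problems found in the file; an empty list means it passed.

    Hypothetical helper: the rules below follow the question, not an official spec.
    """
    problems: List[str] = []
    header_line = None    # line number of the record currently being read
    has_sequence = False  # whether that record has at least one sequence line

    with open(path, encoding="utf-8") as handle:
        lines = handle.read().splitlines()

    # Tolerate extra newlines at the very end of the file.
    while lines and lines[-1] == "":
        lines.pop()

    if not lines:
        return ["file is empty"]

    for number, line in enumerate(lines, start=1):
        if line.strip() == "":
            problems.append(f"line {number}: empty line")
            continue
        if line[0] in " \t":
            problems.append(f"line {number}: leading whitespace")
            line = line.lstrip()
        if line.startswith(">"):
            # A new header closes the previous record; it must have had sequence.
            if header_line is not None and not has_sequence:
                problems.append(f"line {header_line}: header without sequence")
            header_line = number
            has_sequence = False
        elif header_line is None:
            problems.append(f"line {number}: sequence data before first '>' header")
        else:
            has_sequence = True

    # The last record is only closed by the end of the file.
    if header_line is not None and not has_sequence:
        problems.append(f"line {header_line}: header without sequence")
    return problems
```

Calling validate_fasta on a path returns an empty list for a file that passes these checks, so `not validate_fasta(path)` can serve as the true/false answer, while the returned messages point at the offending lines. A file containing only a '>' description line is reported as a header without sequence, which is the case described in the question.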