Check whether a fasta header has sequence underneath it
6.1 years ago
utkarsh.sood ▴ 40

Hello 

I have a .ffn file containing 1000 sequences. I want to check whether every FASTA header has a sequence underneath it, and I would also like to know which headers do not.

 

Thanks

myposts sequencing sequence • 1.1k views
6.1 years ago
mkulecka ▴ 320

BioPython is ideal for this.

from Bio import SeqIO

# Stream the records and print the id of any record whose sequence is empty
for record in SeqIO.parse("your_file.fasta", "fasta"):
    if len(record.seq) == 0:
        print(record.id)

This prints the id of every header that has no sequence underneath it.
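If BioPython is not available, the same check can be sketched with the standard library alone. This is only a rough illustration, not a full FASTA parser: the function name `headers_without_sequence` is made up for this example, and it assumes a plain-text file where every header line starts with ">" and blank lines carry no sequence. Unlike `record.id`, it reports the whole header line rather than just the first word.

```python
from typing import Iterable, List


def headers_without_sequence(lines: Iterable[str]) -> List[str]:
    """Return FASTA headers (without the leading '>') that have no sequence."""
    empty: List[str] = []
    header = None  # header currently being read
    seq_len = 0    # number of sequence characters seen for that header
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank lines do not count as sequence
        if line.startswith(">"):
            # A new header closes the previous record; report it if empty
            if header is not None and seq_len == 0:
                empty.append(header)
            header, seq_len = line[1:], 0
        elif header is not None:
            seq_len += len(line)
    # The last record has no following header, so check it separately
    if header is not None and seq_len == 0:
        empty.append(header)
    return empty


# Small in-memory example; the gene names are invented for illustration
example = [">gene1", "ATGC", ">gene2", ">gene3", "", ">gene4", "GGTA", "CC"]
print(headers_without_sequence(example))  # ['gene2', 'gene3']
```

To run it on a real file, pass an open file handle instead of the list, for example `with open("your_file.ffn") as handle: print(headers_without_sequence(handle))`, where the path is a placeholder.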


The key message here is that most commonly used languages have libraries for parsing FASTA; after that, it's just a matter of applying the appropriate method (checking the sequence length, in this case).
