Question: Creating a fasta filter by gene length
16 days ago by
adampepper3130 wrote:

Hi, I am working on a project and want to write a script that calculates the length of each gene in a FASTA file and outputs only the genes shorter than a threshold that I pass on the command line.

I am writing my code in nano and running it with python from the command line.

This is my code:

from Bio import SeqIO
for seq_record in SeqIO.parse(sys.argv[1], "fasta"):
   if str(len(seq_record)) < (sys.argv[2]):


However, I don’t seem to be getting the desired output.



I think you should use len(str(seq_record)), since you're interested in the length of a string.

written 16 days ago by Fatima830
16 days ago by
United States
Renesh1.9k wrote:

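Simply use len(seq_record) < int(sys.argv[2]) or len(seq_record.seq) < int(sys.argv[2]). Command-line arguments are always strings, so the threshold has to be converted to an integer before it is compared with a length.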
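To see why the original comparison misbehaves: it compares two strings, and Python orders strings character by character, not numerically. Here is a small sketch of the difference, using made-up values in place of sys.argv[2] and len(seq_record):

```python
# Command-line arguments always arrive as strings.
threshold_arg = "1000"   # stand-in for sys.argv[2] (illustrative value)
gene_length = 250        # stand-in for len(seq_record) (illustrative value)

# String comparison is lexicographic: "2" sorts after "1",
# so "250" < "1000" is False even though 250 is the smaller number.
as_strings = str(gene_length) < threshold_arg

# Integer comparison is numeric: 250 < 1000 is True.
as_integers = gene_length < int(threshold_arg)

print(as_strings, as_integers)
```

Note that in Python 3, comparing an int directly with a str (without int(...)) raises a TypeError instead of giving an answer.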

Alternatively, you can try bioinfokit in Python

from bioinfokit.analys import fasta
fasta_iter = fasta.fasta_reader(file='fasta_file')
for record in fasta_iter:
    header, sequence = record
    # gene length cut-off (desire_gene_length is an integer you set beforehand)
    if len(sequence) < desire_gene_length:
        print(header, sequence)

See more here
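For the Biopython route from the question, a complete script might look roughly like the sketch below. The function name shorter_than and the script layout are my own choices for illustration, not something from the original post. It assumes the FASTA path is the first argument and the length cut-off is the second.

```python
import sys


def shorter_than(records, max_length):
    """Yield only the records whose length is strictly below max_length."""
    for record in records:
        if len(record) < max_length:
            yield record


def main(argv):
    # Imported here so the helper above can be used without Biopython.
    from Bio import SeqIO

    fasta_path = argv[1]
    max_length = int(argv[2])  # convert the command-line string to an integer

    records = SeqIO.parse(fasta_path, "fasta")
    # Write the matching records to standard output in FASTA format.
    SeqIO.write(shorter_than(records, max_length), sys.stdout, "fasta")


# Only run when called with exactly two arguments: <fasta_file> <max_length>
if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv)
```

If the script were saved as, say, filter_fasta.py (a hypothetical name), it could be run as `python filter_fasta.py genes.fasta 500 > short_genes.fasta`.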
