I'm trying to filter my contigs dataset into separate files by minimum length, such as 500 bp, 1 kb, 2 kb, and so on. I'm using the code below to produce my output.
    from Bio import SeqIO

    def contigs_filter_by_length(fasta_input, size, fasta_output):
        long_contigs = []  # Create an empty list
        for record in SeqIO.parse(fasta_input, "fasta"):
            # Keep only records at least `size` bases long
            if len(record.seq) >= size:
                long_contigs.append(record)
        print("Found %i contigs" % len(long_contigs))
        SeqIO.write(long_contigs, fasta_output, "fasta")
The problem is that when I cross-checked the output of this code against the QUAST report for my input file, the two counts differed considerably. QUAST indicated that there are 119787 contigs >= 500 bp, while the FASTA output from the code contained 122046 contigs >= 500 bp.
Is there anything wrong in my code that could lead to this difference?
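One way to narrow this down is to count the contigs independently of Biopython and see which number it agrees with. Below is a minimal sketch of such a cross-check, assuming a plain-text (uncompressed) FASTA file. The function name `count_contigs_by_length`, the default thresholds, and the demo data are hypothetical and only for illustration; it is not meant to reproduce how QUAST counts contigs.

```python
import io

def count_contigs_by_length(handle, thresholds=(500, 1000, 2000)):
    """Count FASTA records whose sequence length is >= each threshold."""
    counts = {t: 0 for t in thresholds}
    length = None  # stays None until the first header line is seen

    def tally(n):
        # Add one to every threshold this record length satisfies
        for t in thresholds:
            if n >= t:
                counts[t] += 1

    for line in handle:
        line = line.strip()  # drop newlines and stray whitespace
        if line.startswith(">"):
            if length is not None:
                tally(length)  # close the previous record
            length = 0
        elif length is not None:
            length += len(line)  # sequences may span several lines
    if length is not None:
        tally(length)  # close the last record
    return counts

# Small in-memory demo (made-up data): lengths are 499, 500 and 1000
demo = io.StringIO(
    ">a\n" + "A" * 499 + "\n"
    ">b\n" + "C" * 300 + "\n" + "G" * 200 + "\n"
    ">c\n" + "T" * 1000 + "\n"
)
demo_counts = count_contigs_by_length(demo)
print(demo_counts)
```

To run it on a real file, open the file and pass the handle, for example `with open("contigs.fasta") as fh: print(count_contigs_by_length(fh))`, where `contigs.fasta` is a placeholder name. If this count matches the Biopython result, the filtering code itself is probably behaving as written, and the difference is more likely to come from how the two reports are being compared.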