I am very new to programming in Python. I have protein FASTA files for several plant species.
I would like to filter them by the number of amino acids each sequence contains. The criterion is to keep only sequences longer than 20 amino acids.
I am able to get the sequences longer than 20 amino acids using the resources in the Biopython cookbook. However, when I try to write them to a file, it gives me an error. I am unable to reproduce it. Moreover, I would also like to have the ID of each sequence in the output file. Please help me!

    import Bio
    from Bio import SeqIO
    for s_record in SeqIO.parse('arabidopsis_thaliana_proteome.ath.tfa','fasta'):
        name = s_record.id
        seq = s_record.seq
        seqLen = len(s_record)
        if seqLen >20:
            desired_proteins=seq
            output_file=SeqIO.write(desired_proteins, "filtered.fasta","fasta")
            output_file

Thank you in advance :)
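
For reference, here is a minimal sketch of one way the filtering and writing could be done. It rests on two assumptions about the code in the question. First, the error likely comes from passing a bare sequence (`s_record.seq`) to `SeqIO.write`, which expects whole `SeqRecord` objects. Second, calling `SeqIO.write` inside the loop reopens "filtered.fasta" on every pass, so each write replaces the previous one. The function name `filter_fasta` and its `min_len` parameter are hypothetical, introduced only for illustration.

```python
from Bio import SeqIO


def filter_fasta(in_path, out_path, min_len=20):
    """Write records longer than min_len residues to out_path; return the count."""
    # Generator expression: records are streamed one at a time instead of
    # being collected in a list first.
    long_records = (
        record
        for record in SeqIO.parse(in_path, "fasta")
        if len(record.seq) > min_len
    )
    # SeqIO.write takes SeqRecord objects, which carry the ID and description,
    # so the FASTA headers end up in the output. It is called once for the
    # whole file and returns the number of records written.
    return SeqIO.write(long_records, out_path, "fasta")
```

Calling `filter_fasta('arabidopsis_thaliana_proteome.ath.tfa', 'filtered.fasta')` should then write every sequence longer than 20 amino acids, together with its ID line, and return how many records were kept.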