3 months ago
Ayish • 0
I have a large FASTA file containing both nucleotide and protein sequences. I need to separate the sequences into two files based on the type of sequence. Is there any Python module that can detect the sequence type?
Thanks in advance.
Do the sequence identifier lines tell you whether they're DNA or protein? Or are you going to have to guess, even for a peptide made up only of glycine, alanine, cysteine and threonine (G, A, C, T)?
Unfortunately, no. It would be guesswork, I think.
Here is a snippet in Python. Or you can try out the Biopython module too. But be aware this guesswork can go very wrong if you have IUPAC nucleotide symbols other than
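A minimal sketch of the guessing approach discussed in this thread, using only the standard library. It is not a definitive method: the letter set `ACGTUN`, the 0.9 cutoff, and the names `read_fasta`, `guess_type` and `split_fasta` are all assumptions made for illustration. A record is called nucleotide when most of its letters fall in that set, and protein otherwise.

```python
from typing import Iterator, TextIO, Tuple

# Assumption: letters counted as "nucleotide-like". Other IUPAC ambiguity
# codes (R, Y, S, W, K, M, ...) are deliberately left out because they are
# also valid amino acid letters.
NUCLEOTIDE_LETTERS = set("ACGTUN")
# Assumption: arbitrary cutoff; tune it for your own data.
THRESHOLD = 0.9


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an open FASTA file."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def guess_type(seq: str) -> str:
    """Guess 'nucleotide' or 'protein' from letter composition alone."""
    letters = [c for c in seq.upper() if c.isalpha()]
    if not letters:
        # Empty records cannot be classified; this default is arbitrary.
        return "nucleotide"
    hits = sum(c in NUCLEOTIDE_LETTERS for c in letters)
    return "nucleotide" if hits / len(letters) >= THRESHOLD else "protein"


def split_fasta(in_path: str, nuc_path: str, prot_path: str) -> None:
    """Write each record to the nucleotide or protein file based on the guess."""
    with open(in_path) as src, open(nuc_path, "w") as nuc, open(prot_path, "w") as prot:
        for header, seq in read_fasta(src):
            out = nuc if guess_type(seq) == "nucleotide" else prot
            out.write(f">{header}\n{seq}\n")
```

Calling `split_fasta("mixed.fasta", "nucleotide.fasta", "protein.fasta")` (hypothetical file names) would write the two output files. The caveats raised above still apply: a peptide made only of G, A, C and T is called nucleotide, and a DNA record rich in ambiguity codes can be called protein. If Biopython is installed, `Bio.SeqIO.parse(handle, "fasta")` could replace the hand-written `read_fasta`.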