Hi, first post here. I'm trying to extract the CDS from orthologous sequences of various species. I'm working on a Linux server, and am mainly aiming to use BioPython or standard Linux tools for this.
I've run OrthoFinder on 28 species of seaweed, which produced roughly 10,000 orthogroup sequence files, each of which is a multi-FASTA file. I've concatenated all of them into one huge multi-FASTA file, and now I want to split the sequences by species into new multi-FASTA files (so 10k files -> 1 file -> 28 files, one per species).
How do I do this? I'm still fairly new to BioPython, so I'm still wrapping my head around things. I know I'll definitely need SeqIO, but I'm not sure what other libraries I'll need. I already have a text file with all the species listed, one per line.
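To make the question more concrete, here is a rough sketch of what I'm picturing. It is untested on my real data, and it rests on a big assumption: that every record ID starts with the species name exactly as it appears in my species list (e.g. `Ulva_lactuca|gene1`), with no spaces in the name. I haven't confirmed that my headers actually look like that. The function names and file names are just placeholders I made up.

```python
from pathlib import Path


def load_species(path):
    """Read species names, one per line, skipping blank lines."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def match_species(record_id, species_names):
    """Return the longest species name that record_id starts with, else None.

    ASSUMPTION: each record ID begins with its species name
    (e.g. 'Ulva_lactuca|gene1'). Adjust this if the headers differ.
    """
    hits = [name for name in species_names if record_id.startswith(name)]
    return max(hits, key=len) if hits else None


def split_by_species(fasta_path, species_path, out_dir):
    """Write one multi-FASTA per species; return the record count per species."""
    # Biopython is imported here so the helpers above work without it.
    from Bio import SeqIO

    species_names = load_species(species_path)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # One open output handle per species (28 files is well within OS limits).
    handles = {name: open(out_dir / f"{name}.fasta", "w") for name in species_names}
    counts = dict.fromkeys(species_names, 0)
    try:
        # SeqIO.parse streams records, so the huge input is never fully in memory.
        for record in SeqIO.parse(fasta_path, "fasta"):
            name = match_species(record.id, species_names)
            if name is not None:  # records matching no species are skipped
                SeqIO.write(record, handles[name], "fasta")
                counts[name] += 1
    finally:
        for handle in handles.values():
            handle.close()
    return counts
```

I'd then call it with something like `split_by_species("all_orthogroups.fasta", "species.txt", "by_species")` and check the returned counts to see whether any species ended up with zero records, which would suggest the prefix assumption is wrong.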
Thanks heaps for any help. Lachlan