I have 900 assembled Salmonella Typhimurium ST313 genomes (FASTA files, average size 4.7 MB). Background information says ST313 has two lineages, an older one and a newer one, and that the newer lineage differs from the older one by 22 SNPs.
I need to work out which of my genomes belong to the newer lineage and which to the older one. How do I go about this?
I am working on Linux.
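To make the goal concrete, here is a rough sketch of the classification step I have in mind, not a working pipeline. It assumes each genome has already been aligned to a common reference and that the base at each lineage-defining position has been extracted into a dictionary. The positions and alleles in `LINEAGE_SNPS`, the `classify` function, and the `min_fraction` threshold are all placeholders I made up for illustration; the real list would hold the 22 SNPs.

```python
# Hypothetical lineage-defining SNPs (placeholders, not real ST313 positions):
# reference position -> (allele in the older lineage, allele in the newer lineage)
LINEAGE_SNPS = {
    1042: ("C", "T"),
    58731: ("G", "A"),
    230415: ("A", "G"),
}


def classify(calls, snps=LINEAGE_SNPS, min_fraction=0.8):
    """Assign one genome to a lineage from its bases at the marker positions.

    calls: dict mapping reference position -> observed base for this genome.
    Positions that are missing or carry neither allele are ignored.
    """
    old_hits = sum(1 for pos, (old, _) in snps.items() if calls.get(pos) == old)
    new_hits = sum(1 for pos, (_, new) in snps.items() if calls.get(pos) == new)
    informative = old_hits + new_hits
    if informative == 0:
        return "unknown"
    if new_hits / informative >= min_fraction:
        return "new"
    if old_hits / informative >= min_fraction:
        return "old"
    return "ambiguous"


if __name__ == "__main__":
    # Made-up example genomes, keyed by a hypothetical sample name.
    samples = {
        "sample_A": {1042: "T", 58731: "A", 230415: "G"},
        "sample_B": {1042: "C", 58731: "G", 230415: "A"},
    }
    for name, calls in samples.items():
        print(name, classify(calls))
```

What I am missing is the step before this: how to get the bases at those 22 positions out of 900 assembled FASTA files.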