Question: Looking for a way to find mutated genes in two subpopulations of bacteria
I am looking for a way to compare the genomes of 33 bacterial strains isolated from patients. I was given Illumina MiSeq 250 bp paired-end reads for 16 strains isolated from the microflora of symptomatic patients and 17 strains from unaffected patients. My goal is to identify mutated genes that explain why one group is able to dominate the microflora while the other is not. I assembled the reads into contigs using SPAdes and ordered the contigs using the Mauve Contig Mover. I also performed an alignment of the strains and some reference genomes using progressiveMauve.

However, I am stuck on how to analyse these draft genomes further. My question is: is there software or a script that I can use to scan these alignments for mutations (both SNPs and larger mutations) that occur in all or most of the strains of one group, but in few or none of the strains of the other group?

Please correct me if I am looking in completely the wrong direction; I am rather new to the field.

Thanks in advance!

I have used breseq and Pilon (I think the first one is better) to find mutations in a single sample compared to a reference. Both use read-mapping results to find small and large mutations.

But for your case, two groups with multiple samples, I have not tried it. You could call the mutations in every sample against the same reference and then do some intersection and subtraction between the two groups.
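
To make the intersection-and-subtraction idea concrete, here is a rough Python sketch. It assumes you have already turned each strain's mutation calls into a set of keys such as (reference position, reference allele, alternative allele), all called against the same reference so the coordinates are comparable. The function name, the thresholds and the toy data are hypothetical, and parsing the actual breseq or Pilon output is not shown.

```python
from collections import Counter


def group_specific_mutations(group_a, group_b, min_frac_in=0.8, max_frac_out=0.2):
    """Return mutations that are common in group_a but rare in group_b.

    group_a, group_b: dict mapping strain name -> set of mutation keys,
    e.g. (reference_position, ref_allele, alt_allele). Both groups must be
    non-empty. The thresholds are arbitrary illustration values.
    """
    # Each strain contributes a set, so a mutation is counted once per strain.
    counts_a = Counter(m for muts in group_a.values() for m in muts)
    counts_b = Counter(m for muts in group_b.values() for m in muts)
    n_a, n_b = len(group_a), len(group_b)

    hits = []
    for mutation, count_a in counts_a.items():
        frac_a = count_a / n_a
        frac_b = counts_b.get(mutation, 0) / n_b
        if frac_a >= min_frac_in and frac_b <= max_frac_out:
            hits.append((mutation, frac_a, frac_b))
    return sorted(hits)


# Toy data (made up): three strains per group.
symptomatic = {
    "S1": {(100, "A", "G"), (250, "C", "T")},
    "S2": {(100, "A", "G")},
    "S3": {(100, "A", "G"), (900, "G", "A")},
}
unaffected = {
    "U1": {(250, "C", "T")},
    "U2": set(),
    "U3": {(900, "G", "A")},
}

for mutation, frac_a, frac_b in group_specific_mutations(symptomatic, unaffected):
    print(mutation, f"{frac_a:.2f}", f"{frac_b:.2f}")
```

Using fractions rather than a strict intersection allows for "most of" the strains in a group, which also tolerates a few missed calls in draft-quality data. Swapping the two arguments gives the mutations enriched in the other group.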

Do you have the complete sequence of the reference?

Thanks for your reply. I do have the complete reference sequences of several comparable strains. Perhaps I could extract the core genomes of the two groups and compare them to find some leads. I'll have a look at breseq to see if I can apply it, or I could indeed try some intersection and subtraction of the individual results.

