My task is to compare a yeast wild-type (wt) sample and a mutant sample. The mutant is known to carry a mutation at a specific locus. The question is: aside from this known mutation, are the samples significantly different?
(In the ideal case, there would be no other difference, and we could say that the difference in phenotype is caused by the single locus only. However, can we really expect a zero mutation rate? Hardly.)
This is my first variant analysis and I'm not sure where to go from here. I have produced a VCF file (commands at the end of the post). But what would be a convincing result here? Statistics on the different kinds of variants between the samples? Compared to what?
The only reasonable step I can see is to focus on protein-coding genes and check for mutations that could change an amino acid. But this would be a crude simplification.
VCF calculation:
# samtools mpileup -uf reference.fasta wildtype.sorted.bam mutant.sorted.bam | bcftools view -bvcg - > mutant-vs-wildtype.var.raw.bcf
# bcftools view mutant-vs-wildtype.var.raw.bcf | vcfutils.pl varFilter -D100 > mutant-vs-wildtype.var.flt.vcf
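As a first pass before any annotation, one option is to list only the sites where the two samples were given different genotype calls, since variants shared by both samples against the reference cannot explain a difference between them. Below is a minimal sketch, not a statistical test. It assumes the filtered VCF has exactly two sample columns (wildtype in column 10, mutant in column 11, matching the BAM order in the mpileup command) and that GT is the first FORMAT field. The function name `discordant_sites` is made up for illustration.

```shell
# List sites where wildtype and mutant have different called genotypes.
# Assumptions: two-sample VCF on stdin, wildtype in column 10, mutant in
# column 11, and GT is the first colon-separated FORMAT field.
# Usage: discordant_sites < mutant-vs-wildtype.var.flt.vcf > discordant.tsv
discordant_sites() {
  awk -F'\t' -v OFS='\t' '
    /^#/ { next }                          # skip header lines
    {
      split($10, wt, ":")                  # wt[1] holds the wildtype GT
      split($11, mut, ":")                 # mut[1] holds the mutant GT
      gsub(/\|/, "/", wt[1])               # treat phased and unphased alike
      gsub(/\|/, "/", mut[1])
      if (wt[1] ~ /\./ || mut[1] ~ /\./)   # skip sites with a missing call
        next
      if (wt[1] != mut[1])                 # keep only discordant sites
        print $1, $2, $4, $5, wt[1], mut[1]
    }'
}
```

The output columns are chromosome, position, REF, ALT, wildtype genotype, and mutant genotype. The known mutation should show up in this list, which gives a quick sanity check. Every other line is a candidate that still needs a look at depth and quality before it is believed.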
Right, exactly how I see it. Thank you for the tip on blasting the other strains!