Question

Genome wide variation between mutant and wild type - is it significant

7.8 years ago

Jan Hapala • 0

My task is to compare a yeast wt and a mutant sample. The mutant is known to bear a mutation at a specific locus. The question is: Aside from this known mutation, are the samples significantly different?

(In the ideal case, there would be no other difference and we could say that the difference in phenotype is due to the single locus only. However, can we really expect a zero mutation rate? Hardly.)

This is my first variant analysis and I'm not sure where to go from here. I have produced a VCF file (code at the end of the post). But what would be a convincing result here? Statistics on the different kinds of variants between the samples? Compared to what?

The only reasonable step I can see is to focus on protein-coding genes and check for potential amino-acid-changing mutations. But this would be a crude simplification.

VCF calculation:

# samtools mpileup -uf reference.fasta wildtype.sorted.bam mutant.sorted.bam | bcftools view -bvcg - > mutant-vs-wildtype.var.raw.bcf
# bcftools view mutant-vs-wildtype.var.raw.bcf | vcfutils.pl varFilter -D100 > mutant-vs-wildtype.var.flt.vcf

SNP genome-wide yeast variation • 1.9k views

updated 7.8 years ago by swbarnes2 14k • written 7.8 years ago by Jan Hapala • 0

score 1 · Answer 1 · 2016-07-25

I think the crude simplification is the best you can do. (Another thing you could try is to BLAST those other mutations against nr, or against some database of other yeast strains. If the mutations are present in those strains, and those strains are phenotypically normal, then the mutation likely doesn't do anything.)
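To get at "those other mutations" in the first place, you would want only the sites where the two samples' genotypes differ from each other, since a variant that both wildtype and mutant carry relative to the reference is not a difference between them. Here is a minimal Python sketch of that filter. It assumes a two-sample VCF with a GT entry in the FORMAT column and the wildtype sample listed first (as in the mpileup command in the question); the function names and the example records are made up for illustration.

```python
def parse_gt(sample_field, gt_index):
    """Return the genotype as a sorted tuple of allele indices, or None if missing."""
    parts = sample_field.split(":")
    if gt_index >= len(parts):
        return None
    alleles = parts[gt_index].replace("|", "/").split("/")
    if "." in alleles:
        return None
    return tuple(sorted(alleles))


def discordant_sites(vcf_lines, sample_a=0, sample_b=1):
    """Yield (chrom, pos, ref, alt, gt_a, gt_b) where the two genotypes differ."""
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 10 + max(sample_a, sample_b):
            continue  # not enough sample columns
        fmt = cols[8].split(":")
        if "GT" not in fmt:
            continue
        gt_index = fmt.index("GT")
        gt_a = parse_gt(cols[9 + sample_a], gt_index)
        gt_b = parse_gt(cols[9 + sample_b], gt_index)
        if gt_a is None or gt_b is None:
            continue  # no call in one sample: cannot compare
        if gt_a != gt_b:
            yield cols[0], int(cols[1]), cols[3], cols[4], gt_a, gt_b


# Hypothetical records, not real data.
example_vcf = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\twildtype\tmutant",
    "chrI\t100\t.\tA\tG\t50\t.\t.\tGT:PL\t1/1:255,30,0\t1/1:255,30,0",  # shared
    "chrI\t200\t.\tC\tT\t60\t.\t.\tGT:PL\t0/0:0,30,255\t1/1:255,30,0",  # differs
    "chrII\t300\t.\tG\tA\t20\t.\t.\tGT:PL\t./.:0,0,0\t0/1:50,0,50",     # missing
]

differences = list(discordant_sites(example_vcf))
for site in differences:
    print(site)
```

On a real file you would pass the open file handle instead of the list. Sites with a missing call in either sample are dropped here rather than counted as differences, which is a deliberate but debatable choice; low-coverage regions deserve a separate look.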

There is no objective metric by which you can say "This strain is significantly different". And there is no way to just look at a SNP and know how profound the consequences will be for the protein, or the organism. You just can't answer this conclusively in silico. People at the bench could correct the mutation and see how the corrected yeast functions in whatever assays they care to try. That's the only way to be sure.

The best you can say is that you don't predict any significant impairment of this or that pathway, based on a lack of large amino-acid-changing mutations in this or that set of genes. But you can't be sure based on sequence data alone.
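For the "amino acid changing" check, the core step is just translating the reference codon and the mutated codon and comparing them. A rough Python sketch, assuming the standard genetic code (fine for yeast nuclear genes, not for mitochondrial ones) and assuming you already know the codon and the SNP's position within it, both given on the coding strand. The function name `classify_snp` is made up for illustration; in practice an annotation tool would work this out from a gene annotation for you.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(codon, pos_in_codon, alt_base):
    """Classify a single-base change inside a codon.

    codon        -- reference codon on the coding strand, e.g. "GAA"
    pos_in_codon -- 0, 1 or 2
    alt_base     -- substituted base on the coding strand
    Returns (ref_aa, alt_aa, kind).
    """
    codon = codon.upper()
    mutated = codon[:pos_in_codon] + alt_base.upper() + codon[pos_in_codon + 1:]
    ref_aa = CODON_TABLE[codon]
    alt_aa = CODON_TABLE[mutated]
    if ref_aa == alt_aa:
        kind = "synonymous"
    elif alt_aa == "*":
        kind = "nonsense"
    elif ref_aa == "*":
        kind = "stop-lost"
    else:
        kind = "missense"
    return ref_aa, alt_aa, kind


print(classify_snp("GAA", 0, "T"))  # GAA -> TAA
print(classify_snp("CTT", 2, "C"))  # CTT -> CTC
print(classify_snp("GAA", 2, "T"))  # GAA -> GAT
```

This only tells you whether the protein sequence changes, not how much the change matters, so it supports the cautious statement above rather than replacing it.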