I am working on a particular mouse strain. I aligned the reads, called variants, and annotated the variants against Ensembl gene models. I now have lists of genes with stop-gained and stop-lost variants.
Because we always compare the donor genome to the reference genome, a stop codon gained in the donor gene is equivalent to a stop codon lost in the reference gene. How can we know which genome carries the functional gene? It may be that the stop codon is at the right position in the donor genome, and a mutation in the reference gene produces a longer form of the same protein (let's assume the longer form is non-functional).
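To make the symmetry concrete, here is a minimal sketch of how the label depends purely on which genome is treated as the baseline. It assumes the standard genetic code; the function name `classify` and the example codons are hypothetical and are not taken from my actual annotation output.

```python
# Stop codons in the standard genetic code (assumption: nuclear genes, DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def classify(from_codon, to_codon):
    """Label a codon change relative to whichever genome is the baseline."""
    from_stop = from_codon.upper() in STOP_CODONS
    to_stop = to_codon.upper() in STOP_CODONS
    if to_stop and not from_stop:
        return "stop_gained"
    if from_stop and not to_stop:
        return "stop_lost"
    return "other"


# Hypothetical example: the reference has TGG (Trp), the donor has TGA (stop).
ref_codon, donor_codon = "TGG", "TGA"

# Reference as baseline: the donor appears to have gained a stop.
print(classify(ref_codon, donor_codon))

# Donor as baseline: the reference appears to have lost a stop.
print(classify(donor_codon, ref_codon))
```

The same pair of codons gets opposite labels depending on direction, so the annotation by itself does not say which genome carries the derived allele.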
Is there any way to tell whether the mutation is specific to the reference genome or to the donor genome?