Question: De novo mutation definition
I'm new to bioinformatics. I'm looking for an exact definition of a de novo mutation in trios. I've read the definition below, but it doesn't cover what I'm looking for (or I didn't understand it): "A genetic alteration that is present for the first time in one family member as a result of a variant (or mutation) in a germ cell (egg or sperm) of one of the parents, or a variant that arises in the fertilized egg itself during early embryogenesis. Also called de novo variant, new mutation, and new variant."

For example, if the child is A|T, the father is C|C, and the mother is A|T, would it be a de novo mutation or not? Does the variant have to be completely different from both parents' alleles to be considered a de novo mutation?

Kevin Blighe
You can think of it as a new mutation that is introduced into the germline of a particular family. As indicated in the text that you have quoted, these mutations occur in the sperm and/or egg cells. The de novo mutation will then become 'fixed' (propagated) in the family lineage if the egg or sperm cell that contains the mutation takes part in fertilisation to form an embryo.

You have a father with C|C and mother with A|T - the child is A|T. This could be a case of a de novo mutation; however, there are other possibilities:

  • sequencing error
  • alignment error
  • variant calling error
  • uniparental disomy (UPD), where the child inherited both copies of the chromosome (or chromosomal region) from the mother

There are likely many more possibilities. If it is UPD, then other nearby variants likely exhibit the same phenomenon.

It would greatly help if you mentioned the chromosome in which this variant is located. The patterns of inheritance are different between autosomes (non-sex) and allosomes (sex chromosomes). Let's not forget mitochondria, too.


Thanks for your answer. This was just an example to explain what my problem is. In a VCF file, I am looking for de novo mutations in a particular family, and I have counted 760 mutations that were seen in the child and were not inherited from the parents, just like in the example above. Because de novo mutations are rare, I suspect that I am counting mutations that should not be considered de novo. I should mention that the genotypes of the samples are unphased and that the samples are healthy. In this case, what should I do?

Have you pre-filtered the VCF for low read depth and based on the QUAL score? A lot of the variants in the child that are not observed in the parents may be from sequencing errors. How have you produced the VCF?
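As a rough illustration of that kind of pre-filtering, here is a minimal Python sketch that keeps a site only if all three trio members meet a read-depth and genotype-quality threshold. It assumes the VCF carries per-sample `DP` and `GQ` values in the FORMAT column, which not every file does. The function names, the column positions of the trio, and the thresholds (10 and 20) are placeholders for illustration, not recommended values.

```python
def sample_fields(format_col, sample_col):
    """Map FORMAT keys (e.g. GT:DP:GQ) to one sample's values."""
    return dict(zip(format_col.split(":"), sample_col.split(":")))


def passes(fields, min_dp=10, min_gq=20):
    """True only if DP and GQ are present and meet the thresholds."""
    try:
        return int(fields["DP"]) >= min_dp and float(fields["GQ"]) >= min_gq
    except (KeyError, ValueError):
        # Missing key or a "." placeholder: treat the genotype as unreliable.
        return False


def trio_passes(vcf_line, trio_cols=(9, 10, 11), **thresholds):
    """Check one tab-separated VCF data line.

    trio_cols are the 0-based column indices of the three samples
    (assumed here to be the first three sample columns).
    """
    cols = vcf_line.rstrip("\n").split("\t")
    return all(
        passes(sample_fields(cols[8], cols[i]), **thresholds)
        for i in trio_cols
    )


good = "1\t100\t.\tC\tA\t.\tPASS\t.\tGT:DP:GQ\t0/1:30:99\t0/0:25:60\t0/0:28:70"
low_depth = "1\t200\t.\tC\tA\t.\tPASS\t.\tGT:DP:GQ\t0/1:30:99\t0/0:4:60\t0/0:28:70"
print(trio_passes(good), trio_passes(low_depth))
```

If the file only contains `GT` in the FORMAT column, every site fails this check, which is a sign that depth and quality filtering has to be done from another source (for example, the original alignments).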

The QUAL column is ".", so there is no information. No, I did not produce the VCF file; it is a VCF file from the 1000 Genomes Project data set. Here is the link:

I have searched for a clearer definition of de novo mutation. I think a mutation that is not present in either of the parents should be considered a de novo mutation. For example, child A|T, father G|G, mother C|C is de novo, but the example that I mentioned in my question is not. Please correct me if I am wrong.

In your original question, you had:

  • father, C|C
  • mother, A|T
  • child, A|T

This is consistent with the child having obtained both alleles from the mother (uniparental disomy). The possibility also exists that the child inherited a C from the father that was de novo mutated into an A or T, or that the father is not the biological father.

In your recent example, you have:

  • father, G|G
  • mother, C|C
  • child, A|T

Here, either these parents are not the biological parents of the child, or the egg and the sperm that fused were both de novo mutated at this position.
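The reasoning in these two examples can be sketched in a few lines of Python. This is only an illustration for unphased, diploid, autosomal genotypes: it counts the smallest number of child alleles that cannot be explained by taking one allele from each parent. The function names are made up for this sketch, and a non-zero count only flags a Mendelian inconsistency; it does not distinguish a true de novo mutation from a genotyping error, UPD, or non-paternity.

```python
from itertools import permutations


def parse(gt):
    """Split 'A|T' or 'A/T' into a pair of alleles; phase is ignored."""
    return tuple(gt.replace("|", "/").split("/"))


def min_unexplained_alleles(child, father, mother):
    """Smallest number of child alleles not found in the assigned parent.

    Tries both assignments (which child allele came from which parent)
    and returns the best case: 0 means Mendelian-consistent.
    """
    best = 2
    for from_father, from_mother in permutations(child, 2):
        mismatches = (from_father not in father) + (from_mother not in mother)
        best = min(best, mismatches)
    return best


# First example: one paternal allele is unexplained.
first = min_unexplained_alleles(parse("A|T"), parse("C|C"), parse("A|T"))
# Second example: neither child allele is present in either parent.
second = min_unexplained_alleles(parse("A|T"), parse("G|G"), parse("C|C"))
# A consistent trio for comparison: C from the father, A from the mother.
consistent = min_unexplained_alleles(parse("A|C"), parse("C|C"), parse("A|T"))
print(first, second, consistent)
```

Under this check, both examples are Mendelian-inconsistent; the first needs at least one unexplained allele and the second needs two.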

Thank you for your comment.

Regarding the VCF file, what should I do? A lot of the mutations reported in the file could be de novo, but this cannot be right because de novo mutations must be rare. The file does not have any QUAL values, so I cannot filter the calls on them.

And what are the allele frequencies of the variants that you have found?

I have counted the number of mutations in a number of trios. For example, sample X has 760 de novo mutations based on what we discussed. In the VCF file, the FILTER value for most of the positions is PASS. What could be wrong? I have checked the code, and everything is OK! The samples are healthy, and I am sure that the parents are the biological parents.

Is there any textbook explaining de novo mutation? I want to refer to it.

Thank you

You could just cite the NIH:

...or, go to a search engine and look for ncbi de novo mutation

