Question: De novo variant detection in trios using only 3 gVCF files
sriniparth10 wrote:

Hello, I am working on de novo variant detection in trios. I am looking for genotypes in the child, homozygous or heterozygous, that carry an allele not present in the father or the mother.

So I would call a de novo mutation candidate in the following cases:

Child has genotype 0|1, 1|0 or 1|1 and both parents have 0|0
Child has genotype 1|0 or 1|1 and mother 0|0
Child has genotype 0|1 or 1|1 and father 0|0

Are there any other cases which indicate a denovo mutation which I missed so far?

Pierre Lindenbaum121k wrote:

Are there any other cases which indicate a denovo mutation which I missed so far?

The child carries a homozygous deletion, so it is called './.', while both parents are called '0/0': each of them carries the deletion in the heterozygous state, but the caller didn't detect the HET state.

manuel.belmadani870 wrote:

I don't think you can rely on the order of the alleles in a genotype call to tell which one came from the father and which from the mother; I'm not even sure 1/0 is a valid output. I would always compare with both parents and only consider:

Heterozygous de novo variant
Child = 0/1 | Mother = 0/0 | Father = 0/0

De novo homozygous variant
Child = 1/1 | Mother = 0/1 | Father = 0/0
Child = 1/1 | Mother = 0/0 | Father = 0/1

Child = 1/1 | Mother = 0/0 | Father = 0/0
This last one seems really unlikely, but I guess if you see a case like this you should count it too (or take a closer look at the genotype quality).
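To make the comparison against both parents concrete, here is a minimal Python sketch of the cases listed above. The function names and the returned labels are made up for illustration. It assumes diploid, biallelic calls, ignores allele order and phasing (so 1|0 is treated like 0/1), and does not look at genotype quality at all.

```python
def parse_gt(gt):
    """Return the allele indices of a GT string as a sorted tuple,
    or None if any allele is missing ('.')."""
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles:
        return None
    return tuple(sorted(int(a) for a in alleles))


def classify_de_novo(child, mother, father):
    """Label a trio genotype combination using the cases listed above.

    Hypothetical helper: the labels are illustrative only.
    """
    c, m, f = parse_gt(child), parse_gt(mother), parse_gt(father)
    if c is None or m is None or f is None:
        return "uncallable"
    # This sketch only covers diploid calls with alleles 0 and 1.
    if any(len(g) != 2 or max(g) > 1 for g in (c, m, f)):
        return "not handled"

    hom_ref, het, hom_alt = (0, 0), (0, 1), (1, 1)
    if c == het and m == hom_ref and f == hom_ref:
        return "heterozygous de novo"
    if c == hom_alt and {m, f} == {hom_ref, het}:
        return "homozygous de novo"
    if c == hom_alt and m == hom_ref and f == hom_ref:
        return "homozygous de novo, both parents 0/0 (check GQ)"
    return "no de novo"


if __name__ == "__main__":
    print(classify_de_novo("0/1", "0/0", "0/0"))
    print(classify_de_novo("1/1", "0/0", "0/1"))
```

Sorting the alleles is what removes the dependence on allele order; multi-allelic and missing genotypes are reported separately rather than guessed at.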

And by the way, it's possible that you have allele values other than 0 and 1: a . means no call (for example, insufficient coverage), and you can also have numbers >1 when a site has more than one alternative allele. I tallied the genotypes in the first million lines of a gVCF file and here's the summary I have:

 988916 0/0  
   6935 0/1  
     50 0/2  
      8 0/3  
   3833 1/1  
    115 1/2  
      9 1/3  
      8 2/2  
      5 2/3  
      1 3/4  
      1 4/5
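A tally like the one above can be reproduced with a few lines of Python. This is only a sketch, not the command that produced those numbers: it assumes a tab-separated VCF/gVCF with a FORMAT column containing GT, and the function name, `sample_index` and `max_records` are hypothetical. It counts data records rather than raw lines, so header lines do not count towards the limit.

```python
from collections import Counter
from itertools import islice


def count_genotypes(lines, sample_index=0, max_records=1_000_000):
    """Count GT strings for one sample over the first max_records records."""
    counts = Counter()
    # Skip meta-information and header lines, which start with '#'.
    records = (ln for ln in lines if not ln.startswith("#"))
    for line in islice(records, max_records):
        fields = line.rstrip("\n").split("\t")
        fmt_keys = fields[8].split(":")  # FORMAT is the 9th column
        if "GT" not in fmt_keys:
            continue
        sample = fields[9 + sample_index].split(":")
        counts[sample[fmt_keys.index("GT")]] += 1
    return counts


if __name__ == "__main__":
    demo = [
        "##fileformat=VCFv4.2",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tchild",
        "chr1\t100\t.\tA\t<NON_REF>\t.\t.\tEND=200\tGT:DP:GQ\t0/0:30:99",
        "chr1\t201\t.\tC\tT,<NON_REF>\t50\t.\t.\tGT:AD:GQ\t0/1:10,12,0:99",
    ]
    for gt, n in sorted(count_genotypes(demo).items()):
        print(f"{n:8d} {gt}")
```

For a compressed file, the same function can be fed from `gzip.open(path, "rt")`, where `path` is whatever your gVCF is called.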

Also, GATK has a more formal guide on calling de novo variants. You should probably try to use something like that, since it incorporates genotype quality: https://software.broadinstitute.org/gatk/documentation/article?id=11074

Step 3: Annotate possible de novo mutations
Tool used: VariantAnnotator
Using the posterior genotype probabilities, possible de novo mutations are tagged. Low confidence de novos have child GQ >= 10 and AC < 4 or AF < 0.1%, whichever is more stringent for the number of samples in the dataset. High confidence de novo sites have all trio sample GQs >= 20 with the same AC/AF criterion.
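As a rough illustration of the GQ thresholds quoted above (and not GATK's actual implementation), the tiers could be sketched in Python as below. The function name is hypothetical, and the AC/AF rarity criterion is deliberately left to the caller as the `is_rare` flag, because the quoted text does not spell out how "whichever is more stringent" is resolved for a given cohort size.

```python
def de_novo_confidence(child_gq, mother_gq, father_gq, is_rare):
    """Return 'high', 'low' or None for a candidate de novo site.

    is_rare: assumption of this sketch; the caller decides whether the
    site passes the AC/AF criterion described in the quoted guide.
    """
    if not is_rare:
        return None
    # High confidence: all three trio members have GQ >= 20.
    if min(child_gq, mother_gq, father_gq) >= 20:
        return "high"
    # Low confidence: only the child's GQ needs to reach 10.
    if child_gq >= 10:
        return "low"
    return None


if __name__ == "__main__":
    print(de_novo_confidence(99, 99, 99, is_rare=True))
    print(de_novo_confidence(15, 99, 99, is_rare=True))
```

This only makes sense on sites that already look de novo from the genotypes; it does not replace the genotype comparison itself.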
