Here's how I would do it with RTG Tools. This assumes your samples are named "father", "mother", "son" with their calls contained in block-compressed, tabixed VCFs named father.vcf.gz, mother.vcf.gz, son.vcf.gz respectively:
rtg vcfmerge father.vcf.gz mother.vcf.gz son.vcf.gz \
  --add-header "##PEDIGREE=<Child=son,Mother=mother,Father=father>" \
  --add-header "##SAMPLE=<ID=son,Sex=MALE>" \
  -o trio.vcf.gz

rtg mendelian -t /path/to/referencegenome.sdf --input trio.vcf.gz \
  --lenient --output-inconsistent trio-non-mendelian.vcf.gz
Adjust the sample names and the Sex value in the header lines if your sample names differ or your child is female. The reference genome is used to apply the correct Mendelian inheritance rules on the sex chromosomes, and is created as a one-off process via:
rtg format -o /path/to/referencegenome.sdf /path/to/referencegenome.fasta
(For typical human reference genomes the sex chromosomes will be automatically recognized by the format command)
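For a trio with other sample names, or with a daughter, only the two header lines passed to --add-header change. A minimal sketch of generating them, where `trio_headers` is a hypothetical helper name of my own and not part of RTG Tools:

```shell
# Hypothetical helper (not an RTG Tools command): print the two VCF header
# lines for one trio.
# Arguments: child, mother and father sample names, then the child's sex.
trio_headers() {
    child=$1; mother=$2; father=$3; sex=$4
    case "$sex" in
        MALE|FEMALE) ;;
        *) echo "sex must be MALE or FEMALE" >&2; return 1 ;;
    esac
    printf '##PEDIGREE=<Child=%s,Mother=%s,Father=%s>\n' "$child" "$mother" "$father"
    printf '##SAMPLE=<ID=%s,Sex=%s>\n' "$child" "$sex"
}

trio_headers daughter mother father FEMALE
# prints:
# ##PEDIGREE=<Child=daughter,Mother=mother,Father=father>
# ##SAMPLE=<ID=daughter,Sex=FEMALE>
```

Each printed line can then be given to its own --add-header option in the merge step.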
You should also be aware that when your samples have been called separately, the same variant can end up represented differently in each VCF (particularly complex variants involving indels or several variants in close proximity), which may make it look like there is Mendelian inconsistency when there actually is none.
The ideal solution to this is to jointly call all the members of your trio at once (preferably with a pedigree-aware caller like those in RTG Core) to ensure the variants are consistently represented across the trio. The next best solution is to use a Mendelian comparison tool that is aware of the representation issue, such as VBT. Failing that, you can apply external decomposition and normalization tools (there are many of these, included in tools such as bcftools) to the input VCFs prior to comparison.
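As a rough sketch of that last option, assuming bcftools and tabix are installed and you have the FASTA the samples were called against: `normalize_vcf` and the output file names are illustrative, and splitting multiallelic records with `-m -any` is one reasonable setting rather than a required one.

```shell
# Hypothetical helper: left-align indels and split multiallelic records in
# one VCF, then index the result.
# Usage: normalize_vcf <reference.fasta> <input.vcf.gz> <output.vcf.gz>
normalize_vcf() {
    ref=$1; in_vcf=$2; out_vcf=$3
    # -f: reference FASTA used for left-alignment
    # -m -any: split multiallelic sites into separate records
    # -Oz: write block-compressed VCF
    bcftools norm -f "$ref" -m -any -Oz -o "$out_vcf" "$in_vcf" &&
        tabix -p vcf "$out_vcf"   # tabix index for downstream tools
}
```

You would call it once per sample, for example `normalize_vcf /path/to/referencegenome.fasta father.vcf.gz father.norm.vcf.gz`, and then merge the normalized files instead of the originals. Normalization of this kind will not reconcile every difference in representation, which is why it ranks below the other two options.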