Question: Paternity Testing from WGS Trio
ClkElf (Istanbul Technical University, Turkey) wrote 22 months ago:

Hello all,

I am a newbie in bioinformatics and would like to process a trio (both parents and the index case) in order to understand the concept. Here is my question: how can I analyse a WGS trio in order to conduct paternity testing? Is it even possible to do this?

The question may sound stupid but I am very curious about it.

Many thanks!

paternity test dna-seq wgs trio • 951 views
Obi Griffith (Washington University, St Louis, USA) wrote 22 months ago:

It is definitely possible to assess paternity from whole genome sequence (WGS) data. Paternity can probably be established with as few as a few dozen, or perhaps a few hundred, well-chosen single nucleotide polymorphisms (SNPs). With decent WGS data you can expect to genotype millions of SNPs, so paternity assessment from such data would be very confident. What data do you have available? Assuming that you have raw sequence data (e.g., FASTQ or unaligned BAM files), you will first need to align it to an appropriate reference genome.
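To get a feel for why a few dozen SNPs can already be enough, here is a back-of-the-envelope sketch in Python. It assumes independent biallelic SNPs in Hardy-Weinberg equilibrium, no genotyping error, and it ignores the mother's genotype; the function name and the default allele frequency of 0.5 are illustrative choices, not part of any tool. It computes the chance that a random unrelated man is never an opposite homozygote to the child (e.g. man AA, child aa), which is the simplest kind of exclusion.

```python
def prob_not_excluded(n_snps: int, allele_freq: float = 0.5) -> float:
    """Chance that an unrelated man shows no opposite-homozygote conflict
    with the child at any of n_snps independent biallelic SNPs.

    Assumptions (for illustration only): Hardy-Weinberg proportions,
    the same allele frequency at every SNP, no genotyping errors.
    """
    p = allele_freq
    q = 1.0 - p
    # For an unrelated pair: P(man AA, child aa) + P(man aa, child AA)
    p_conflict = 2.0 * (p ** 2) * (q ** 2)
    return (1.0 - p_conflict) ** n_snps


for n in (20, 50, 100, 500):
    print(n, prob_not_excluded(n))
```

Under these assumptions the probability shrinks geometrically with the number of SNPs. Real data need some tolerance for genotyping errors, so in practice you would look at the rate of conflicts rather than demand exactly zero.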

There are several online tutorials to give you the general idea:

Note: Both of the above tutorials are a little out of date. Current best practice would be to use bwa mem (available with current bwa installations). See

Once you have aligned your data you will probably want to mark duplicate reads and perform base quality score recalibration (BQSR). For some sample commands taking you through bwa mem alignment, duplicate marking, and BQSR see here:
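As a rough sketch of what those three steps look like for one sample, the Python snippet below only builds the command lines and prints them; it does not run anything. It assumes GATK4-style syntax (`gatk MarkDuplicates`, `gatk BaseRecalibrator`, `gatk ApplyBQSR`) and a samtools version that can sort from standard input. All file names, the sample name, the read group values, and the function name are placeholders I made up, and the reference is assumed to be already indexed for bwa and GATK.

```python
import shlex


def preprocessing_commands(sample, ref, fq1, fq2, known_sites, threads=4):
    """Return shell command strings for alignment, duplicate marking and BQSR.

    Every file name here is a placeholder; adapt to your own data layout.
    """
    # bwa expects literal "\t" sequences in the read group string
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"
    return [
        # 1. Align with bwa mem and coordinate-sort the output
        f"bwa mem -t {threads} -R {shlex.quote(read_group)} {ref} {fq1} {fq2}"
        f" | samtools sort -@ {threads} -o {sample}.sorted.bam -",
        # 2. Mark duplicate reads
        f"gatk MarkDuplicates -I {sample}.sorted.bam -O {sample}.dedup.bam"
        f" -M {sample}.dup_metrics.txt",
        f"samtools index {sample}.dedup.bam",
        # 3. Base quality score recalibration (model, then apply)
        f"gatk BaseRecalibrator -R {ref} -I {sample}.dedup.bam"
        f" --known-sites {known_sites} -O {sample}.recal.table",
        f"gatk ApplyBQSR -R {ref} -I {sample}.dedup.bam"
        f" --bqsr-recal-file {sample}.recal.table -O {sample}.recal.bam",
    ]


for name in ("father", "mother", "child"):
    for cmd in preprocessing_commands(
        name, "ref.fa", f"{name}_1.fq.gz", f"{name}_2.fq.gz", "known_sites.vcf.gz"
    ):
        print(cmd)
```

Older GATK3 installations use a different invocation style, so check the syntax against the version you actually have installed.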

Next, you will want to run the GATK variant caller. For a trio analysis I suggest you try running GATK HaplotypeCaller in GVCF mode and then performing joint genotyping. See here for a tutorial on this topic:
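A minimal sketch of the GVCF workflow for a trio, again just building and printing command lines in Python rather than running them. It assumes GATK4-style syntax, with per-sample calling via `HaplotypeCaller -ERC GVCF`, merging with `CombineGVCFs`, and joint genotyping with `GenotypeGVCFs`. The BAM and VCF file names, the sample names, and the function name are placeholders.

```python
def calling_commands(samples, ref, cohort="trio"):
    """Return GATK4-style command strings for GVCF calling and joint genotyping.

    Assumes one analysis-ready BAM per sample named <sample>.recal.bam
    (a placeholder naming scheme).
    """
    cmds = []
    # One GVCF per sample
    for s in samples:
        cmds.append(
            f"gatk HaplotypeCaller -R {ref} -I {s}.recal.bam"
            f" -O {s}.g.vcf.gz -ERC GVCF"
        )
    # Merge the per-sample GVCFs, then genotype the trio jointly
    variants = " ".join(f"-V {s}.g.vcf.gz" for s in samples)
    cmds.append(f"gatk CombineGVCFs -R {ref} {variants} -O {cohort}.g.vcf.gz")
    cmds.append(f"gatk GenotypeGVCFs -R {ref} -V {cohort}.g.vcf.gz -O {cohort}.vcf.gz")
    return cmds


for cmd in calling_commands(["father", "mother", "child"], "ref.fa"):
    print(cmd)
```

The final `trio.vcf.gz` (placeholder name) is the multi-sample VCF that the relatedness step works from.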

This is all explained in great detail in the excellent GATK Best Practices for Variant Discovery workshops organized by the Broad. See

Finally, assuming you get through the above, you should have a VCF with genotype calls for millions of SNPs for your trio. You then need to look at SNP genotype concordance between individuals in your trio to estimate kinship. This is itself a complicated area of research that I am not very familiar with, but paternity should be one of the simpler relationships to prove. I believe the KING tool is popular for this and could take the above VCF as a starting point.
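To illustrate the simplest form of such a concordance check (this is not what KING does internally, just a sanity check you can run yourself): a true parent and child must share at least one allele at every site, so sites where the two genotypes have no allele in common should be close to zero apart from genotyping errors. The sketch below counts those sites from plain VCF text in Python. The function names and the toy records are made up for illustration, and it assumes diploid calls with GT as the first FORMAT field.

```python
import re


def parse_gt(sample_field):
    """Return the set of alleles in a diploid GT, or None if missing/not diploid."""
    gt = sample_field.split(":")[0]
    alleles = re.split(r"[/|]", gt)
    if len(alleles) != 2 or "." in alleles:
        return None
    return set(alleles)


def ibs0_counts(vcf_lines, sample_a, sample_b):
    """Count sites where two samples share no allele: (conflicts, sites compared)."""
    cols = None
    conflicts = compared = 0
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            cols = {name: i for i, name in enumerate(fields)}
            continue
        if cols is None:
            raise ValueError("no #CHROM header line seen before records")
        a = parse_gt(fields[cols[sample_a]])
        b = parse_gt(fields[cols[sample_b]])
        if a is None or b is None:
            continue  # skip sites with a missing call in either sample
        compared += 1
        if a.isdisjoint(b):
            conflicts += 1
    return conflicts, compared


# Toy records, invented purely for illustration
toy_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tfather\tmother\tchild",
    "1\t1000\t.\tA\tG\t50\tPASS\t.\tGT\t0/0\t0/1\t0/1",
    "1\t2000\t.\tC\tT\t50\tPASS\t.\tGT\t1/1\t0/0\t0/1",
    "1\t3000\t.\tG\tA\t50\tPASS\t.\tGT\t./.\t1/1\t0/1",
    "1\t4000\t.\tT\tC\t50\tPASS\t.\tGT\t0/1\t1/1\t1/1",
]
print(ibs0_counts(toy_vcf, "father", "child"))
```

For a real file you would pass an open file handle instead of the toy list (e.g. `gzip.open(path, "rt")` for a compressed VCF) and restrict to well-covered, PASS-filter SNPs before interpreting the conflict rate.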


Or use the implementation of the KING algorithm in VCFtools, via the --relatedness2 argument.
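For example, something like `vcftools --gzvcf trio.vcf.gz --relatedness2 --out trio` (the file names are placeholders) should write a tab-separated `trio.relatedness2` table with one row per pair of samples and a `RELATEDNESS_PHI` column holding the kinship coefficient. Below is a small Python sketch for reading that table and labelling each pair using the kinship cut-offs described in the KING documentation. The function names are my own and the numbers in the toy table are invented for illustration.

```python
import csv
import io


def classify_kinship(phi):
    """Map a KING kinship coefficient to a relationship class (KING cut-offs)."""
    if phi > 0.354:
        return "duplicate or monozygotic twin"
    if phi > 0.177:
        return "first-degree"  # parent-offspring or full siblings, expected ~0.25
    if phi > 0.0884:
        return "second-degree"
    if phi > 0.0442:
        return "third-degree"
    return "unrelated"


def read_relatedness2(handle):
    """Return (indv1, indv2, phi) for every pair of distinct samples."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        (row["INDV1"], row["INDV2"], float(row["RELATEDNESS_PHI"]))
        for row in reader
        if row["INDV1"] != row["INDV2"]
    ]


# Invented numbers, only to show the expected file layout
toy_table = io.StringIO(
    "INDV1\tINDV2\tN_AaAa\tN_AAaa\tN1_Aa\tN2_Aa\tRELATEDNESS_PHI\n"
    "father\tfather\t1000\t0\t1000\t1000\t0.5\n"
    "father\tmother\t400\t200\t1000\t1000\t0.003\n"
    "father\tchild\t500\t0\t1000\t1000\t0.248\n"
)
for a, b, phi in read_relatedness2(toy_table):
    print(a, b, phi, classify_kinship(phi))
```

Note that "first-degree" covers both parent-offspring and full siblings; a near-zero count of opposite homozygotes (the `N_AAaa` column) is what points to parent-offspring.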

written 22 months ago by Garan620
WouterDeCoster wrote 22 months ago:

Take a look at Peddy
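Peddy compares the relationships and sexes declared in a PED file against what it infers from the genotypes in the VCF, so you need a PED file describing the trio. A minimal Python sketch for writing one is below. The family and sample IDs are placeholders (the sample IDs must match those in your VCF), the function name is made up, and the sex of the child is something you have to supply yourself. Check the Peddy README for the exact command line.

```python
def trio_ped_lines(family, father, mother, child, child_sex):
    """Return PED lines for a trio.

    Columns: family ID, individual ID, father ID, mother ID,
    sex (1 = male, 2 = female), phenotype (-9 = missing).
    Founders get "0" for both parent IDs.
    """
    return [
        "\t".join([family, father, "0", "0", "1", "-9"]),
        "\t".join([family, mother, "0", "0", "2", "-9"]),
        "\t".join([family, child, father, mother, str(child_sex), "-9"]),
    ]


# Placeholder IDs; child_sex=2 declares a daughter
for line in trio_ped_lines("FAM1", "father", "mother", "child", 2):
    print(line)
```

If the declared father is not the biological father, the declared parent-child pair should show up as a mismatch between the pedigree and the inferred relatedness.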
