Question: Paternity Testing from WGS Trio
Asked 10 days ago by ClkElf (Istanbul Technical University, Turkey):

Hello all,

I am new to bioinformatics and would like to process a trio (parents and index case) in order to understand the concept. Here is my question: how can I analyse a WGS trio in order to conduct paternity testing, and is this even possible?

The question may sound stupid but I am very curious about it.

Many thanks!

paternity test dna-seq wgs trio • 139 views
modified 9 days ago by RamRS • written 10 days ago by ClkElf

Take a look at Peddy
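
Peddy compares the relationships declared in a pedigree (PED) file with those it infers from the genotypes in a VCF, so a trio needs a PED file first. Here is a minimal sketch for writing one, assuming hypothetical sample IDs (`child`, `father`, `mother`), a hypothetical family ID `FAM1`, and a male, affected index. The IDs have to match the sample names in your VCF header.

```python
# Six-column PED layout: family, sample, father, mother, sex, phenotype.
# Sex: 1 = male, 2 = female. Phenotype: 1 = unaffected, 2 = affected.
# A parent ID of "0" marks a founder (no parent in the pedigree).

def trio_ped(family, child, father, mother, child_sex=1, child_affected=True):
    """Return PED text for one trio; all IDs here are illustrative."""
    rows = [
        (family, father, "0", "0", 1, 1),
        (family, mother, "0", "0", 2, 1),
        (family, child, father, mother, child_sex, 2 if child_affected else 1),
    ]
    return "\n".join("\t".join(str(x) for x in row) for row in rows) + "\n"


ped_text = trio_ped("FAM1", "child", "father", "mother")
print(ped_text, end="")
```

The resulting file is what you pass to Peddy together with the joint VCF of the trio; see Peddy's own documentation for the exact invocation.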

Reply written 10 days ago by WouterDeCoster

Answered 9 days ago by Obi Griffith (Washington University, St Louis, USA):

It is definitely possible to assess paternity from whole genome sequence (WGS) data. Paternity can probably be established with as few as a few dozen to a few hundred well-chosen single nucleotide polymorphisms (SNPs). With decent WGS data you can expect to genotype millions of SNPs, so a paternity assessment from such data would be very confident. What data do you have available? Assuming that you have raw sequence data (e.g., FASTQ or unaligned BAM files), you will first need to align to an appropriate reference genome.
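
To see why a few dozen SNPs can already be enough, here is a rough back-of-the-envelope sketch. It assumes independent biallelic SNPs in Hardy-Weinberg equilibrium, an allele frequency of 0.5, no genotyping error, and it ignores the mother's genotype, so it only counts sites where child and alleged father are opposite homozygotes (AA vs aa). Under those assumptions an unrelated man escapes exclusion at 50 SNPs with a probability of roughly 1e-3.

```python
def opposite_homozygote_prob(p):
    """P(two unrelated people are AA and aa, in either order) at one SNP."""
    q = 1.0 - p
    return 2.0 * p**2 * q**2


def non_exclusion_prob(p, n_snps):
    """P(an unrelated man shows no opposite-homozygote conflict at n SNPs)."""
    return (1.0 - opposite_homozygote_prob(p)) ** n_snps


# Chance that a random, unrelated man is NOT excluded as the father.
for n in (24, 50, 100, 200):
    print(n, f"{non_exclusion_prob(0.5, n):.2e}")
```

Real panels do better than this because the mother's genotype adds information, and worse because of genotyping error, which is why having millions of SNPs is so comfortable.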

There are several online tutorials to give you the general idea:

Note: both of the above tutorials are a little out of date. Current best practice is to use bwa mem (available with current bwa installations). See http://bio-bwa.sourceforge.net/bwa.shtml

Once you have aligned your data, you will probably want to mark duplicate reads and perform base quality score recalibration (BQSR). For sample commands that take you through bwa mem alignment, duplicate marking, and BQSR, see here: http://pmbio.org/module%202/0002/01/31/Alignment/

Next, you will want to call variants with GATK. For a trio analysis I suggest running GATK HaplotypeCaller in GVCF mode on each sample and then performing joint genotyping. See here for a tutorial on this topic: https://gatkforums.broadinstitute.org/gatk/discussion/7869/howto-discover-variants-with-gatk-a-gatk-workshop-tutorial

This is all explained in great detail in the excellent GATK Best Practices for Variant Discovery workshops organized by the Broad. See https://drive.google.com/drive/folders/1U6Zm_tYn_3yeEgrD1bdxye4SXf5OseIt
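
The calling steps above can be sketched as follows. This assumes GATK4 command syntax and hypothetical file names (`ref.fa`, `<sample>.bqsr.bam`, `trio.g.vcf.gz`, `trio.vcf.gz`). Alignment, duplicate marking and BQSR are taken as already done. The script only builds and prints the commands, it does not run them.

```python
import shlex


def trio_commands(ref, samples):
    """Build GATK4 command lines for per-sample GVCFs plus joint genotyping."""
    cmds, gvcfs = [], []
    for s in samples:
        gvcf = f"{s}.g.vcf.gz"
        gvcfs.append(gvcf)
        # Per-sample calling in GVCF mode on the recalibrated BAM.
        cmds.append(["gatk", "HaplotypeCaller", "-R", ref,
                     "-I", f"{s}.bqsr.bam", "-O", gvcf, "-ERC", "GVCF"])
    # Merge the three GVCFs into one file.
    combine = ["gatk", "CombineGVCFs", "-R", ref, "-O", "trio.g.vcf.gz"]
    for g in gvcfs:
        combine += ["-V", g]
    cmds.append(combine)
    # Joint genotyping gives one VCF with a genotype column per sample.
    cmds.append(["gatk", "GenotypeGVCFs", "-R", ref,
                 "-V", "trio.g.vcf.gz", "-O", "trio.vcf.gz"])
    return cmds


cmds = trio_commands("ref.fa", ["child", "father", "mother"])
for c in cmds:
    print(shlex.join(c))
```

Check the printed commands against the documentation of the GATK version you have installed before running anything, since older GATK3 releases use a different invocation style.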

Finally, assuming you get through the above, you should have a VCF with genotype calls for millions of SNPs for your trio. You then need to look at SNP genotype concordance between the individuals in your trio to estimate kinship. This is itself a complicated area of research that I am not very familiar with, but paternity should be one of the simpler relationships to prove. I believe the KING tool is popular for this and could take the above VCF as a starting point.

http://people.virginia.edu/~wc9c/KING/manual.html
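
Before reaching for a kinship tool, a quick sanity check is the Mendelian inconsistency rate: at each site, one of the child's alleles must be present in the father and the other in the mother. The sketch below works on GT strings as they appear in a VCF and uses made-up toy genotypes. It assumes diploid autosomal sites and is not a substitute for KING. In a true trio the rate should be low (genotyping errors and rare de novo variants), while a wrongly assigned father pushes it up sharply.

```python
def alleles(gt):
    """Split a VCF GT string such as '0/1' or '1|0'; None if missing or not diploid."""
    parts = gt.replace("|", "/").split("/")
    if len(parts) != 2 or "." in parts:
        return None
    return parts


def consistent_with_parents(child, father, mother):
    """True if one child allele can come from each parent; None if uninformative."""
    c, f, m = alleles(child), alleles(father), alleles(mother)
    if c is None or f is None or m is None:
        return None
    return (c[0] in f and c[1] in m) or (c[1] in f and c[0] in m)


def mendelian_error_rate(trio_genotypes):
    """trio_genotypes: iterable of (child, father, mother) GT strings."""
    checked = errors = 0
    for child, father, mother in trio_genotypes:
        ok = consistent_with_parents(child, father, mother)
        if ok is None:
            continue  # skip sites with a missing genotype
        checked += 1
        errors += int(not ok)
    return errors / checked if checked else float("nan")


# Toy (child, father, mother) genotypes, for illustration only.
toy = [
    ("0/1", "0/0", "1/1"),  # consistent
    ("0/1", "0/0", "0/0"),  # alt allele appears in neither parent
    ("1/1", "0/1", "0/1"),  # consistent
    ("0/0", "1/1", "0/1"),  # child and father are opposite homozygotes
    ("./.", "0/1", "0/1"),  # missing, skipped
]
print(mendelian_error_rate(toy))
```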


Or use the implementation of the KING algorithm in VCFtools, via the --relatedness2 option: https://vcftools.github.io/man_latest.html
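
To give a feel for what such a kinship statistic measures, here is a sketch of a simplified KING-style estimator: shared heterozygous sites minus twice the opposite-homozygote sites, divided by the total heterozygote count of the two samples. This is my reading of the basic form described in the KING paper, not necessarily the exact expression either tool computes, and the genotype vectors are invented. The cut-offs are the inference ranges given in the KING manual. A parent and child are expected to come out near 0.25.

```python
def king_kinship(g1, g2):
    """Kinship estimate from two lists of alt-allele counts (0, 1, 2; None = missing)."""
    het_het = opp_hom = het1 = het2 = 0
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue
        het1 += int(a == 1)
        het2 += int(b == 1)
        het_het += int(a == 1 and b == 1)   # both heterozygous
        opp_hom += int({a, b} == {0, 2})    # opposite homozygotes
    if het1 + het2 == 0:
        return float("nan")
    return (het_het - 2 * opp_hom) / (het1 + het2)


def relationship(phi):
    """Map a kinship estimate to the ranges listed in the KING manual."""
    if phi > 0.354:
        return "duplicate/MZ twin"
    if phi > 0.177:
        return "1st degree"
    if phi > 0.0884:
        return "2nd degree"
    if phi > 0.0442:
        return "3rd degree"
    return "unrelated"


# Invented alt-allele counts at six SNPs, for illustration only.
father = [1, 1, 0, 2, 1, 1]
child = [1, 0, 0, 1, 1, 2]
phi = king_kinship(father, child)
print(round(phi, 3), relationship(phi))
```

Parent-offspring pairs and full siblings both fall in the first-degree range. What sets a true parent-child pair apart is that opposite-homozygote sites should be essentially absent.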

Reply written 9 days ago by Garan