Question: Add DP Tag to Genotype Field of VCF File
5.2 years ago by
jetsejacobi0 wrote:

Hello everyone,

I am using samtools mpileup for SNP calling, then BEAGLE for haplotyping, and finally GATK BeagleOutputToVCF to convert the BEAGLE output back to VCF format. Everything works fine, but I am missing one tag.

I want to add the DP tag to the genotype field of the VCF file. Is there an option in samtools mpileup, BEAGLE or GATK BeagleOutputToVCF that adds this information? Or do I have to use the tool mentioned in "incorporating raw read coverage per sample in merged vcf", which requires a lot of I/O and adds an extra step to my pipeline? Or is this information already somewhere in my VCF file? My VCF file looks like this:

SL2.40ch12      17      .       T       C       69.50   .       AC=1;AC1=1;AF=0.167;AF1=0.1766;AN=6;DP=39;DP4=10,7,2,3;FQ=70.3;MQ=46;NumGenotypesChanged=0;PV4=0.62,0.24,0.034,1;R2=0.922;RPB=5.484225e-01;VDB=3.008871e-02     GT:GQ:OG:PL     0|0:13:.:0,9,90 0|0:60:.:0,39,255       0|1:21:.:104,0,11
vcf • 3.8k views
modified 2.6 years ago by Pablo Marin-Garcia (1.8k) • written 5.2 years ago by jetsejacobi (0)

From your example, there is already a DP tag. DP = 39 here.

written 5.2 years ago by Jordan (1.0k)

That is the coverage summed over all samples; I need the coverage per sample.
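
To make the distinction concrete, here is a minimal Python sketch that splits the record from the question into its INFO and FORMAT parts. It assumes whitespace-separated columns in the usual VCF order; the function name `split_record` is made up for illustration.

```python
# The record from the question, with whitespace-separated columns.
RECORD = (
    "SL2.40ch12 17 . T C 69.50 . "
    "AC=1;AC1=1;AF=0.167;AF1=0.1766;AN=6;DP=39;DP4=10,7,2,3;FQ=70.3;MQ=46;"
    "NumGenotypesChanged=0;PV4=0.62,0.24,0.034,1;R2=0.922;"
    "RPB=5.484225e-01;VDB=3.008871e-02 "
    "GT:GQ:OG:PL 0|0:13:.:0,9,90 0|0:60:.:0,39,255 0|1:21:.:104,0,11"
)


def split_record(line):
    """Return (info_dict, format_keys, per_sample_dicts) for one VCF data line."""
    cols = line.split()
    # Columns: CHROM POS ID REF ALT QUAL FILTER INFO FORMAT sample1 sample2 ...
    info = {}
    for item in cols[7].split(";"):
        key, _, value = item.partition("=")
        info[key] = value
    format_keys = cols[8].split(":")
    # Pair each FORMAT key with the matching value in every sample column.
    samples = [dict(zip(format_keys, s.split(":"))) for s in cols[9:]]
    return info, format_keys, samples


info, format_keys, samples = split_record(RECORD)
print("INFO DP (summed over samples):", info["DP"])
print("FORMAT keys:", format_keys)
# s.get("DP") is None when the genotype field carries no DP tag.
print("Per-sample DP:", [s.get("DP") for s in samples])
```

For this record the INFO column carries DP=39, while the FORMAT keys are only GT, GQ, OG and PL, so no per-sample depth is stored.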

written 5.2 years ago by jetsejacobi (0)
5.2 years ago by
Bpow200
United States
Bpow200 wrote:

If your version of samtools is new enough (it's present at least in 0.1.18), you can provide the '-D' option to mpileup to get per-sample read depth of high-quality reads (DP in genotype field) and high-quality variant reads (DV in genotype field) (as opposed to the depth across samples, which is indicated by the DP field in the INFO field).
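
As a rough illustration, the sketch below assembles such a call as an argument list in Python. The file names are placeholders, the helper `legacy_mpileup_command` is hypothetical, and pairing `-D` with `-u` reflects my recollection that the 0.1.x help text requires BCF output (`-g` or `-u`) for `-D`; please check `samtools mpileup` on your own installation.

```python
import shlex


def legacy_mpileup_command(reference, bam_files):
    """Build an argv list for samtools 0.1.x mpileup with per-sample depth.

    -u  uncompressed BCF output (assumed to be required for -D)
    -D  per-sample DP in the genotype field
    -f  reference FASTA
    """
    return ["samtools", "mpileup", "-u", "-D", "-f", reference, *bam_files]


# Placeholder file names, not taken from the question.
cmd = legacy_mpileup_command("ref.fa", ["sample1.bam", "sample2.bam"])
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run`, with the BCF stream piped into the variant caller as in the existing pipeline.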


Thanks! This was exactly what I was looking for. I don't know why I couldn't find this option on my own. I read the manual again and saw that you were right!

written 5.2 years ago by jetsejacobi (0)
2.6 years ago by
Spain
Pablo Marin-Garcia1.8k wrote:

Just to keep this updated: from samtools 1.2 onwards, -D is deprecated; use the option --output-tags DP (short form -t DP) instead.

For samtools 1.2, mpileup -t accepts DP, DPR, DV, DP4, INFO/DPR and SP.

For 1.3, DP4 has been replaced by ADF (Allelic depths on the forward strand, FORMAT) and ADR (Allelic depths on the reverse strand, FORMAT).

Possible -t values for 1.3 (a comma-separated list of FORMAT and INFO tags to output, case-insensitive):

AD (Allelic depth, FORMAT),
INFO/AD (Total allelic depth, INFO), 
ADF (Allelic depths on the forward strand, FORMAT), 
INFO/ADF (Total allelic depths on the forward strand, INFO), 
ADR (Allelic depths on the reverse strand, FORMAT), 
INFO/ADR (Total allelic depths on the reverse strand, INFO), 
DP (Number of high-quality bases, FORMAT), 
DV (Deprecated in favor of AD; Number of high-quality non-reference bases, FORMAT), 
DPR (Deprecated in favor of AD; Number of high-quality bases for each observed allele, FORMAT), 
INFO/DPR (Number of high-quality bases for each observed allele, INFO), 
DP4 (Deprecated in favor of ADF and ADR; Number of high-quality ref-forward, ref-reverse, alt-forward and alt-reverse bases, FORMAT), 
SP (Phred-scaled strand bias P-value, FORMAT) [null]
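
A small Python sketch of how one might check a tag selection against the 1.3 list above before passing it to `-t`. The table is copied from that list; the helper name `output_tags_value` and its behaviour (upper-casing, dropping duplicates, collecting deprecation notes) are my own illustration, not part of samtools.

```python
# Tags from the samtools 1.3 list above; the value names the suggested
# replacement for tags marked as deprecated there, or None otherwise.
TAGS_1_3 = {
    "AD": None,
    "INFO/AD": None,
    "ADF": None,
    "INFO/ADF": None,
    "ADR": None,
    "INFO/ADR": None,
    "DP": None,
    "DV": "AD",
    "DPR": "AD",
    "INFO/DPR": None,
    "DP4": "ADF and ADR",
    "SP": None,
}


def output_tags_value(tags):
    """Return (comma-separated value for -t, list of deprecation notes)."""
    chosen, notes = [], []
    for tag in tags:
        name = tag.upper()  # the list is documented as case-insensitive
        if name not in TAGS_1_3:
            raise ValueError(f"unknown tag: {tag}")
        if name in chosen:
            continue  # skip duplicates
        chosen.append(name)
        replacement = TAGS_1_3[name]
        if replacement is not None:
            notes.append(f"{name} is deprecated in favor of {replacement}")
    return ",".join(chosen), notes


value, notes = output_tags_value(["dp", "AD", "DP4"])
print("-t", value)
for note in notes:
    print("note:", note)
```

For per-sample depth alone, `output_tags_value(["DP"])` yields the value `DP` with no notes.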
