Question: How to interpret genotypes with DP=1 for a vcf file
8 months ago, zhangdezhi008 wrote:

Dear all, I used samtools for SNP calling and vcftools for SNP filtering, and I got a VCF file with a lot of SNPs. How should I interpret the genotypes when DP=1? In my opinion, DP=1 means that the site is covered by only one read, so it could be homozygous 0/0 or 1/1, but how can it be heterozygous 0/1?

The following is what I have observed in my vcf file.

Thanks for your attention! Please help!

GT:PL:DP:SP:GQ 0/0:0,3,36:1:0:4

GT:PL:DP:SP:GQ 0/1:0,3,36:1:0:4

GT:PL:DP:SP:GQ 1/1:0,3,36:1:0:4

Tags: sequencing, snp, next-gen

It can be called heterozygous if the one read you find carries the alternative base instead of the reference base. But do you trust these DP=1 calls? A single read is very little evidence.

Edit: I see your point. You mean that with one read of the alt allele the site could be either homozygous or heterozygous.

modified 8 months ago • written 8 months ago by b.nota
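
To make the ambiguity discussed above concrete, the PL field can be converted back into relative genotype likelihoods. The following is a minimal Python sketch, assuming a diploid biallelic site where the three PL values are ordered 0/0, 0/1, 1/1 (the VCF convention) and ignoring any prior the caller may apply. The helper names are made up for illustration.

```python
def parse_sample(format_str, sample_str):
    """Pair FORMAT keys with the sample's colon-separated values."""
    return dict(zip(format_str.split(":"), sample_str.split(":")))


def pl_to_relative_likelihoods(pl_str):
    """Convert Phred-scaled likelihoods (PL) into values that sum to 1."""
    pls = [int(x) for x in pl_str.split(",")]
    # PL = -10 * log10(likelihood), scaled so the best genotype has PL = 0
    raw = [10 ** (-pl / 10) for pl in pls]
    total = sum(raw)
    return [r / total for r in raw]


# First record posted in the question
fields = parse_sample("GT:PL:DP:SP:GQ", "0/0:0,3,36:1:0:4")
rel = pl_to_relative_likelihoods(fields["PL"])

for genotype, value in zip(["0/0", "0/1", "1/1"], rel):
    print(f"{genotype}: {value:.3f}")
```

A PL difference of 3 corresponds to roughly a factor of two, so with PL=0,3,36 the heterozygous genotype is only about half as likely as the best one. That is why a single read cannot clearly separate a homozygous call from a heterozygous one.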

Thanks for your reply. I will not trust the low-DP SNPs. How can one read possess an alternative base? I also have another question: how can we know the depth of each allele at a heterozygous site?

written 8 months ago by zhangdezhi008

I am not sure about your method; I have never used vcftools for this. When I use VarScan after samtools, I get more information than you do, e.g.:

GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR
1/1:117:31:21:0:21:100%:1.8578E-12:0:25:0:0:5:16

Note that DP has a different meaning here as well:

##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=SDP,Number=1,Type=Integer,Description="Raw Read Depth as reported by SAMtools">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Quality Read Depth of bases with Phred score >= 15">
##FORMAT=<ID=RD,Number=1,Type=Integer,Description="Depth of reference-supporting bases (reads1)">
##FORMAT=<ID=AD,Number=1,Type=Integer,Description="Depth of variant-supporting bases (reads2)">
##FORMAT=<ID=FREQ,Number=1,Type=String,Description="Variant allele frequency">
##FORMAT=<ID=PVAL,Number=1,Type=String,Description="P-value from Fisher's Exact Test">
##FORMAT=<ID=RBQ,Number=1,Type=Integer,Description="Average quality of reference-supporting bases (qual1)">
##FORMAT=<ID=ABQ,Number=1,Type=Integer,Description="Average quality of variant-supporting bases (qual2)">
##FORMAT=<ID=RDF,Number=1,Type=Integer,Description="Depth of reference-supporting bases on forward strand (reads1plus)">
##FORMAT=<ID=RDR,Number=1,Type=Integer,Description="Depth of reference-supporting bases on reverse strand (reads1minus)">
##FORMAT=<ID=ADF,Number=1,Type=Integer,Description="Depth of variant-supporting bases on forward strand (reads2plus)">
##FORMAT=<ID=ADR,Number=1,Type=Integer,Description="Depth of variant-supporting bases on reverse strand (reads2minus)">

written 8 months ago by b.nota
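
As an illustration of reading per-allele depth from the VarScan-style record above, here is a small Python sketch. It assumes the FORMAT keys shown in that header (RD for reference-supporting depth, AD for variant-supporting depth), and the function name is hypothetical.

```python
FORMAT = "GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR"
SAMPLE = "1/1:117:31:21:0:21:100%:1.8578E-12:0:25:0:0:5:16"


def allele_depths(format_str, sample_str):
    """Return (ref depth, alt depth, alt fraction) from RD and AD."""
    fields = dict(zip(format_str.split(":"), sample_str.split(":")))
    ref_depth = int(fields["RD"])
    alt_depth = int(fields["AD"])
    total = ref_depth + alt_depth
    # Avoid dividing by zero when no quality-filtered reads remain
    alt_fraction = alt_depth / total if total else None
    return ref_depth, alt_depth, alt_fraction


ref_depth, alt_depth, alt_fraction = allele_depths(FORMAT, SAMPLE)
print(ref_depth, alt_depth, alt_fraction)
```

For this record all 21 quality-filtered reads support the variant, which agrees with the reported FREQ of 100%.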

The DP4 field reports the number of reads that support, in order: the reference on the forward strand, the reference on the reverse strand, the alternative on the forward strand, and the alternative on the reverse strand.

But check the documentation to confirm that this is the order.

modified 8 months ago • written 8 months ago by stolarek.ir
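
The DP4 suggestion can be sketched in Python as follows. This assumes DP4 appears in the INFO column as four comma-separated counts in the order described above. The INFO string here is made up for illustration, and the function name is hypothetical.

```python
def parse_dp4(info_str):
    """Return ref/alt read counts from a DP4 entry, or None if absent."""
    for entry in info_str.split(";"):
        key, _, value = entry.partition("=")
        if key == "DP4":
            ref_fwd, ref_rev, alt_fwd, alt_rev = (int(x) for x in value.split(","))
            return {
                "ref": ref_fwd + ref_rev,  # reads supporting the reference
                "alt": alt_fwd + alt_rev,  # reads supporting the alternative
            }
    return None


# Hypothetical INFO string for a site covered by one alt-supporting read
info = "DP=1;DP4=0,0,1,0;MQ=60"
counts = parse_dp4(info)
print(counts)
```

DP4 is a site-level tag, so it gives per-sample allele depths only when the VCF contains a single sample.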