Question: Assign missing genotype (./. or N/N) for a SNP in vcf file if DP=0
1
gravatar for berge2015
12 months ago by
berge201570
berge201570 wrote:

Hello,

I have a problem with getting the correct genotypes from a samtools mpileup generated vcf file.

In this post: A: ./ Vs "0/0" In Vcf Files, the user says DP=0 causes the genotype to be ./. which is exactly what I want. However, my genotypes are assigned 1/1 if DP=0 for the SNP for any/all individuals.

Am I calling the SNPs in a wrong way? Below are my commands (numbers as prefix for order & clarity):

1. bwa/0.7.4/bin/bwa aln -t 4 Ref.fa sample1.fq > sample1.sai
2. bwa/0.7.4/bin/bwa samse Ref.fa sample1.sai sample1.fq > sample1.sam
3. samtools-0.1.19/samtools view -bS sample1.sam > sample1.bam
4. samtools-0.1.19/samtools sort sample1.bam sample1.sorted
5. samtools-0.1.19/samtools mpileup -BuDI -f Ref.fa sample1.sorted.bam sample2.sorted.bam .. sampleN.sorted.bam | bcftools view -bvcg - > all_samples.bcf
6. samtools-0.1.19/bcftools/bcftools view all_samples.bcf | vcfutils.pl varFilter -D1000 > all_samples.vcf
7. vcftools_0.1.12a/bin/vcf-to-tab < all_samples.vcf > all_sample_SNPs.out

Only when parsing the 'all_sample_SNPs.out' file did I realize that my genotype calls are missing the N/N or NN alleles in entirety. I know my data, so no way the missing alleles are gone at ALL possible locations. So I must be doing something wrong.

Of course it is not vcf-to-tab's fault as the VCF file assigns 1/1 genotype to those with 0 read depth at that particular locus. I'd appreciate it a lot if someone could point towards where I could be messing up.

Alternatively: is there a filter or another way to convert the vcf file so that sites with DP=0 are assigned the missing value of ./. or N/N or any other similar nomenclature?

Thanks!

mpileup snp variant vcf • 538 views
ADD COMMENTlink modified 12 months ago • written 12 months ago by berge201570
1
gravatar for berge2015
12 months ago by
berge201570
berge201570 wrote:

It appears that vcftools can do it. In case someone happens to visit this thread looking for solution to this problem, here's how to do it:

vcftools_0.1.12a/bin/vcftools --vcf in.vcf --out fltrd_out --minGQ 15 --minDP 1 --recode --recode-INFO-all

Found the solution here: A: How to filter vcf file on minimum genotype depth and quality for each sample

ADD COMMENTlink modified 12 months ago • written 12 months ago by berge201570
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1434 users visited in the last hour