SNPs not filtered from a GATK vcf file
4.4 years ago
evelyn ▴ 230

Hi All,

I have a vcf file with SNP information from multiple samples made using GATK:

gatk --java-options "-Xmx4G" HaplotypeCaller -R ref.fa -I bams.list -L ch01 -O 01.vcf

Individual VCFs were made per chromosome and then concatenated:

bcftools concat -o merge.vcf 01.vcf 02.vcf 03.vcf 04.vcf 05.vcf

I want to keep only the SNPs in the final VCF file, so I ran:

bcftools filter -i 'TYPE="snp"' merge.vcf > merge_SNP.vcf

But the output file still has INDELS. Then I tried using bcftools view for the same job:

bcftools view -v snps merge.vcf > merge_SNP.vcf

The output file again has variants other than SNPs. I am not sure what is going wrong. I would appreciate any suggestions. Thank you!


You can also use the SelectVariants tool from GATK.
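A minimal sketch of what that could look like, assuming GATK4 is on your PATH and reusing the file names from the question (ref.fa, merge.vcf). The wrapper function select_snps is a hypothetical helper, not part of GATK; the call is guarded so that nothing runs if gatk or the inputs are missing.

```shell
# Hypothetical helper: keep only records that GATK classifies as SNP.
# Arguments: reference FASTA, input VCF, output VCF.
select_snps() {
  gatk SelectVariants \
    -R "$1" \
    -V "$2" \
    --select-type-to-include SNP \
    -O "$3"
}

# Run only when gatk and the input files are actually present.
if command -v gatk >/dev/null 2>&1 && [ -f ref.fa ] && [ -f merge.vcf ]; then
  select_snps ref.fa merge.vcf merge_SNP.vcf
fi
```

It is worth checking the output afterwards, since how multiallelic sites are handled may differ between tools.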

4.4 years ago
inedraylig ▴ 60

The recommended way to filter out indels would be to use --exclude-types:

bcftools view --exclude-types indels merge.vcf > merge_SNP.vcf

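This assumes that you only have SNPs and indels in your VCF file. Note that bcftools determines the variant type by comparing the REF and ALT alleles, not from tags in the INFO field. With -v snps (or TYPE="snp"), a site is kept if any of its ALT alleles is a SNP, so multiallelic sites that combine a SNP with an indel survive the filter; with --exclude-types indels, a site is dropped if any of its ALT alleles is an indel. Look at the records in your output that are not SNPs and check whether they are such mixed sites.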
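To see which records in a VCF are not pure SNPs, one option is to classify each site from its REF and ALT columns. The sketch below is only an approximation of that idea, not how bcftools itself classifies variants: it builds a three-record toy file (toy.vcf, with made-up positions) and labels each site as SNP, NON_SNP, or MIXED. It assumes uppercase alleles and treats anything that is not a single-base substitution (indels, `*`, symbolic alleles) as non-SNP.

```shell
# Build a toy VCF (hypothetical records) standing in for merge_SNP.vcf.
printf '%s\n' '##fileformat=VCFv4.2' > toy.vcf
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> toy.vcf
printf 'ch01\t100\t.\tA\tG\t50\t.\t.\n' >> toy.vcf
printf 'ch01\t200\t.\tAT\tA\t50\t.\t.\n' >> toy.vcf
printf 'ch01\t300\t.\tC\tT,CA\t50\t.\t.\n' >> toy.vcf

# Label each site by inspecting every ALT allele against REF.
awk -F'\t' '
  /^#/ { next }                      # skip header lines
  {
    n = split($5, alts, ",")         # one entry per ALT allele
    snp = 0; other = 0
    for (i = 1; i <= n; i++) {
      # single-base REF and single-base ALT counts as a SNP allele
      if (length($4) == 1 && alts[i] ~ /^[ACGTN]$/) snp++
      else other++
    }
    if (other == 0)      cls = "SNP"
    else if (snp == 0)   cls = "NON_SNP"
    else                 cls = "MIXED"
    print $1, $2, $4, $5, cls
  }
' toy.vcf > types.txt

cat types.txt
```

In this toy file the third record (REF C, ALT T,CA) is reported as MIXED. Running the same awk command on your own output in place of toy.vcf should show whether the leftover non-SNP records are multiallelic sites of this kind.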
