Methods to count the number of SNPs and novel SNPs in a Variant Call Format (.vcf) file?
8.6 years ago
ravi.uhdnis ▴ 220

Hi,

I am looking for methods to count the number of single nucleotide polymorphisms (SNPs) in a Variant Call Format (.vcf) file. Also, how can I identify the "novel SNPs" in a .vcf? I have a .vcf from a human whole-genome sequencing sample at 30X coverage, paired-end sequencing on an Illumina 2500 platform. Thanks in advance.

snp sequencing genome • 3.2k views
8.6 years ago

Use GATK VariantAnnotator https://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_gatk_tools_walkers_annotator_VariantAnnotator.php to fill the ID column of your VCF from one or more external public VCFs (dbsnp.vcf, exac.vcf, 1000g.vcf, etc.). Variants found in those resources get a known ID (e.g. an rs number); variants not found keep the missing value '.'.

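Then use 'bcftools filter' or 'gatk VariantFiltration' to split the rows by ID, and count the lines not starting with '#'. Excluding the rows with an unknown ID ('.') gives you the known variants; keeping only those rows gives you the candidate novel ones. If you want SNPs only, restrict to SNP records first, since a VCF can also contain indels.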
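
If you just want the two numbers without writing intermediate files, the counting step can also be sketched in a few lines of Python. This is a minimal sketch, not a replacement for bcftools or GATK. It assumes the ID column has already been annotated (so '.' means "not found in the reference sets"), and it treats a record as a SNP only when REF and every ALT allele are a single A/C/G/T base. The function names (is_snp, count_snps, count_snps_in_file) are made up for this example.

```python
import gzip
from typing import Iterable, Tuple

BASES = {"A", "C", "G", "T"}


def is_snp(ref: str, alt: str) -> bool:
    """Return True if REF and every comma-separated ALT allele is one base."""
    alleles = alt.split(",")
    return ref.upper() in BASES and all(a.upper() in BASES for a in alleles)


def count_snps(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (total SNP records, SNP records whose ID column is '.')."""
    total = 0
    novel = 0
    for line in lines:
        # Skip blank lines and header lines ('##' meta lines and '#CHROM').
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a well-formed VCF record
        variant_id, ref, alt = fields[2], fields[3], fields[4]
        if is_snp(ref, alt):
            total += 1
            # Assumption: '.' in the ID column means "no known ID after annotation".
            if variant_id == ".":
                novel += 1
    return total, novel


def count_snps_in_file(path: str) -> Tuple[int, int]:
    """Count SNPs in a plain-text or gzip-compressed VCF file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return count_snps(handle)
```

Call it as count_snps_in_file("annotated.vcf") (the file name is a placeholder). Note that a multi-allelic site counts as one record here, and records with symbolic or '*' ALT alleles are not counted as SNPs, so the totals may differ slightly from what other tools report.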
