Question: How to count variant occurrences in .vcf
darceyc17 wrote (4 days ago):

Hello Biostars,

I have done targeted NGS to discover novel variants associated with a trait of interest. I am currently trying to prioritize variants from the resulting .vcf so that I can genotype them in a larger population. Part of our prioritization is determining how many of the 183 subjects carry each variant of interest. Does anyone know how to do this without hand-counting each GT field, or have any other suggestions?

Thank you!

snp occurance variant ngs .vcf

Does it matter whether the genotype is homozygous or heterozygous? Or is the question just "how many samples have at least one allele with this variant"?

Reply written 4 days ago by finswimmer

It is how many samples have at least one allele with an individual variant, and I have 7,531 variants I need to determine this for.

Reply written 4 days ago by darceyc17

For a start (this needs a bgzipped, indexed VCF; replace the region with the position of your variant):

 bcftools view -H input.vcf.gz "chrxxx:12345-12345" | cut -f 10- | tr "\t" "\n" | cut -d ':' -f 1 | sort | uniq -c
Reply written 4 days ago by Pierre Lindenbaum
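
The one-liner above tallies genotypes at a single position. To cover all 7,531 variants in one pass, a minimal sketch along the same lines could look like the following. It assumes GT is the first FORMAT subfield (as the VCF specification requires when GT is present) and treats any non-reference allele as "having the variant", so at multi-allelic sites it counts carriers of any ALT allele. `count_carriers` is a hypothetical helper name, not part of bcftools.

```shell
# Hypothetical helper: reads uncompressed VCF text on stdin and prints, per record,
# CHROM, POS, REF, ALT, samples with >=1 ALT allele, samples with a called genotype.
count_carriers() {
  awk 'BEGIN { FS = OFS = "\t" }
    /^#/ { next }                          # skip all header lines
    {
      carriers = 0; called = 0
      for (i = 10; i <= NF; i++) {         # sample columns start at column 10
        split($i, parts, ":")              # assumption: GT is the first subfield
        gt = parts[1]
        if (gt ~ /[0-9]/) called++         # at least one allele was called
        if (gt ~ /[1-9]/) carriers++       # at least one non-reference allele
      }
      print $1, $2, $4, $5, carriers, called
    }'
}
```

Usage might look like `bcftools view input.vcf.gz | count_carriers > carriers.tsv` or `gzip -dc input.vcf.gz | count_carriers`, where `carriers.tsv` is just an example output name. Sorting the result on the fifth column then ranks variants by how many of the 183 subjects carry them.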
colindaven (Hannover Medical School) wrote (4 days ago):

If you mean one variant and you're not planning to check your VCF any further, you might want to convert your VCF to a TSV to be able to count more easily.

This tool is available in Galaxy under "NGS: VCF Manipulation" as "VCFtoTab-delimited: Convert VCF data into TAB-delimited format".
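
For anyone working outside Galaxy, a rough command-line stand-in for that conversion can be sketched with awk. This is not the Galaxy tool itself: it keeps only CHROM, POS, REF, ALT and one genotype column per sample, and it assumes GT is the first FORMAT subfield. `vcf_to_gt_tsv` is a hypothetical helper name.

```shell
# Hypothetical helper: reads uncompressed VCF text on stdin and writes a TSV
# with one row per variant and one GT column per sample.
vcf_to_gt_tsv() {
  awk 'BEGIN { FS = OFS = "\t" }
    /^##/ { next }                         # drop meta-information lines
    /^#CHROM/ {                            # column header: keep the sample names
      line = "CHROM" OFS "POS" OFS "REF" OFS "ALT"
      for (i = 10; i <= NF; i++) line = line OFS $i
      print line
      next
    }
    {
      line = $1 OFS $2 OFS $4 OFS $5
      for (i = 10; i <= NF; i++) {
        split($i, parts, ":")              # assumption: GT is the first subfield
        line = line OFS parts[1]
      }
      print line
    }'
}
```

The resulting table opens directly in a spreadsheet or in R, where counting non-reference genotypes per row is straightforward.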

Otherwise, I like the tool vt for summarizing VCF statistics.

If just looking at one variant, you can probably use a BED file to specify and extract that exact variant, or even a genome browser such as IGV or JBrowse.
