Calculating synonymous and non-synonymous allele counts in a VCF file
3.1 years ago
storm1907 ▴ 30

Hi, I would like to get absolute numbers of functional and synonymous variants in VCF or tab-delimited txt files: just a raw count of how many coding and non-coding variants I have in a sequenced human exome sample or a gnomAD v3.1 genome file. I looked for similar threads here, but only found solutions for calculating d_N/d_S ratios, or approaches suited to model organisms.

Is it possible to do it with Illumina Cloud tools?

Thank you!

VCF • 1.2k views
3.1 years ago

Almost any VCF annotator (SnpEff, ANNOVAR, VEP, ...) will give you the genic annotation you need. They are meant to be installed and run locally, and SnpEff is the one I'd personally recommend for genic annotation, but VEP's web interface may be all that you need.


OK, I already have annotated VCFs from VEP, but I cannot find that information in them. VEP's web interface only outputs the percentage of variants.


aah, ok, I think SnpEff can do the simple thing I need: https://pcingola.github.io/SnpEff/se_outputsummary/

I have two types of VCFs: raw ones and ones annotated with VEP. I guess I need to use the raw files for summarizing?


SnpEff adds an ANN field, containing the SnpEff annotation, to the INFO column of any VCF you provide to it.
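As a rough illustration of what you can do with that ANN field, here is a minimal Python sketch that tallies effect terms from a SnpEff-annotated VCF. It assumes the standard ANN layout, where entries are separated by commas, subfields by `|`, the effect is the second subfield, and combined effects are joined with `&`. It also assumes that the first ANN entry is the one you want to count per variant line (SnpEff's documentation describes sorting higher-impact effects first, but check this against your own output). The function names `ann_effects` and `tally_effects` are made up for this example.

```python
from collections import Counter


def ann_effects(info):
    """Return the effect terms of the first ANN entry in an INFO string."""
    for kv in info.split(";"):
        if kv.startswith("ANN="):
            first_entry = kv[4:].split(",")[0]
            subfields = first_entry.split("|")
            # Subfield 0 is the allele, subfield 1 is the annotation (effect).
            # Assumption: combined effects are joined with "&".
            return subfields[1].split("&") if len(subfields) > 1 else []
    return []


def tally_effects(lines):
    """Count effect terms over the data lines of a SnpEff-annotated VCF."""
    counts = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        info = line.rstrip("\n").split("\t")[7]  # INFO is the 8th column
        counts.update(ann_effects(info))
    return counts
```

You would pass it an open file handle, e.g. `tally_effects(open("sample.ann.vcf"))`, where `sample.ann.vcf` is a placeholder file name. Lines without an ANN field simply contribute nothing to the tally.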


Your VEP annotation should already include the information you need. You may be getting confused because you are looking for a "non-synonymous" term, which you will never find: anything that is not synonymous has a consequence that is described explicitly. To get the numbers of synonymous vs. non-synonymous variants, you therefore have to count synonymous variants on the one hand, and all other exonic, non-synonymous consequences on the other. According to Ensembl's docs, these would be missense variant, inframe insertion, inframe deletion, stop gained, frameshift variant and coding sequence variant.
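The counting described above could be sketched in Python roughly as follows. This assumes a VCF written by VEP with its default CSQ INFO field, whose subfield order is read from the `Format:` part of the `##INFO=<ID=CSQ...>` header line. Two choices here are my own assumptions rather than anything VEP dictates: each variant line is counted once, and it is called non-synonymous if any transcript carries one of the consequences listed above. The function names are hypothetical.

```python
import gzip
import re
from collections import Counter

SYNONYMOUS = {"synonymous_variant"}
# Consequence terms listed in the reply above (Ensembl naming).
NON_SYNONYMOUS = {
    "missense_variant", "inframe_insertion", "inframe_deletion",
    "stop_gained", "frameshift_variant", "coding_sequence_variant",
}


def open_vcf(path):
    """Open a plain or gzip-compressed VCF as text."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


def count_consequences(lines):
    """Classify each variant line by its VEP consequence terms."""
    counts = Counter()
    csq_index = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##INFO=<ID=CSQ"):
            # Read the subfield order from the header, e.g. "Format: Allele|Consequence|..."
            fields = re.search(r'Format: ([^"]+)', line).group(1).split("|")
            csq_index = fields.index("Consequence")
            continue
        if line.startswith("#") or not line:
            continue
        info = line.split("\t")[7]  # INFO is the 8th column
        csq = next((kv[4:] for kv in info.split(";") if kv.startswith("CSQ=")), None)
        if csq is None or csq_index is None:
            counts["unannotated"] += 1
            continue
        # Collect terms over all transcripts; "&" joins combined consequences.
        terms = set()
        for entry in csq.split(","):
            terms.update(entry.split("|")[csq_index].split("&"))
        # Assumption: any non-synonymous term on any transcript wins.
        if terms & NON_SYNONYMOUS:
            counts["non_synonymous"] += 1
        elif terms & SYNONYMOUS:
            counts["synonymous"] += 1
        else:
            counts["other"] += 1
    return counts
```

Something like `count_consequences(open_vcf("sample.vep.vcf.gz"))` (placeholder file name) would then return the raw counts. If you would rather count per transcript or per allele instead of per variant line, the classification step is the part to change.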
