Question: VCF Statistics Given A GFF3 Annotation
Zev.Kronenberg (United States) wrote, 6.8 years ago:


Anyone know of a tool that summarizes where variants fall within a GFF3 file?

An ideal output would be something like:

number of variants within genes

number of variants within UTRs

number of variants in CDS


Otherwise I can munge some code together. Thanks.

gff3 vcf annotation • 3.1k views
Cyriac Kandoth (Memorial Sloan Kettering, New York, USA) wrote, 6.8 years ago:

snpEff is a variant effect annotator that can build its gene transcript database from a GFF3 file using java -jar snpEff.jar build -gff3. See the section named "Building a database from GFF files" in the snpEff documentation. Before building, check that your genome isn't already one of the pre-built databases listed by java -jar snpEff.jar databases. Then run snpEff on the VCF using java -jar snpEff.jar eff to annotate each variant against all possible transcripts in your GFF3. Since you're not interested in variants flanking genes (outside the UTRs/CDS/introns), I recommend the options -no-downstream -no-upstream. Look over the documentation using java -jar snpEff.jar -h to see all your options.

Once you have an annotated VCF, it should be easier to write a wrapper that counts the variants in UTRs, CDS, introns, etc. Keep in mind that some variants may map to more than one gene/isoform in your GFF3, so the wrapper has to decide how to count those.
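
A rough sketch of such a wrapper in Python, in case it helps. It assumes the annotated VCF carries an ANN= entry in the INFO column, with comma-separated annotations whose second pipe-delimited field is the effect term (the layout written by recent snpEff versions; older versions write EFF= with a different layout and would need a different parser). The function name count_regions is made up for this example. A variant hitting several genes/isoforms is counted once per distinct term, which is only one of several reasonable choices.

```python
import sys
from collections import Counter


def count_regions(vcf_lines):
    """Count variants per effect term; each variant counts once per distinct term."""
    counts = Counter()
    for line in vcf_lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        terms = set()
        for entry in fields[7].split(";"):
            # Assumption: annotations live in ANN=, one per transcript, comma-separated.
            if not entry.startswith("ANN="):
                continue
            for ann in entry[4:].split(","):
                parts = ann.split("|")
                if len(parts) > 1:
                    # Combined effects on one transcript are joined with '&'.
                    terms.update(parts[1].split("&"))
        counts.update(terms)
    return counts


if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1]) as handle:
        for term, n in count_regions(handle).most_common():
            print(f"{term}\t{n}")
```

Run it as python count_regions.py annotated.vcf (the script name is arbitrary). The terms it reports are whatever snpEff wrote, for example 5_prime_UTR_variant, 3_prime_UTR_variant or intron_variant, so a CDS total would still need the coding terms (missense_variant, synonymous_variant, and so on) grouped together afterwards.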


