Summary statistics of VCF annotated files (snpEff)
5.4 years ago
misterie ▴ 110

Hi,

I have 30 VCF files (one per chromosome) annotated with snpEff. I would like to compute summary statistics and compare the number of different effects between chromosomes (or between autosomes and sex chromosomes). I extracted the ANN field from my VCF file with a short Bash script, and the result looks like this:

Chr 10
3_prime_UTR_variant 8691
5_prime_UTR_premature_start_codon_gain_variant 517
5_prime_UTR_variant 2904
bidirectional_gene_fusion 2
conservative_inframe_deletion 17
conservative_inframe_insertion 27
conservative_inframe_insertion&splice_region_variant 1
disruptive_inframe_deletion 55
disruptive_inframe_insertion 27
non_coding_transcript_exon_variant 928
non_coding_transcript_variant 113
splice_acceptor_variant&conservative_inframe_deletion&splice_region_variant&intron_variant 2
splice_acceptor_variant&disruptive_inframe_deletion&splice_region_variant&intron_variant 2
...
start_lost 9
stop_gained&conservative_inframe_insertion 1
stop_gained&disruptive_inframe_deletion 1
stop_gained&disruptive_inframe_insertion 1
stop_gained 108
stop_lost 15
stop_lost&splice_region_variant 3
stop_retained_variant 9
synonymous_variant 5038
upstream_gene_variant 98805
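
Counts like these can be produced roughly as follows. This is a minimal Python sketch, not the Bash script mentioned above. It assumes the standard snpEff ANN layout (comma-separated annotations, `|`-separated subfields, with the effect term in the second subfield). The function name `count_effects` and the choice to count only the first annotation per variant (`first_only=True`) are assumptions made for illustration.

```python
from collections import Counter


def count_effects(vcf_lines, first_only=True):
    """Count snpEff effect terms found in the ANN entry of the INFO column.

    vcf_lines  : any iterable of VCF text lines (an open file works).
    first_only : assumption for this sketch; if True, only the first
                 annotation of each variant is counted, otherwise every
                 annotation (one per transcript/gene) is counted.
    """
    counts = Counter()
    for line in vcf_lines:
        if line.startswith("#"):          # skip meta and header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:               # INFO is the 8th VCF column
            continue
        ann = None
        for entry in fields[7].split(";"):
            if entry.startswith("ANN="):
                ann = entry[len("ANN="):]
                break
        if ann is None:                   # variant without annotation
            continue
        annotations = ann.split(",")
        if first_only:
            annotations = annotations[:1]
        for annotation in annotations:
            subfields = annotation.split("|")
            if len(subfields) > 1:
                counts[subfields[1]] += 1  # second subfield = effect term
    return counts
```

Passing an open file handle for one chromosome to `count_effects` returns a `Counter` keyed by effect term, which can be printed as a two-column list like the one above.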

There are many types of variants (and many of them contain only a single variant). I would like to group these variants and compare the numbers of variants in coding regions, introns, flanking sequences, etc.

How can I do this? And how should I group the other variants?
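
One possible way to express such a grouping is sketched below in Python. The category names, the term-to-category table and the priority order are all assumptions made for illustration, not an official snpEff classification. The table only covers terms that appear in the list above. Combined terms joined by `&` are assigned to the highest-priority category among their parts, and anything unknown falls into `other`.

```python
from collections import Counter

# Hypothetical mapping from effect term to a broader category.
TERM_TO_GROUP = {
    "stop_gained": "coding",
    "stop_lost": "coding",
    "start_lost": "coding",
    "stop_retained_variant": "coding",
    "synonymous_variant": "coding",
    "conservative_inframe_deletion": "coding",
    "conservative_inframe_insertion": "coding",
    "disruptive_inframe_deletion": "coding",
    "disruptive_inframe_insertion": "coding",
    "splice_acceptor_variant": "splice",
    "splice_region_variant": "splice",
    "5_prime_UTR_variant": "UTR",
    "5_prime_UTR_premature_start_codon_gain_variant": "UTR",
    "3_prime_UTR_variant": "UTR",
    "intron_variant": "intron",
    "non_coding_transcript_exon_variant": "non_coding_transcript",
    "non_coding_transcript_variant": "non_coding_transcript",
    "upstream_gene_variant": "flanking",
}

# Assumed priority: earlier categories win for combined ("&") terms.
PRIORITY = ["coding", "splice", "UTR", "intron",
            "non_coding_transcript", "flanking", "other"]


def group_of(effect):
    """Return the category of one (possibly '&'-combined) effect term."""
    groups = {TERM_TO_GROUP.get(term, "other") for term in effect.split("&")}
    return min(groups, key=PRIORITY.index)


def summarize(counts):
    """Collapse a {effect: count} mapping into {category: count}."""
    summary = Counter()
    for effect, n in counts.items():
        summary[group_of(effect)] += n
    return summary


# A few rows taken from the chromosome 10 list in the question.
chr10_subset = {
    "3_prime_UTR_variant": 8691,
    "5_prime_UTR_variant": 2904,
    "stop_gained": 108,
    "synonymous_variant": 5038,
    "upstream_gene_variant": 98805,
    "stop_lost&splice_region_variant": 3,
    "bidirectional_gene_fusion": 2,
}
for group, n in summarize(chr10_subset).most_common():
    print(group, n)
```

Running `summarize` once per chromosome gives one small table per chromosome, and those tables can then be compared directly or pooled into autosomes versus sex chromosomes. Because the chromosomes differ in size, comparing proportions within each chromosome is probably more informative than comparing raw counts.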

Thank you in advance.

snpEff vcf annotation statistics • 2.2k views