Question: ANNOVAR zygosity status in case of multiple sample VCF processing
4.0 years ago by
anilkanthi0 wrote:

After using ANNOVAR to annotate a single-sample VCF file, it is possible to get the zygosity status [Het or Hom], as well as quality and depth, by using the -otherinfo parameter. But when a single VCF file contains siblings, a trio or multiple samples, the result is still the same as for a single sample.

Is there an ANNOVAR option, or any other approach, that can be used to get the zygosity, read depth and quality of each variant for each of the samples in the resulting Excel file?

Desired output:

In case of trio [Het:Het:Het or Hom:Het:Het or Het:n/a:n/a]

In case of siblings [Het:Het or Hom:Het or Het:n/a]

annovar wes ngs annotation vcf • 1.7k views
23 months ago by
Kevin Blighe63k wrote:

You may find this answer useful: "How to get sample names and genotype for SNP in multi-sample VCF file" (adapt as needed).
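
As a rough illustration of that kind of approach (not the code from the linked answer), the sketch below reads a multi-sample VCF in plain Python and builds one colon-separated string per variant for zygosity, depth and genotype quality, in sample order. It assumes the FORMAT column carries GT, DP and GQ keys, which depends on the variant caller; the function names and the "Ref" label for homozygous-reference calls are illustrative choices, and a haploid call is reported as Hom.

```python
def zygosity(gt):
    """Classify a VCF GT string such as '0/1' or '1|1'."""
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles:
        return "n/a"   # missing genotype call
    if all(a == "0" for a in alleles):
        return "Ref"   # homozygous reference (label is an assumption)
    return "Hom" if len(set(alleles)) == 1 else "Het"


def per_sample_calls(vcf_lines):
    """Return (sample_names, rows) for an iterable of VCF text lines.

    Each row is (chrom, pos, ref, alt, qual, zygosity, depth, gq), where
    the last three are colon-joined across samples, e.g. 'Hom:Het:n/a'.
    """
    samples, rows = [], []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        keys = fields[8].split(":")  # FORMAT keys, e.g. GT:DP:GQ
        zyg, depth, gq = [], [], []
        for column in fields[9:]:
            call = dict(zip(keys, column.split(":")))
            zyg.append(zygosity(call.get("GT", ".")))
            depth.append(call.get("DP", "."))
            gq.append(call.get("GQ", "."))
        rows.append((fields[0], fields[1], fields[3], fields[4], fields[5],
                     ":".join(zyg), ":".join(depth), ":".join(gq)))
    return samples, rows


# Small made-up trio record, used only to show the output shape.
EXAMPLE = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tchild\tmother\tfather\n",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:DP:GQ\t1/1:30:99\t0/1:25:99\t./.:.:.\n",
]
names, table = per_sample_calls(EXAMPLE)
print("\t".join(["chrom", "pos", "ref", "alt", "qual", ":".join(names)]))
for row in table:
    print("\t".join(row))
```

For the made-up record above, the zygosity column reads Hom:Het:n/a. To process a real file, pass an open file handle instead of the EXAMPLE list; the resulting rows can be written out tab-separated and joined to the ANNOVAR annotation table on chromosome, position, ref and alt before opening it in Excel.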

Regarding ANNOVAR specifically, take a look at the options of convert2annovar.pl, in particular those marked "for vcf4 format":

Usage: convert2annovar.pl [arguments] <variantfile>

     Optional arguments:
            -h, --help                      print help message
            -m, --man                       print complete documentation
            -v, --verbose                   use verbose output
                --format <string>           input format (default: pileup)
                --outfile <file>            output file name (default: STDOUT)
                --snpqual <float>           quality score threshold in pileup file (default: 20)
                --snppvalue <float>         SNP P-value threshold in GFF3-SOLiD file (default: 1)
                --coverage <int>            read coverage threshold in pileup file (default: 0)
                --maxcoverage <int>         maximum coverage threshold (default: none)
                --includeinfo               include supporting information in output
                --chr <string>              specify the chromosome (for CASAVA format)
                --chrmt <string>            chr identifier for mitochondria (default: M)
                --altcov <int>              alternative allele coverage threshold (for pileup format)
                --allelicfrac               print out allelic fraction rather than het/hom status (for pileup format)
                --fraction <float>          minimum allelic fraction to claim a mutation (for pileup format)
                --species <string>          if human, convert chr23/24/25 to X/Y/M (for gff3-solid format)
                --filter <string>           output variants with this filter (case insensitive, for vcf4 format)
                --allsample                 process all samples in file with separate output files (for vcf4 format)
                --withzyg                   print zygosity/coverage/quality when -includeinfo is used (for vcf4 format)
                --genoqual <float>          genotype quality score threshold (for vcf4 format)
                --varqual <float>           variant quality score threshold (for vcf4 format)
                --comment                   keep comment line in output (for vcf4 format)
                --dbsnpfile <file>          dbSNP file in UCSC format (for rsid format)
                --withfreq                  for --allsample, print frequency information instead (for vcf4 format)
                --seqdir <string>           directory with FASTA sequences (for region format)
                --inssize <int>             insertion size (for region format)
                --delsize <int>             deletion size (for region format)
                --subsize <int>             substitution size (default: 1, for region format)
                --context <int>             print context nucleotide for indels (for casava format)

     Function: convert variant call file generated from various software programs 
     into ANNOVAR input format

     Example: convert2annovar.pl -format pileup -outfile variant.query variant.pileup
              convert2annovar.pl -format cg -outfile variant.query
              convert2annovar.pl -format cgmastervar variant.masterVar.txt
              convert2annovar.pl -format gff3-solid -outfile variant.query variant.snp.gff
              convert2annovar.pl -format soap variant.snp > variant.avinput
              convert2annovar.pl -format maq variant.snp > variant.avinput
              convert2annovar.pl -format casava -chr 1 variant.snp > variant.avinput
              convert2annovar.pl -format vcf4 variantfile > variant.avinput
              convert2annovar.pl -format vcf4 -filter pass variantfile -allsample -outfile variant
              convert2annovar.pl -format vcf4old input.vcf > output.avinput
              convert2annovar.pl -format rsid snplist.txt -dbsnpfile snp138.txt > output.avinput
              convert2annovar.pl -format region -seqdir humandb/hg19_seq/ chr1:2000001-2000003 -inssize 1 -delsize 2
              convert2annovar.pl -format transcript NM_022162 -gene humandb/hg19_refGene.txt -seqdir humandb/hg19_seq/

     Version: $Date: 2015-06-17 21:43:51 -0700 (Wed, 17 Jun 2015) $
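
To make the relevant vcf4 options concrete, here is a minimal Python sketch that assembles a command line from the flags listed in the help text above (-allsample, -includeinfo, -withzyg). The helper name, the file name trio.vcf and the output prefix trio are hypothetical, and whether these flags combine as hoped should be checked against your installed ANNOVAR version; the sketch only builds and prints the command, it does not run it.

```python
import shlex


def build_convert_command(vcf_path, out_prefix, script="convert2annovar.pl"):
    """Return an argument list for a per-sample vcf4 conversion.

    Only flags that appear in the help text are used; their combined
    behaviour is an assumption to verify on your own installation.
    """
    return [
        script,
        "-format", "vcf4",
        "-allsample",    # separate output file per sample
        "-includeinfo",  # include supporting information in output
        "-withzyg",      # zygosity/coverage/quality with -includeinfo
        "-outfile", out_prefix,
        vcf_path,
    ]


cmd = build_convert_command("trio.vcf", "trio")  # hypothetical names
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be passed to subprocess.run(cmd, check=True) once the script is on your PATH.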

