Question: I cannot run bcftools stats (help)
zion22 wrote:

Hi, I would like to generate statistics from vcf.gz files using bcftools stats, but when I run the following command, it produces empty output files. My command is:

> bcftools stats -F "My_reference_genome.fasta" -s "My_vcf.gz_file.vcf.gz" > "/T1_.vcf.stats"

As soon as I run the command, it prints the following to the terminal:

About:   Parses VCF or BCF and produces stats which can be plotted using plot-vcfstats.
     When two files are given, the program generates separate stats for intersection
     and the complements. By default only sites are compared, -s/-S must given to include
     also sample columns.
Usage:   bcftools stats [options] <A.vcf.gz> [<B.vcf.gz>]

Options:
        --af-bins <list>               allele frequency bins, a list (0.1,0.5,1) or a file (0.1\n0.5\n1)
        --af-tag <string>              allele frequency tag to use, by default estimated from AN,AC or GT
    -1, --1st-allele-only              include only 1st allele at multiallelic sites
    -c, --collapse <string>            treat as identical records with <snps|indels|both|all|some|none>, see man page for details [none]
    -d, --depth <int,int,int>          depth distribution: min,max,bin size [0,500,1]
    -e, --exclude <expr>               exclude sites for which the expression is true (see man page for details)
    -E, --exons <file.gz>              tab-delimited file with exons for indel frameshifts (chr,from,to; 1-based, inclusive, bgzip compressed)
    -f, --apply-filters <list>         require at least one of the listed FILTER strings (e.g. "PASS,.")
    -F, --fasta-ref <file>             faidx indexed reference sequence file to determine INDEL context
    -i, --include <expr>               select sites for which the expression is true (see man page for details)
    -I, --split-by-ID                  collect stats for sites with ID separately (known vs novel)
    -r, --regions <region>             restrict to comma-separated list of regions
    -R, --regions-file <file>          restrict to regions listed in a file
    -s, --samples <list>               list of samples for sample stats, "-" to include all samples
    -S, --samples-file <file>          file of samples to include
    -t, --targets <region>             similar to -r but streams rather than index-jumps
    -T, --targets-file <file>          similar to -R but streams rather than index-jumps
    -u, --user-tstv <TAG[:min:max:n]>  collect Ts/Tv stats for any tag using the given binning [0:1:100]
        --threads <int>                number of extra decompression threads [0]
    -v, --verbose                      produce verbose per-site and per-sample output

If anyone could help me, I'd be very grateful. Thanks.

written 8 weeks ago by zion22

Any reason why you deleted your question, zion22? - I have undeleted it. prasundutta87 went to the trouble of providing an answer and you should respect that.

written 8 weeks ago by Kevin Blighe
prasundutta87 wrote:

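'-s' takes a list of sample names for per-sample stats.

The option expects sample names, not the VCF file. In your command the VCF path comes right after -s, so it is read as the sample list and bcftools is left with no input file. It then prints the usage message to the screen and nothing is written to the redirected output, which is why the stats file is empty.

The correct command should be: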

bcftools stats -F "My_reference_genome.fasta" -s - "My_vcf.gz_file.vcf.gz" > "/T1_.vcf.stats"

This will, of course, give you stats for all of your samples, because "-" tells -s to include every sample.
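
If it helps, here is a rough sketch of how the whole thing could be scripted. It is only an illustration under a few assumptions: the file names are the ones from the question (the output is written to the current directory here), the reference still needs to be indexed with samtools faidx, and stats_ok is a hypothetical helper I made up to catch the empty-file case. SampleA and SampleB in the comment are placeholder names, not samples from your VCF.

```shell
#!/bin/sh
# File names are taken from the question; adjust them to your own paths.
ref="My_reference_genome.fasta"
vcf="My_vcf.gz_file.vcf.gz"
out="T1_.vcf.stats"

# Hypothetical helper: succeeds only if the file exists and is non-empty.
stats_ok() {
    [ -s "$1" ]
}

# Only run when bcftools is installed and the input VCF is present.
if command -v bcftools >/dev/null 2>&1 && [ -f "$vcf" ]; then
    # -F expects a faidx-indexed reference; this creates "$ref.fai" if missing.
    [ -f "$ref.fai" ] || samtools faidx "$ref"

    # Print the sample names stored in the VCF header, one per line.
    bcftools query -l "$vcf"

    # "-s -" includes all samples; the VCF is a separate positional argument.
    # For a subset, replace "-" with a comma-separated list such as
    # SampleA,SampleB (placeholder names).
    bcftools stats -F "$ref" -s - "$vcf" > "$out"

    if stats_ok "$out"; then
        echo "stats written to $out"
    else
        echo "stats file is empty; check the bcftools messages above" >&2
    fi
fi
```

The names printed by bcftools query -l are what -s expects when you want only some of the samples. The final check simply reports whether the redirected file ended up empty, which is the symptom described in the question.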

modified 8 weeks ago by RamRS • written 8 weeks ago by prasundutta87