Question: I cannot run bcftools stats (help)
zion22 wrote (20 months ago):

Hi, I would like to compute statistics from vcf.gz files using bcftools stats, but when I run the following command it produces an empty (0-byte) output file. My command is:

> bcftools stats -F "My_reference_genome.fasta" -s "My_vcf.gz_file.vcf.gz" > "/T1_.vcf.stats"

As soon as I run the command, the following is printed to the terminal:

About:   Parses VCF or BCF and produces stats which can be plotted using plot-vcfstats.
     When two files are given, the program generates separate stats for intersection
     and the complements. By default only sites are compared, -s/-S must given to include
     also sample columns.
Usage:   bcftools stats [options] <A.vcf.gz> [<B.vcf.gz>]

Options:
        --af-bins <list>               allele frequency bins, a list (0.1,0.5,1) or a file (0.1\n0.5\n1)
        --af-tag <string>              allele frequency tag to use, by default estimated from AN,AC or GT
    -1, --1st-allele-only              include only 1st allele at multiallelic sites
    -c, --collapse <string>            treat as identical records with <snps|indels|both|all|some|none>, see man page for details [none]
    -d, --depth <int,int,int>          depth distribution: min,max,bin size [0,500,1]
    -e, --exclude <expr>               exclude sites for which the expression is true (see man page for details)
    -E, --exons <file.gz>              tab-delimited file with exons for indel frameshifts (chr,from,to; 1-based, inclusive, bgzip compressed)
    -f, --apply-filters <list>         require at least one of the listed FILTER strings (e.g. "PASS,.")
    -F, --fasta-ref <file>             faidx indexed reference sequence file to determine INDEL context
    -i, --include <expr>               select sites for which the expression is true (see man page for details)
    -I, --split-by-ID                  collect stats for sites with ID separately (known vs novel)
    -r, --regions <region>             restrict to comma-separated list of regions
    -R, --regions-file <file>          restrict to regions listed in a file
    -s, --samples <list>               list of samples for sample stats, "-" to include all samples
    -S, --samples-file <file>          file of samples to include
    -t, --targets <region>             similar to -r but streams rather than index-jumps
    -T, --targets-file <file>          similar to -R but streams rather than index-jumps
    -u, --user-tstv <TAG[:min:max:n]>  collect Ts/Tv stats for any tag using the given binning [0:1:100]
        --threads <int>                number of extra decompression threads [0]
    -v, --verbose                      produce verbose per-site and per-sample output

If anyone could help me, I'd be very grateful. Thanks.

Any reason why you deleted your question, zion22? - I have undeleted it. prasundutta87 went to the trouble of providing an answer and you should respect that.

(comment by Kevin Blighe, 20 months ago)
prasundutta87 wrote (20 months ago):

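The -s option takes a list of sample names for per-sample stats ("-" to include all samples).

It expects sample names, not the VCF file as you have written. In your command the VCF file name was consumed as the argument to -s, so bcftools was left with no input file and printed its usage message instead of any stats. That is why the output file is empty.

The correct command should be: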

bcftools stats -F "My_reference_genome.fasta" -s - "My_vcf.gz_file.vcf.gz" > "/T1_.vcf.stats"

This will give you stats for all of your samples.
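As a rough sketch of the same fix in script form (not a definitive recipe): the file names below are the placeholders from this thread, and the output path is written to the current directory rather than "/", which is my assumption. The script builds the argument list with the VCF as the positional argument after "-s -", lists the sample names with bcftools query -l, and only calls bcftools if it is installed and the input files exist; otherwise it just prints the command it would run.

```shell
#!/bin/sh
# Placeholder file names taken from the thread; replace with your own.
vcf="My_vcf.gz_file.vcf.gz"
ref="My_reference_genome.fasta"
out="T1_.vcf.stats"   # assumption: write to the current directory

# Options first, then the VCF as the positional input file.
# "-s -" selects all samples; a comma-separated list of names
# (e.g. -s sampleA,sampleB, hypothetical names) selects specific ones.
set -- stats -F "$ref" -s - "$vcf"

if command -v bcftools >/dev/null 2>&1 && [ -f "$vcf" ] && [ -f "$ref" ]; then
    # Print the sample names stored in the VCF header.
    bcftools query -l "$vcf"
    # Run the stats and warn if the result is still empty.
    bcftools "$@" > "$out"
    [ -s "$out" ] || echo "warning: $out is empty" >&2
else
    # bcftools or the input files are missing: show the command only.
    echo "would run: bcftools $*"
fi
```

If the output file is still empty after this, the usage text on the terminal usually means bcftools did not receive an input file, so checking the order of the arguments is a reasonable first step.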
