Question: I cannot run bcftools stats (help)
zion22 wrote, 8 months ago:

Hi, I would like to compute statistics from vcf.gz files using bcftools stats, but when I run the following command, the output file it generates is empty (zero size). My command is:

> bcftools stats -F "My_reference_genome.fasta" -s "My_vcf.gz_file.vcf.gz" > "/T1_.vcf.stats"

As soon as I run the command, it prints the following to the terminal:

About:   Parses VCF or BCF and produces stats which can be plotted using plot-vcfstats.
     When two files are given, the program generates separate stats for intersection
     and the complements. By default only sites are compared, -s/-S must given to include
     also sample columns.
Usage:   bcftools stats [options] <A.vcf.gz> [<B.vcf.gz>]

Options:
        --af-bins <list>               allele frequency bins, a list (0.1,0.5,1) or a file (0.1\n0.5\n1)
        --af-tag <string>              allele frequency tag to use, by default estimated from AN,AC or GT
    -1, --1st-allele-only              include only 1st allele at multiallelic sites
    -c, --collapse <string>            treat as identical records with <snps|indels|both|all|some|none>, see man page for details [none]
    -d, --depth <int,int,int>          depth distribution: min,max,bin size [0,500,1]
    -e, --exclude <expr>               exclude sites for which the expression is true (see man page for details)
    -E, --exons <file.gz>              tab-delimited file with exons for indel frameshifts (chr,from,to; 1-based, inclusive, bgzip compressed)
    -f, --apply-filters <list>         require at least one of the listed FILTER strings (e.g. "PASS,.")
    -F, --fasta-ref <file>             faidx indexed reference sequence file to determine INDEL context
    -i, --include <expr>               select sites for which the expression is true (see man page for details)
    -I, --split-by-ID                  collect stats for sites with ID separately (known vs novel)
    -r, --regions <region>             restrict to comma-separated list of regions
    -R, --regions-file <file>          restrict to regions listed in a file
    -s, --samples <list>               list of samples for sample stats, "-" to include all samples
    -S, --samples-file <file>          file of samples to include
    -t, --targets <region>             similar to -r but streams rather than index-jumps
    -T, --targets-file <file>          similar to -R but streams rather than index-jumps
    -u, --user-tstv <TAG[:min:max:n]>  collect Ts/Tv stats for any tag using the given binning [0:1:100]
        --threads <int>                number of extra decompression threads [0]
    -v, --verbose                      produce verbose per-site and per-sample output

If anyone could help me, I'd be very grateful. Thanks.


Any reason why you deleted your question, zion22? - I have undeleted it. prasundutta87 went to the trouble of providing an answer and you should respect that.

(comment written 8 months ago by Kevin Blighe)

prasundutta87 wrote, 8 months ago:

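'-s' stands for the list of samples for sample stats.

The option expects sample names, not the VCF file as you have written. Because the VCF path was taken as the value of '-s', bcftools was left with no input file, so it printed its usage message instead of writing any stats.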
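
To see why the VCF path goes missing in the original command, here is a minimal sketch using the shell's own getopts. It is not bcftools' actual parser, and show_args is a hypothetical helper name; it only follows the same convention, where an option that takes a value consumes the next word on the command line.

```shell
# Hypothetical helper: mimics options that take a value (-F, -s).
# This is NOT bcftools' real parser, only the same parsing convention.
# The body runs in a subshell "( ... )" so no variables leak out.
show_args() (
    OPTIND=1
    fasta=""
    samples=""
    while getopts "F:s:" opt; do
        case "$opt" in
            F) fasta="$OPTARG" ;;
            s) samples="$OPTARG" ;;
            *) exit 2 ;;
        esac
    done
    # Drop the parsed options; whatever remains would be the input files.
    shift $((OPTIND - 1))
    echo "fasta=$fasta samples=$samples inputs=$#"
)

# As in the question: the VCF is swallowed by -s, so zero inputs remain.
show_args -F ref.fasta -s calls.vcf.gz
# prints: fasta=ref.fasta samples=calls.vcf.gz inputs=0

# With "-" as the value of -s, the VCF is left over as the input file.
show_args -F ref.fasta -s - calls.vcf.gz
# prints: fasta=ref.fasta samples=- inputs=1
```

The file names ref.fasta and calls.vcf.gz are placeholders; no files are read.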

The correct command should be:

bcftools stats -F "My_reference_genome.fasta" -s - "My_vcf.gz_file.vcf.gz" > "/T1_.vcf.stats"

This will, of course, give you stats for all your samples, since "-" tells '-s' to include every sample.
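
A quick way to catch this kind of failure is to check that the redirected file actually has content. The following is only a sketch: require_nonempty is a hypothetical helper name, the paths are the placeholders from the question, and the bcftools lines are left as comments because they depend on your own files.

```shell
# Hypothetical helper: succeeds only if the file exists and is larger than 0 bytes.
require_nonempty() {
    if [ -s "$1" ]; then
        echo "ok: $1 has content"
    else
        echo "error: $1 is missing or empty" >&2
        return 1
    fi
}

# Intended use (placeholders from the question, not run here):
#   bcftools query -l "My_vcf.gz_file.vcf.gz"    # list the sample names stored in the VCF
#   bcftools stats -F "My_reference_genome.fasta" -s - "My_vcf.gz_file.vcf.gz" > "/T1_.vcf.stats"
#   require_nonempty "/T1_.vcf.stats"
```

If you want stats for only some samples, the names printed by bcftools query -l are what '-s' expects, as a comma-separated list.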
