How to run bcftools roh with a multi-sample VCF
5.9 years ago
BAGeno ▴ 190

Hi,

I am trying to run bcftools roh. I looked through this link, but it only covers processing the 1000 Genomes files: Bcftools roh calling

My problem is that I do not know how to use this command with my own file. I tried to run it on my file like this:

bcftools roh --AF-file AF.txt -M 100 Input.vcf.gz > roh.txt

It gave me this error.

Missing the option -s, --sample

Can anyone please tell me how I should run the command on a multi-sample file?

5.9 years ago
BioinfGuru ★ 1.7k

Troubleshooting a command begins (and usually ends) with two steps: 1) view the help text, 2) view the online docs.

Start with the help function to see the usage and options for roh (this works for most programmes) to give you a clue about what to do:

bcftools roh --help

Notice the "usage" (how to use the function) plus the list of options:

Usage:   bcftools roh [options] <in.vcf.gz>

# some selected options
--AF-file <file>                   read allele frequencies from file (CHR\tPOS\tREF,ALT\tAF)
-M, --rec-rate <float>             constant recombination rate per bp
-s, --samples <list>               list of samples to analyze [all samples]
-S, --samples-file <file>          file of samples to analyze [all samples]
-o, --output <file>                write output to a file [standard output]

Learn to understand the individual parts of your command by comparing it to the "Usage" line and the other options in the --help output:

bcftools roh --AF-file AF.txt -M 100 Input.vcf.gz > roh.txt # Your command
bcftools roh [options] <in.vcf.gz> # usage

Breakdown:

bcftools roh        # tells the computer to run the roh function in the bcftools programme
--AF-file AF.txt    # [option]: file to read allele frequencies from
-M 100              # [option]: constant recombination rate per bp
Input.vcf.gz        # <in.vcf.gz>: your compressed input vcf file

Now you should be able to conclude what is missing:

Error: Missing the option -s, --sample # so from --help, you are not providing a list of samples

So the next question is: what is "a list of samples"? For that, go find the online docs (just google "bcftools roh docs"):

BCFtools Manual page

Now look through the document for the more comprehensive explanation of how to provide the correct input for -s or -S

So to recap: Use --help, then find the manual for more detail.
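
As a small sketch of the first step, the help text can be filtered down to the sample-related options only. The helper name sample_options is made up for this example; it just wraps grep and reads the help text from standard input, so it shows whatever the installed version actually offers. Merging stderr into stdout is a precaution, since the help text may be written to stderr.

```shell
# Hypothetical helper: keep only the help lines that describe
# --sample, --samples or --samples-file.
sample_options() {
    grep -E -- '--samples?(-file)?[[:space:]]'
}

# Guarded so the script still runs where bcftools is not installed.
if command -v bcftools >/dev/null 2>&1; then
    # 2>&1 because the help text may be printed to stderr.
    bcftools roh --help 2>&1 | sample_options
fi
```

If this prints only a single -s, --sample line, the installed version accepts one sample at a time; if it prints -s, --samples and -S, --samples-file, lists of samples are supported.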


Again I was too slow ;)

fin swimmer


@YaGalbi I have already looked through the options of bcftools roh and read the bcftools manual pages, but I did not see any option in the bcftools roh manual for specifying several samples. Also, I tried the command with one sample name and it worked just fine. Here is the help output I get from bcftools roh:

About:   HMM model for detecting runs of autozygosity.
Usage:   bcftools roh [options] <in.vcf.gz>

General Options:
        --AF-tag <TAG>                 use TAG for allele frequency
        --AF-file <file>               read allele frequencies from file (CHR\tPOS\tREF,ALT\tAF)
    -e, --estimate-AF <file>           calculate AC,AN counts on the fly, using either all samples ("-") or samples listed in <file>
    -G, --GTs-only <float>             use GTs, ignore PLs, use <float> for PL of unseen genotypes. Safe value to use is 30 to account for GT errors.
    -I, --skip-indels                  skip indels as their genotypes are enriched for errors
    -m, --genetic-map <file>           genetic map in IMPUTE2 format, single file or mask, where string "{CHROM}" is replaced with chromosome name
    -M, --rec-rate <float>             constant recombination rate per bp
    -r, --regions <region>             restrict to comma-separated list of regions
    -R, --regions-file <file>          restrict to regions listed in a file
    -s, --sample <sample>              sample to analyze
    -t, --targets <region>             similar to -r but streams rather than index-jumps
    -T, --targets-file <file>          similar to -R but streams rather than index-jumps

HMM Options:
    -a, --hw-to-az <float>             P(AZ|HW) transition probability from AZ (autozygous) to HW (Hardy-Weinberg) state [1e-8]
    -H, --az-to-hw <float>             P(HW|AZ) transition probability from HW to AZ state [1e-7]
    -V, --viterbi-training             perform Viterbi training to estimate transition probabilities

Hello,

Check the version of your bcftools and update if necessary:

$ bcftools --version
bcftools 1.8
Using htslib 1.8
Copyright (C) 2016 Genome Research Ltd.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

fin swimmer


Thanks, I checked and found that I have bcftools 1.2.


Then you have to update to at least 1.4. That's the first version where -S, --samples-file FILE appears in the manual.
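
One way to act on this is to compare the installed version against 1.4 before reaching for -S. The sketch below is only an illustration: the function name version_at_least is made up here, it compares the numeric major and minor fields only, and it assumes that the first line of bcftools --version has the form "bcftools X.Y", as in the output shown earlier in this thread.

```shell
# Hypothetical helper: succeed if version $1 is at least version $2.
# Fields are compared as integers, so 1.10 counts as newer than 1.4.
version_at_least() {
    awk -v have="$1" -v need="$2" 'BEGIN {
        split(have, h, "."); split(need, n, ".")
        if (h[1] + 0 != n[1] + 0) exit ((h[1] + 0 > n[1] + 0) ? 0 : 1)
        exit ((h[2] + 0 >= n[2] + 0) ? 0 : 1)
    }'
}

# Guarded so the script still runs where bcftools is not installed.
if command -v bcftools >/dev/null 2>&1; then
    # Assumes the first line looks like "bcftools 1.8".
    installed="$(bcftools --version | awk 'NR == 1 { print $2 }')"
    if version_at_least "$installed" 1.4; then
        echo "bcftools $installed: -S/--samples-file should be available for roh"
    else
        echo "bcftools $installed: update to 1.4 or later to use -S with roh"
    fi
fi
```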

5.9 years ago

Hello,

It seems that bcftools roh doesn't run automatically on all samples within the VCF. You have to specify the sample names, either with -s sample_name1,sample_name2 or with -S list_with_sample_names.txt.

Have a look at the manual for further information.
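
On a version that supports these options, the two forms could look like the sketch below. Input.vcf.gz, AF.txt and -M 100 are taken from the question; samples.txt, roh_subset.txt and the helper join_samples are placeholder names chosen for this example. bcftools query -l lists the sample names stored in the VCF header, one per line, which is a convenient way to build the file for -S.

```shell
# Hypothetical helper: turn one-name-per-line input into the
# comma-separated form that -s expects.
join_samples() {
    paste -s -d, -
}

vcf=Input.vcf.gz   # file name taken from the question

# Guarded so nothing runs unless bcftools and the input file are present.
if command -v bcftools >/dev/null 2>&1 && [ -f "$vcf" ]; then
    # List the sample names stored in the VCF header, one per line.
    bcftools query -l "$vcf" > samples.txt

    # Form 1: give the file of sample names to -S.
    bcftools roh --AF-file AF.txt -M 100 -S samples.txt -o roh.txt "$vcf"

    # Form 2: give a comma-separated list to -s (here the first two samples).
    subset="$(head -n 2 samples.txt | join_samples)"
    bcftools roh --AF-file AF.txt -M 100 -s "$subset" -o roh_subset.txt "$vcf"
fi
```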

fin swimmer


I have already tried sample names separated by commas, but it did not work. I also tried running the command with the -S option, and it gave me the error roh: invalid option -- 'S'
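
Since a single -s value reportedly works on this older build while -S is rejected, one possible stopgap until bcftools is updated is to call roh once per sample. This is a sketch under that assumption and has not been tested against bcftools 1.2. The helper name roh_per_sample, the DRY_RUN switch and the roh.<sample>.txt output naming are all invented for illustration; the other arguments are copied from the command in the question.

```shell
# Hypothetical helper: run roh once per sample name read from stdin.
# With DRY_RUN=1 the commands are only printed, not executed.
roh_per_sample() {
    while IFS= read -r sample; do
        [ -n "$sample" ] || continue   # skip empty lines
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "bcftools roh --AF-file AF.txt -M 100 -s $sample Input.vcf.gz > roh.$sample.txt"
        else
            bcftools roh --AF-file AF.txt -M 100 -s "$sample" Input.vcf.gz > "roh.$sample.txt"
        fi
    done
}

# Guarded so nothing runs unless bcftools and the input file are present.
if command -v bcftools >/dev/null 2>&1 && [ -f Input.vcf.gz ]; then
    # One run per sample listed in the VCF header.
    bcftools query -l Input.vcf.gz | roh_per_sample
fi
```

Setting DRY_RUN=1 first makes it possible to inspect the generated commands before anything is executed.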
