SNP call using bcftools
2
2
Entering edit mode
9.4 years ago
blur ▴ 280

Hi,

I am trying to do SNP call using bcftools, I tried running this in samtools:

samtools mpileup -u -f CBS138.fasta 11_304.sorted.bam | bcftools view -bvcg - > 304.raw.bcf

The samtools part works fine and I get an outfile that looks OK, but the bcf parameters are wrong

(-bvcg doesn't work).

I looked this up online and everywhere I look this is the command line I see -

Can anyone tell me where I went wrong?

Thanks,

bcftools snp samtools • 19k views
ADD COMMENT
0
Entering edit mode

Which version of bcftools are you using?

ADD REPLY
0
Entering edit mode

version 1.2

ADD REPLY
0
Entering edit mode

would you be so kind as to explain what the cmd line means exactly? thank you

ADD REPLY
0
Entering edit mode

Thank you all for your help - I managed to do the first part (mpileup) but when I did index for the bcf file, I got an indexed bcf file, not a vcf file: 304.bcf.csi. No vcf file was created...

ADD REPLY
0
Entering edit mode

Ahh, yes, thanks. I edited command line!

ADD REPLY
0
Entering edit mode

Thanks! this makes more sense :)

After the last command line - the file containing the SNPs in the <out.bcf>?

Is it the same as the first bcf file?

ADD REPLY
0
Entering edit mode

You are welcome! :)

No, they are not the same. Read more here --> What's the difference between mpileup output and bcftools call ?

ADD REPLY
0
Entering edit mode

thank you! you are a saint! :)

ADD REPLY
9
Entering edit mode
9.4 years ago

If you are interested only in SNPs:

samtools mpileup --skip-indels -d 250 -m 1 -E --BCF --output-tags DP,DV,DP4,SP -f <reference genome.fa> -o <output.bcf> <list of input bam files>
bcftools index  <output.bcf> <indexed.bcf>
bcftools call --skip-variants indels --multiallelic-caller --variants-only  -O v <output.bcf> </code><code>-o </code><output.vcf>

--skip-indels (samtools) and --skip-variants indels (bcftools) - report only SNPs

-d 250 - maximum read depth to consider per position

-m 1 - minimum number of gapped reads for an INDEL candidate

-E - recalculate BAQ on the fly, ignore existing BQ tags

--BCF - output is bcf file

--output-tags DP,DV,DP4,SP:

DP (number of reads covering position), DV (number of high-quality variant reads), DP4 (number of forward reference, reverse reference, forward non-reference and reverse non-reference alleles used in variant calling) and SP (phred-scaled strand bias P-value) tags in the output file.

--multiallelic-caller - call only multi-allelic variants

--variants-only - use if you are interested to report only potential variant sites and exclude monomorphic ones (sites without alternate alleles)

Ask, if you have any questions. Hope, it helps you :)

ADD COMMENT
0
Entering edit mode

BTW - there is a small error in the cmd line: -o needs to be before the output file name

bcftools call --skip-variants indels --multiallelic-caller --variants-only  -O v <indexed.bcf>

-o <output.bcf>

ADD REPLY
0
Entering edit mode

Hmm, okey. :)

ADD REPLY
0
Entering edit mode

Is the "DV" indicator technically the number of nucleotide that support that snp?

ADD REPLY
0
Entering edit mode

Yes, technically, DV - is a number of high-quality variant reads that support each of the reported SNP.

ADD REPLY
0
Entering edit mode

I've tried with the above command,but it shows;

[warning] tag DP4 functional, but deprecated. Please switch to ADF and ADR in future. Could not parse tag "AF" in "DP,DV,DP4,SP,AF"

i also need the allele frequency in my vcf file. Thank you.

ADD REPLY
2
Entering edit mode
9.4 years ago

with 1.2 the workflow should involve "bcftools call" http://www.htslib.org/workflow/

samtools mpileup -go <study.bcf> -f <ref.fa> <sample1.bam> <sample2.bam> <sample3.bam>
bcftools call -vmO z -o <study.vcf.gz> <study.bcf>
ADD COMMENT
0
Entering edit mode

the bcf file is created by the -go parameters?

ADD REPLY

Login before adding your answer.

Traffic: 2525 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6