Question: error with vcfutils when building consensus sequence with samtools and bcftools 1.2
gravatar for cheng
5.6 years ago by
cheng20 wrote:

Hi there,

I want to build a consensus sequence for a small region of the genome from an individual in 1000genomes. Here's what I did:

to extract a part of sequence:

samtools view -b 17:7512445-7512545 -o HG00096_1.bam


use mpileup and bcftools to generate vcf file:

samtools mpileup -uf ../../1kg/human_g1k_v37.fasta HG00096_1.bam | bcftools view - -o HG00096_1_w_rg.vcf

[mpileup] 1 samples in 1 input files
<mpileup> Set max per-file depth to 8000

and convert to fastq with vcftuils vcf2fq HG00096_1_w_rg.vcf

now I have errors:

#error output
substr outside of string at /usr/local/bin/ line 557, <> line 393.
Use of uninitialized value in lc at /usr/local/bin/ line 557, <> line 393.
substr outside of string at /usr/local/bin/ line 557, <> line 393.

So I had a look at the vcf file, and found that values for the alt column were always <X>, the info column having no quality, where vcfutils treated as indels. The line of code generated the errors, line 557, looked like:

substr($$seq, $beg, $end - $beg) = lc(substr($$seq, $beg, $end - $beg));

As all rows in the vcf file were treated as indels, the length of $seq didn't increase while $beg took the values of the starting position of all input lines till end of the file. So in such case vcftuils will always crash on this error.

Is the problem with mpileup parameter choices, or si it the way vcfutils designed to behave? Is there any fix to this? It's quite possible that I was missing something very basic because I'm new to this field. Any suggestion would be helpful.


ADD COMMENTlink modified 5.6 years ago by Devon Ryan96k • written 5.6 years ago by cheng20
gravatar for Devon Ryan
5.6 years ago by
Devon Ryan96k
Freiburg, Germany
Devon Ryan96k wrote:

After running samtools mpileup, you then need to run bcftools call. It looks like the most recent samtools output gVCF, which is quite helpful in many circumstances but not always needed. After bcftools call, you can just run bcftools consensus. So:

samtools mpileup -vf ../../1kg/human_g1k_v37.fasta HG00096_1.bam | bcftools call -m -O z - > foo.vcf.gz
bcftools index foo.vcf.gz
bcftools consensus -f ../../1kg/human_g1k_v37.fasta foo.vcf.gz
ADD COMMENTlink modified 5.6 years ago • written 5.6 years ago by Devon Ryan96k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1862 users visited in the last hour