error with vcfutils when building consensus sequence with samtools and bcftools 1.2
1
2
Entering edit mode
9.2 years ago
cheng ▴ 20

Hi there,

I want to build a consensus sequence for a small region of the genome from an individual in 1000genomes. Here's what I did:

to extract a part of sequence:

samtools view -b ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/data/HG00096/alignment/HG00096.mapped.ILLUMINA.bwa.GBR.low_coverage.20120522.bam 17:7512445-7512545 -o HG00096_1.bam

use mpileup and bcftools to generate vcf file:

samtools mpileup -uf ../../1kg/human_g1k_v37.fasta HG00096_1.bam | bcftools view - -o HG00096_1_w_rg.vcf

Output:

[mpileup] 1 samples in 1 input files
<mpileup> Set max per-file depth to 8000

and convert to fastq with vcftuils

vcfutils.pl vcf2fq HG00096_1_w_rg.vcf

now I have errors:

error output:

substr outside of string at /usr/local/bin/vcfutils.pl line 557, <> line 393.
Use of uninitialized value in lc at /usr/local/bin/vcfutils.pl line 557, <> line 393.
substr outside of string at /usr/local/bin/vcfutils.pl line 557, <> line 393.

So I had a look at the vcf file, and found that values for the alt column were always <X>, the info column having no quality, where vcfutils treated as indels. The line of code generated the errors, line 557, looked like:

substr($$seq, $beg, $end - $beg) = lc(substr($$seq, $beg, $end - $beg));

As all rows in the vcf file were treated as indels, the length of $seq didn't increase while $beg took the values of the starting position of all input lines till end of the file. So in such case vcftuils will always crash on this error.

Is the problem with mpileup parameter choices, or si it the way vcfutils designed to behave? Is there any fix to this? It's quite possible that I was missing something very basic because I'm new to this field. Any suggestion would be helpful.

Thanks.

vcfutils samtools consensus-sequence mpileup • 5.7k views
ADD COMMENT
0
Entering edit mode

I have multiple vcf files that I combined with gvcf. I want to extract consensus with bcftools but it is not working.

bcftools consensus -i -s jsi5 -f persian_walnut_junglans_regia_chrs_sept18.fasta all_juglans_118_combined.vcf.gz > out.fa

The error is:

[E::hts_open_format] Failed to open file jsi5
Failed to open jsi5: No such file or directory

I highly appreciate it if you can help me.

ADD REPLY
0
Entering edit mode

Do not add answers unless you're answering the top level question. If you have a "I have this question too" post, use Add Comment instead, or open a new post if your question is sufficiently different.

You need to quote jsi5 (like so: 'jsi5') if it's a sample name.

ADD REPLY
1
Entering edit mode
9.2 years ago

After running samtools mpileup, you then need to run bcftools call. It looks like the most recent samtools output gVCF, which is quite helpful in many circumstances but not always needed. After bcftools call, you can just run bcftools consensus. So:

samtools mpileup -vf ../../1kg/human_g1k_v37.fasta HG00096_1.bam | bcftools call -m -O z - > foo.vcf.gz
bcftools index foo.vcf.gz
bcftools consensus -f ../../1kg/human_g1k_v37.fasta foo.vcf.gz
ADD COMMENT

Login before adding your answer.

Traffic: 1988 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6