Question: Bcftools stats error: Incorrect allele ("1") at ...
0
gravatar for msf
13 months ago by
msf0
Porto
msf0 wrote:

Hello,

I've been running into an error lately when I try to run bcftools stats but that I believe is probably an error produced in bcftools call.

I have a set of .bcf files generated with samtools mpileup. Original bams were mapped with BWA-mem, treated with AddOrReplaceReadGroups, MarkDuplicates and RealignerTargetCreator.

This is my call for the files; I run them inside a lop

samtools mpileup -Q 20 -q 20 -go %s_raw.bcf -f %s %s

where the second %s is a path to the reference and the third the name of the input bam.

Then I run bcftools call, again inside a loop:

cat chromosome.list | xargs -I {} -n 1 -P 20 sh -c 'bcftools call -cO z -r {} --format-fields GQ -o %s/%s.{}.vcf.gz %s'

where the call would look something like this:

cat chromosome.list | xargs -I {} -n 1 -P 20 sh -c 'bcftools call -cO z -r {} --format-fields GQ -o LER_ALM_1580_raw/LER_ALM_1580_raw.{}.vcf.gz LER_ALM_1580_raw.bcf'

then I concatenate all the chromosome files using bcftools concat.

Finally, I try to run bcftools stats, like so (example from one sample):

bcftools stats -F /scratch/3/msf/Reference/LER_ALM1580/LER_ALM1580.gatk.iteration4.consensus.FINAL.fa -s - LER_ALM_1580_raw.vcf.gz > LER_ALM_1580_raw_v2.vcf.gz.stats

And get this error:

[E::bcf_calc_ac] Incorrect allele ("1") in LER_ALM_1580 at 1:461

If I look it up with bcftools view:

bcftools view -r 1:461 LER_ALM_1580_raw.vcf.gz

This is what I see:

CHROM POS ID REF ALT QUAL FILTER INFO FORMAT LER_ALM_1580

1 461 . C . 3.53256 . DP=1;SGB=-0.379885;MQ0F=0;AF1=1;AC1=2;DP4=0,0,0,1;MQ=60;FQ=-29.9956 GT:PL:GQ 0/1:29:5

Which means that there is a discordance between the GT field and REF/ALT information.

I have gone back and tried different things with only one sample and one chromosome. I've realized that if I repeat the analysis without using format-fields, the output vcf file is different (REF/ALT and GT are concordant!):

bcftools call -cO z -r 1 -o LER_ALM_1580_raw_chr1.vcf.gz LER_ALM_1580_raw.bcf
bcftools index LER_ALM_1580_raw_chr1.vcf.gz
bcftools stats -F /scratch/3/msf/SpeciesForReference/LER_ALM1580/LER_ALM1580.gatk.iteration4.consensus.FINAL.fa -s - LER_ALM_1580_raw_chr1.vcf.gz > LER_ALM_1580_raw_chr1.vcf.gz.stats
bcftools view -r 1:461 LER_ALM_1580_raw_chr1.vcf.gz

CHROM POS ID REF ALT QUAL FILTER INFO FORMAT LER_ALM_1580 1 461 . C . 3.53256 . DP=1;SGB=-0.379885;MQ0F=0;AF1=1;AC1=2;DP4=0,0,0,1;MQ=60;FQ=-29.9956 GT:PL 0/0:29

But if I use --format-fields, then the same as before occurs.

Any idea why this happens? Any advice on how to proceed?

Thank you

ADD COMMENTlink modified 13 months ago • written 13 months ago by msf0
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 644 users visited in the last hour