Error: Bad BCF record - shared section malformed or too short after vcftools filtering
18 months ago
yvanpapa • 0

Hi everyone,

I am trying to use vcftools (0.1.16) to filter a vcf file using the following command:

F1="--max-missing 0.5 --mac 3 --minQ 30 --remove-indels"

vcftools --vcf variants_input.vcf --out variants_output.F1 $F1 --recode-INFO-all --recode-bcf


With that, I get my output "variants_output.F1.recode.bcf" and this report:

Warning: Expected at least 2 parts in INFO entry: ID=AC,Number=A,Type=Integer,Description="Allele count in genotypes for each ALT allele, in the same order as listed">
Warning: Expected at least 2 parts in INFO entry: ID=DP4,Number=4,Type=Integer,Description="Number of high-quality ref-forward , ref-reverse, alt-forward and alt-reverse bases">
Warning: Expected at least 2 parts in INFO entry: ID=DP4,Number=4,Type=Integer,Description="Number of high-quality ref-forward , ref-reverse, alt-forward and alt-reverse bases">
After filtering, kept 27 out of 27 Individuals
Outputting BCF file...
After filtering, kept 10473667 out of a possible 598905538 Sites
Run Time = 6576.00 seconds


The warnings do not look important to me, and judging by the report the filtering worked. However, when I run bcftools stats (bcftools 1.9) on the new output, I get an error saying that a BCF record is malformed, and no stats are produced:

bcftools stats variants_output.F1.recode.bcf > variants_output.F1.recode.bcf.stats.txt
[E::bcf_record_check] Bad BCF record - shared section malformed or too short


Is this caused by the warnings, or by something else? I also read elsewhere on the forum that vcftools should no longer be used, but no reason was given. Is it deprecated?

Any help would be greatly appreciated.

vcf bcf filtering SNP • 723 views

Hello yvanpapa ,

There is no active development on vcftools, so I would call it deprecated. Use bcftools instead.

Could you please show the header and the first few variants of your VCF file before filtering? bcftools is very strict about the VCF specification, so we have to make sure you have a valid VCF file.

fin swimmer


Hi finswimmer,

Thanks a lot for the fast answer.

According to https://github.com/vcftools/vcftools/issues/134 the warning is "just a warning that vcftools doesn't know how to handle the comma within the Description tag. If you remove that comma in the description, the warning will go away. Otherwise, it can generally be ignored." However, I don't find it practical in a routine pipeline to manually remove those commas every time.

I doubt the VCF is invalid, because I had no problems running bcftools stats on VCF and BCF files before using vcftools.

I guess I will perform the filtering with bcftools instead. I am pretty new to this but it seems to me it provides the same filtering tools anyway?
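
For anyone who still wants to run vcftools in a pipeline, the comma removal mentioned above does not have to be done by hand. The following is only a sketch: the function name strip_description_commas is made up for illustration, it assumes Description values contain no escaped double quotes, and whether removing the commas also fixes the BCF error reported in this thread is not confirmed here.

```shell
# Hypothetical helper: remove commas inside the quoted Description="..." value
# of ##INFO, ##FORMAT and ##FILTER header lines. Structural commas (between
# ID=, Number=, Type=) and all non-header lines pass through unchanged.
strip_description_commas() {
  awk '
    /^##(INFO|FORMAT|FILTER)=/ {
      start = index($0, "Description=\"")
      if (start > 0) {
        head = substr($0, 1, start + 12)   # text up to and including the opening quote
        rest = substr($0, start + 13)      # description text, closing quote and ">"
        if (match(rest, /"[^"]*$/)) {      # position of the last double quote
          desc = substr(rest, 1, RSTART - 1)
          tail = substr(rest, RSTART)
          gsub(/,/, "", desc)              # drop commas in the description only
          $0 = head desc tail
        }
      }
    }
    { print }
  '
}
```

It reads a VCF on standard input and writes the cleaned VCF to standard output, for example: strip_description_commas < variants_input.vcf > variants_input.nocomma.vcf (the output file name is just a placeholder).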


"just a warning that vcftools doesn't know how to handle the comma within the Description tag."

What? Commas in the Description tag are completely normal. If vcftools cannot handle them, that's a bug.

You can use bcftools for filtering nearly everything. Have a look at the several options in bcftools view.

fin swimmer
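
As a rough illustration of the suggestion above, the vcftools filter set from the question (--max-missing 0.5 --mac 3 --minQ 30 --remove-indels) could be approximated with bcftools view as sketched below. This is not a verified one-to-one translation: it assumes a bcftools version that supports F_MISSING and MAC in filtering expressions, the output file name is a placeholder, and the two tools may differ at edge cases such as multiallelic sites. The guard simply skips the run when bcftools or the input file is not present.

```shell
# Placeholder file names, modelled on the ones used in the question.
in_vcf="variants_input.vcf"
out_bcf="variants_output.F1.bcf"

# QUAL>=30        ~ --minQ 30
# F_MISSING<=0.5  ~ --max-missing 0.5 (at least half of the genotypes called)
# MAC>=3          ~ --mac 3
filter_expr='QUAL>=30 && F_MISSING<=0.5 && MAC>=3'

if command -v bcftools >/dev/null 2>&1 && [ -f "$in_vcf" ]; then
  # -V indels excludes sites with an indel ALT allele (~ --remove-indels);
  # -Ob writes compressed BCF.
  bcftools view -V indels -i "$filter_expr" -Ob -o "$out_bcf" "$in_vcf"
  bcftools stats "$out_bcf" > "$out_bcf.stats.txt"
fi
```

Comparing the number of sites kept against the vcftools report is a simple sanity check on whether the two filter sets behave the same on a given file.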