Question: Error: Bad BCF record - shared section malformed or too short after vcftools filtering
yvanpapa wrote (7 months ago):

Hi everyone,

I am trying to use vcftools (0.1.16) to filter a vcf file using the following command:

F1="--max-missing 0.5 --mac 3 --minQ 30 --remove-indels"

vcftools --vcf variants_input.vcf --out variants_output.F1 $F1 --recode-INFO-all --recode-bcf

With that, I get my output "variants_output.F1.recode.bcf" and this report:

Warning: Expected at least 2 parts in INFO entry: ID=AC,Number=A,Type=Integer,Description="Allele count in genotypes for each ALT allele, in the same order as listed">
Warning: Expected at least 2 parts in INFO entry: ID=DP4,Number=4,Type=Integer,Description="Number of high-quality ref-forward , ref-reverse, alt-forward and alt-reverse bases">
Warning: Expected at least 2 parts in INFO entry: ID=DP4,Number=4,Type=Integer,Description="Number of high-quality ref-forward , ref-reverse, alt-forward and alt-reverse bases">
After filtering, kept 27 out of 27 Individuals
Outputting BCF file...
After filtering, kept 10473667 out of a possible 598905538 Sites
Run Time = 6576.00 seconds

The warnings seem unimportant to me, and the report suggests the filtering actually worked. However, when I run bcftools stats (bcftools 1.9) on the new output, I get an error saying the BCF record is malformed, and no statistics are produced:

bcftools stats variants_output.F1.recode.bcf > variants_output.F1.recode.bcf.stats.txt
[E::bcf_record_check] Bad BCF record - shared section malformed or too short

Is this caused by the warnings, or by something else? I also read elsewhere on the forum that vcftools should not be used anymore, but no reason was given... is it deprecated?

Any help would be greatly appreciated.

Tags: snp, filtering, bcf, vcf

Hello yvanpapa,

there is no active development in vcftools, so I would call it deprecated. Use bcftools instead.

Could you please show the header and the first few variants of your vcf file before you filter? bcftools is very strict about the vcf specification. So we have to make sure you have a valid vcf file.
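One way to pull out what is asked for here, as a minimal sketch: the helper name vcf_peek is hypothetical, and it assumes the input is an uncompressed, plain-text VCF such as variants_input.vcf. It prints every header line plus the first few records and then stops, so it does not read through the whole file.

```shell
# Hypothetical helper (not part of vcftools or bcftools):
# print all header lines and the first n data records of a plain-text VCF.
vcf_peek() {
  file=$1
  n=${2:-5}   # number of data records to show, default 5
  awk -v n="$n" '
    /^#/     { print; next }          # header lines start with "#"
    seen < n { print; seen++; next }  # first n data records
             { exit }                 # stop early on large files
  ' "$file"
}

# Example call (file name taken from the question):
#   vcf_peek variants_input.vcf 5
```

With bcftools installed, `bcftools view -h` (header only) and `bcftools view -H` piped through `head` should give the same information and also work on compressed input.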

fin swimmer

written 7 months ago by finswimmer

Hi finswimmer,

Thanks a lot for the fast answer.

According to https://github.com/vcftools/vcftools/issues/134 the warning is "just a warning that vcftools doesn't know how to handle the comma within the Description tag. If you remove that comma in the description, the warning will go away. Otherwise, it can generally be ignored." However, I don't find it practical in a routine pipeline to remove those commas manually every time.

I doubt the VCF is invalid, because I had no problems running bcftools stats on VCF and BCF files before using vcftools.
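If removing the commas by hand is the obstacle, it could be scripted. The following is only a sketch: the function name strip_desc_commas is hypothetical, it assumes a plain-text VCF on standard input, and it assumes descriptions contain no escaped quotes. It rewrites only the Description="..." text of ##INFO, ##FORMAT and ##FILTER header lines and passes everything else through unchanged. Whether this also avoids the BCF error is not established here.

```shell
# Hypothetical filter: drop commas inside Description="..." in VCF header lines.
# Reads a plain-text VCF on stdin, writes the modified VCF to stdout.
strip_desc_commas() {
  awk '
    /^##(INFO|FORMAT|FILTER)=/ {
      i = index($0, "Description=\"")
      if (i > 0) {
        head = substr($0, 1, i + 12)   # up to and including the opening quote
        rest = substr($0, i + 13)      # description text onwards
        j = index(rest, "\"")          # position of the closing quote
        if (j > 0) {
          desc = substr(rest, 1, j - 1)
          tail = substr(rest, j)
          gsub(/,/, "", desc)          # remove commas in the description only
          $0 = head desc tail
        }
      }
    }
    { print }
  '
}

# Example call (output file name is an assumption):
#   strip_desc_commas < variants_input.vcf > variants_input.nocomma.vcf
```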

I guess I will perform the filtering with bcftools instead. I am pretty new to this but it seems to me it provides the same filtering tools anyway?

written 7 months ago by yvanpapa

"just a warning that vcftools doesn't know how to handle the comma within the Description tag."

What? It is perfectly normal for the Description tag to contain commas. If vcftools cannot handle that, it's a bug.

You can use bcftools for filtering nearly everything. Have a look at the several options in bcftools view.
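As a rough sketch, the vcftools filter from the question might translate to bcftools view as follows. The option mapping is an assumption and not a verified one-to-one equivalent (multiallelic sites in particular may be counted differently), the output file name is made up, and the F_MISSING expression needs a bcftools version that supports it. The commands only run if bcftools and the input file are present.

```shell
# Assumed bcftools counterparts of the vcftools options in the question:
#   --remove-indels   -> -V indels            (exclude indel sites)
#   --mac 3           -> -c 3:minor           (minor allele count >= 3)
#   --minQ 30         -> QUAL>=30
#   --max-missing 0.5 -> F_MISSING<=0.5       (at most 50% missing genotypes)
in=variants_input.vcf
out=variants_output.F1.bcf        # hypothetical output name
expr='QUAL>=30 && F_MISSING<=0.5'

if command -v bcftools >/dev/null 2>&1 && [ -f "$in" ]; then
  # -O b writes compressed BCF, -o sets the output file
  bcftools view -V indels -c 3:minor -i "$expr" -O b -o "$out" "$in"
  bcftools stats "$out" > "$out.stats.txt"
fi
```

Comparing the number of retained sites against the vcftools report (10473667 sites) would be a reasonable sanity check on the mapping.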

fin swimmer

written 7 months ago by finswimmer