Question: bcftools doesn't normalize the vcf file
seta (1.2k), Sweden, wrote 7 months ago:

Hi all,

I'm trying to normalize a VCF file with bcftools (norm -m -any), but the file is not normalized and no error is returned. In fact, the VCF file is identical before and after normalization. Could you please suggest what the problem might be?

Thanks

Tags: normalize, bcftool, vcf • 478 views
modified 7 months ago by harold.smith.tarheel (4.4k) • written 7 months ago by seta (1.2k)
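
One way to check the "same before and after" observation is to compare only the data lines, since bcftools normally adds its own header lines to the output even when no record changes. The following is a minimal sketch, assuming plain-text, tab-separated VCF files; the helper name same_records and the demo file names and records are hypothetical:

```shell
# Hypothetical helper: succeed (exit 0) when two VCFs have identical data
# lines, ignoring every header line that starts with '#'.
same_records() {
    a=$(grep -v '^#' "$1")
    b=$(grep -v '^#' "$2")
    [ "$a" = "$b" ]
}

# Made-up demo files: same record, but after.vcf carries one extra header line.
printf '##fileformat=VCFv4.2\nchr1\t100\t.\tA\tC,T\t.\t.\t.\n' > before.vcf
printf '##fileformat=VCFv4.2\n##extra_header=demo\nchr1\t100\t.\tA\tC,T\t.\t.\t.\n' > after.vcf

if same_records before.vcf after.vcf; then
    echo "records unchanged"
else
    echo "records differ"
fi
```

If the records really are unchanged, the next step is to look at whether the file contains any sites that the command would be expected to split.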

Please give us an example of your input and your desired output.

Thanks!

written 7 months ago by finswimmer (12k)

Hi, sorry for the late reply; I had no access to our cluster. Some rows look like the ones below (only the REF and ALT columns are shown here):

ref alt
A   C,T
A,G C

After normalization I expected something like the following, for example for the first variant:

A   C
A   T

But nothing changed after running bcftools norm -m -any. Could you please tell me what happened?

modified 6 months ago • written 6 months ago by seta (1.2k)
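
To see which records are candidates for splitting, one can list the rows whose ALT column contains a comma. This is only a sketch: it assumes a plain-text VCF with tab-separated columns (ALT is the fifth column), and the helper name list_multiallelic and the demo records are made up for illustration:

```shell
# Hypothetical helper: print CHROM, POS, REF, ALT for every data line whose
# ALT column (field 5) contains a comma. Header lines are skipped.
list_multiallelic() {
    awk -F'\t' '!/^#/ && $5 ~ /,/ { print $1 "\t" $2 "\t" $4 "\t" $5 }' "$1"
}

# Made-up demo file: one multiallelic record and one biallelic record.
printf '##fileformat=VCFv4.2\n' > demo.vcf
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> demo.vcf
printf 'chr1\t100\t.\tA\tC,T\t.\t.\t.\n' >> demo.vcf
printf 'chr1\t200\t.\tG\tC\t.\t.\t.\n' >> demo.vcf

list_multiallelic demo.vcf
```

Running the same helper on the input and on the output of the bcftools command shows whether any comma-separated ALT values remain.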

Hello again,

What you are trying to do is not called "normalizing". You want to split multiallelic sites.

What version of bcftools are you using? For me it works.

input.vcf:

##fileformat=VCFv4.2
##FILTER=<ID=PASS,Description="All filters passed">
##contig=<ID=chr1,length=249250621>
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  Sample1
chr1    977330  .   T   C,G 225 PASS    .   GT  1/2

Command used:

$ bcftools norm -m -any input.vcf

Output:

##fileformat=VCFv4.2
##FILTER=<ID=PASS,Description="All filters passed">
##contig=<ID=chr1,length=249250621>
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##bcftools_normVersion=1.9+htslib-1.9
##bcftools_normCommand=norm -m -any norm.vcf; Date=Fri Jan 25 13:01:30 2019
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  Sample1
chr1    977330  .   T   C   225 PASS    .   GT  1/0
chr1    977330  .   T   G   225 PASS    .   GT  0/1
Lines   total/split/realigned/skipped:  1/1/0/0

I'm using bcftools v1.9.

fin swimmer

written 6 months ago by finswimmer (12k)
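
The idea behind splitting a multiallelic site can be sketched at the REF/ALT level with awk. This is not how bcftools is implemented and it is not a replacement for it: the sketch only prints CHROM, POS, REF and one ALT allele per row, and it does not rewrite genotypes or INFO fields the way the real command does. The helper name split_alt_sketch and the file name site.vcf are hypothetical; the record is the one from the example above:

```shell
# Hypothetical helper: emit one row per ALT allele for each data line.
# Genotypes and INFO are deliberately ignored, unlike bcftools norm.
split_alt_sketch() {
    awk -F'\t' '!/^#/ {
        n = split($5, alleles, ",")
        for (i = 1; i <= n; i++)
            print $1 "\t" $2 "\t" $4 "\t" alleles[i]
    }' "$1"
}

# The multiallelic record from the example input, written with real tabs.
printf 'chr1\t977330\t.\tT\tC,G\t225\tPASS\t.\tGT\t1/2\n' > site.vcf

split_alt_sketch site.vcf
```

The sketch depends on the columns being tab-separated; a file whose columns are separated by spaces would not be parsed as intended.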

Thank you for your reply. Oh yes, splitting multiallelic sites. I'll check the version.

written 6 months ago by seta (1.2k)

Hi again. I'm using bcftools v1.9, too. So what do you think the problem is?

written 6 months ago by seta (1.2k)

Does my example work? If so, please post an example of your data for testing.

written 6 months ago by finswimmer (12k)

I currently have no access to the cluster. I'll try your example and get back to you.

written 6 months ago by seta (1.2k)
harold.smith.tarheel (4.4k), United States, wrote 7 months ago:

From the manual:

normalization will only be applied if the --fasta-ref option is supplied.

written 7 months ago by harold.smith.tarheel (4.4k)

Oh, I knew that applies to normalizing indels. Is it also true for splitting multiallelic SNPs?

modified 7 months ago • written 7 months ago by seta (1.2k)