how to merge gvcf using bcftools
0
1
Entering edit mode
5.1 years ago
evelyn ▴ 230

Hello Everyone,

I created gvcf files using bcftools:

bcftools mpileup -Ov --gvcf 0 -f ref.fa example.sorted.bam | bcftools call -m --gvcf 0 -o example.vcf

I excluded variants other than SNPs using:

bcftools view --exclude-types indels,mnps,bnd,other example.vcf -o example_1.vcf

Then I want to merge gvcf files using bcftools:

bcftools merge --file-list sample.txt -g ref.fa -O v -o merge.vcf

But it is not working. I am getting an error like Failed to merge alleles. I am not sure if I am using correct commands. Thank you so much for your help and time!

snp • 6.0k views
ADD COMMENT
0
Entering edit mode

Failed to merge alleles.

Is this the whole error message? If not, please post also the rest.

Thanks a lot.

fin swimmer

ADD REPLY
0
Entering edit mode

Hello,

I got:

The REF prefixes differ: AGT vs AGAAAGAATGAAAGAATG (3,18)
Failed to merge alleles at ch01:8381 in example_2.vcf.gz

I checked the block with this position in individual gvcf file:

ch01    8277    .   TGT .   .   .   END=8430;MinDP=67   GT:DP   0/0:67

In IGV, this position has the following information,

Total count: 103
A: 96 (93%, 44+, 52- )
C: 0
G: 0
T: 7 (7%, 1+, 6- )
N: 0
ADD REPLY
0
Entering edit mode

Hello again,

could you please try to run bcftools norm on all your individual gvcf files before merging?

bcftools norm -f genome.fa input.gvcf > output.gvcf

You could also run a check if the REF alleles are set correct:

bcftools norm -f genome.fa -c e input.gvcf

fin swimmer

ADD REPLY
0
Entering edit mode

Hello, Thank you! I used your suggestion and it got merged without error. However, I am still trying to understand the call symbols for each sample after merging. Here is a snippet of the merged file:

#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  sample_1    sample_2    sample_3
ch01    778 .   A   .   .   .   END=794;MinDP=0;AN=0    GT:DP   ./.:.   ./.:0   ./.:0
ch01    3358    .   C   .   .   .   END=3434;MinDP=0;AN=2   GT:DP   ./.:0   ./.:0   0/0:1
ch01    3435    .   A   G   225 . VDB=0.42042;SGB=-0.692831;MQSB=1;MQ0F=0;MQ=60;RPB=0.4;MQB=1;BQB=0.6;ICB=1;HOB=0.5;DP=36;DP4=0,2,11,18;MinDP=0;AN=4;AC=3   GT:DP:PL    ./.:0:. 1/1:24:255,72,0 0/1:7:183,0,48
ch01    3636    .   A   .   .   .   END=3651;MinDP=0;AN=4   GT:DP   ./.:0   0/0:12  0/0:1

However, I am not able to interpret the symbols here for each sample call. For example, at position 778, for sample_1, there are no reads in IGV so it gave ./.:. but for sample_2, there are reads and calls same as the reference and it gives ./.:0. I could not find any document to understand these symbols so that I can make a better sense of the merged file.

Additionally, for POS 3435, sample_1 has 38 reads same as reference but it calls ./.:0:. I will appreciate your help. Thank you!

ADD REPLY
0
Entering edit mode

Hello evelyn ,

could you please post the vcf lines of the individual gvcf's overlapping the pos 778?

Thanks.

fin swimmer

ADD REPLY
0
Entering edit mode

Hello finswimmer,

Here are the individual lines including POS 778:

ch01    954 .   A   .   .   .   END=3254;MinDP=0    GT:DP   ./.:0
ch01    778 .   A   .   .   .   END=830;MinDP=0 GT:DP   ./.:0
ch01    778 .   A   .   .   .   END=2330;MinDP=0    GT:DP   ./.:0
ADD REPLY

Login before adding your answer.

Traffic: 831 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6