Very few SNP and indel variants were identified by vg call when the graph was built from a PAV-only VCF
10 months ago
Wanglh • 0

Hi all,

We want to find SNP and indel variants in the result VCF file BS_graph_call.vcf, which was produced with the pangenome analysis software vg.

Fewer than 20 SNP and indel lines were found in the result VCF file BS_graph_call.vcf. But when we called SNPs and indels from the reference Ah.genome.fa and the query BS.genome.fa using MUMmer, approximately 500,000 highly reliable variant sites were identified.
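
For reference, here is a minimal sketch of how SNP and indel lines in a VCF such as BS_graph_call.vcf could be counted. It is not necessarily the exact method behind the number above. The function name count_snps_indels is hypothetical, and the 50 bp cutoff separating indels from larger structural variants is an assumed convention that can be changed with the second argument. It expects an uncompressed VCF.

```shell
# Hypothetical helper: count SNP and indel records in an uncompressed VCF.
# $1 = VCF file, $2 = assumed maximum indel length difference in bp (default 50).
count_snps_indels() {
    awk -F'\t' -v max="${2:-50}" '
        /^#/ { next }                          # skip header lines
        {
            n = split($5, alts, ",")           # ALT may list several alleles
            has_snp = 0; has_indel = 0
            for (i = 1; i <= n; i++) {
                a = alts[i]
                # ignore symbolic or missing alleles such as <DEL>, * or .
                if (a !~ /^[ACGTNacgtn]+$/) continue
                d = length($4) - length(a)     # REF length minus ALT length
                if (d < 0) d = -d
                if (d == 0 && length(a) == 1) has_snp = 1
                else if (d > 0 && d < max) has_indel = 1
            }
            # a multi-allelic record can count as both a SNP and an indel
            if (has_snp) snp++
            if (has_indel) indel++
        }
        END { printf "SNP records: %d\nindel records: %d\n", snp, indel }
    ' "$1"
}
```

It would be called as `count_snps_indels BS_graph_call.vcf`. Records whose length difference reaches the cutoff are counted as neither SNP nor indel, so PAV-sized events are excluded from both totals.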

We want to understand why only very few SNPs and indels were identified in the final result VCF file BS_graph_call.vcf produced by vg.

Only PAVs (presence-absence variations) were saved in the input file PAV.vcf.gz.

The detailed command lines and the vg version are shown below:

vg autoindex --workflow giraffe -r Ah.genome.fa -v PAV.vcf.gz -p Ah -t 100
vg giraffe -Z Ah.giraffe.gbz -m Ah.min -d Ah.dist -f BS_1.fq.gz -f BS_2.fq.gz -t 4 > BS.giraffe.gam
vg pack -x Ah.giraffe.gbz -g BS.giraffe.gam -Q 5 -o BS.pack -t 4
vg call -t 4 Ah.giraffe.gbz -k BS.pack > BS_graph_call.vcf
vg version v1.48.0 "Gallipoli"
Compiled with g++ (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0 on Linux
Linked against libstd++ 20210601
Built by ubuntu@ip-172-31-9-38
vg • 478 views