Question about interpreting the VCF output of vg call
6 months ago
minj.9411 • 0

Thank you so much for creating such an amazing tool.

I am having difficulty interpreting the VCF output of vg call.

I performed a haplotype-resolved genome assembly and built a graph genome from both haplotypes (HA and HB) using pggb and vg.

Afterwards, for a resequencing analysis, I produced a variant-call VCF with vg pack and vg call using short-read data from the corresponding genome and accession.

I now have the variant-call VCF, but I have some questions.

  1. My VCF contains records for all sequences in the graph (e.g. both chr1A and chr1B). Does this indicate that the variants were called properly against the graph genome?

  2. If the answer to question 1 is yes, the "POS" column appears to be a position in the FASTA sequence itself rather than a segment position in the graph genome. Is there a way to find out where each variant lies in the graph genome?

Here is my VCF result:

chr1A 29983944 <572060<572057 T C 3514.64 PASS AT=<572060<572059<572057,<572060<572058<572057;DP=336 GT:DP:AD:GL:GQ:GP:XD:MAD 1/0:336:110,226:-385.357322,-34.369327,-154.018979:256:-1.098612:550.007812:110
chr1A 29985480 <572010<572007 T C 14728.3 PASS AT=<572010<572009<572007,<572010<572008<572007;DP=982 GT:DP:AD:GL:GQ:GP:XD:MAD 1/1:982:122,860:-1565.198222,-143.418136,-92.839497:256:-1.098612:948.102356:860
chr1A 29986053 <571990<571987 T G 30064.7 PASS AT=<571990<571989<571987,<571990<571988<571987;DP=1348 GT:DP:AD:GL:GQ:GP:XD:MAD 1/1:1348:20,1328:-3018.920121,-371.071811,-12.919387:256:-1.098612:1522.723877:1328
chr1A 29986703 <571967<571964 A G 38343.2 PASS AT=<571967<571965<571964,<571967<571966<571964;DP=1715 GT:DP:AD:GL:GQ:GP:XD:MAD 1/1:1715:23,1692:-3910.300659,-538.132709,-76.451191:256:-1.098612:2555.922119:1692
chr1A 29987956 <571925<571923 CTCACAAG C 11280.7 PASS AT=<571925<571924<571923,<571925<571923;DP=569 GT:DP:AD:GL:GQ:GP:XD:MAD 1/1:569:1,568:-1213.779980,-254.002113,-86.186735:256:-1.098612:1162.528198:568
chr1B 11418953 >1170327>1170329 CT C 9.70239 lowad AT=>1170327>1170328>1170329,>1170327>1170329;DP=1 GT:DP:AD:GL:GQ:GP:XD:MAD 0/1:1:1,0:-1.058308,-0.737841,-2.735669:3:-1.496158:0.961039:0
chr1B 11428595 >1171654>1171657 T C 9.54243 lowad AT=>1171654>1171656>1171657,>1171654>1171655>1171657;DP=0 GT:DP:AD:GL:GQ:GP:XD:MAD 1/1:0:0,0:-0.881267,-0.881267,-0.881267:0:-2.197225:2.029187:0
chr1B 11428597 >1171657>1171660 T C 9.54243 lowad AT=>1171657>1171659>1171660,>1171657>1171658>1171660;DP=0 GT:DP:AD:GL:GQ:GP:XD:MAD 1/1:0:0,0:-0.822876,-0.822876,-0.822876:0:-2.197225:1.894737:0
chr1B 11428806 >1171870>1171875 CAAATATACAATATGAAAAGTGAAAGTAAATATAAAATGTGGAACTTTCACTACCCATCTCCAAACATAATTATTG C 9.54243 lowad AT=>1171870>1171871>1171872>1171873>1171874>1171875,>1171870>1171875;DP=0 GT:DP:AD:GL:GQ:GP:XD:MAD 1/1:0:0,0:-36.745333,-36.745333,-36.745333:0:-3.218876:84.609070:0
...
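On question 2, the records above already seem to carry graph coordinates next to the linear ones. As far as I understand vg call output, the ID column names the snarl (variant site) by its two boundary nodes, and the AT INFO field lists one walk through the graph per allele, where `>` means a node is traversed forward and `<` means reverse. The Python sketch below pulls those node IDs out of a record. The function names are made up for illustration, and the assumption that AT walks are listed in allele order (reference first) should be checked against your vg version.

```python
import re

# One oriented node step, e.g. "<572060" (reverse) or ">1170327" (forward).
STEP = re.compile(r"([<>])(\d+)")


def parse_steps(walk):
    """Split a walk such as '<572060<572059<572057' into (node_id, orientation) pairs."""
    return [
        (int(node), "forward" if sign == ">" else "reverse")
        for sign, node in STEP.findall(walk)
    ]


def graph_location(vcf_line):
    """Collect the linear and graph coordinates of one vg call record."""
    # VCF is tab-separated; split() also copes with the spaces of a pasted record.
    fields = vcf_line.split()
    chrom, pos, snarl_id = fields[0], int(fields[1]), fields[2]
    # INFO is a ';'-separated list of KEY=VALUE items.
    info = dict(item.split("=", 1) for item in fields[7].split(";") if "=" in item)
    # Assumption: AT holds one walk per allele, reference allele first.
    traversals = [parse_steps(w) for w in info.get("AT", "").split(",") if w]
    return {
        "path": chrom,                      # reference path the record is reported on
        "pos": pos,                         # 1-based position along that path
        "snarl": parse_steps(snarl_id),     # boundary nodes of the site
        "allele_traversals": traversals,    # node walk of each allele
    }


record = (
    "chr1A 29983944 <572060<572057 T C 3514.64 PASS "
    "AT=<572060<572059<572057,<572060<572058<572057;DP=336 "
    "GT:DP:AD:GL:GQ:GP:XD:MAD "
    "1/0:336:110,226:-385.357322,-34.369327,-154.018979:256:-1.098612:550.007812:110"
)
loc = graph_location(record)
print(loc["snarl"])
for allele_index, walk in enumerate(loc["allele_traversals"]):
    print(allele_index, walk)
```

For this record the two walks differ only in their middle node (572059 versus 572058), so under the assumption above those would be the nodes carrying the T and C alleles.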

Additionally, if there is any documentation that would help me understand the VCF output of vg call, I would appreciate it if you could point me to it.

Regards,

MJ

vg • 470 views
6 months ago

Most users want the VCF to be expressed against a reference sequence (e.g. GRCh38, T2T-CHM13) rather than to the haplotypes. Assuming that's true, you can get reference-based calls by specifying which of the samples is the reference with -S or by listing the reference paths with -p in vg call.
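As a rough illustration of the -p route, here is a small Python sketch that only assembles the command line and does not run vg. The file names `graph.xg` and `sample.pack` are placeholders, and the `-k` option for the pack file is taken from the usage shown in the vg README, so please confirm both against `vg call --help` for your version.

```python
import shlex


def build_vg_call_command(graph, pack, ref_paths):
    """Assemble a vg call invocation restricted to the given reference paths."""
    cmd = ["vg", "call", graph, "-k", pack]
    # One -p per reference path; records are then reported on these paths only.
    for path in ref_paths:
        cmd += ["-p", path]
    return cmd


# Hypothetical inputs: report calls against haplotype A of chromosome 1 only.
cmd = build_vg_call_command("graph.xg", "sample.pack", ["chr1A"])
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, stdout=...)` to write the VCF. Listing only the HA paths this way should keep chr1B from appearing as a separate CHROM in the output.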
