Low recall for insertions
5 months ago
275237370 • 0

Dear developers,

I am measuring the SV-calling performance of vg using nanopore data (80× coverage) and Illumina Platinum Genome data (50× coverage) in rice. Insertion recall with vg turned out to be very low, even though the sample test data from vg gives a high score. I would be grateful if you could tell me the reason for the low recall.

I used the rice reference file and PAV information (from sample1) to construct the graph for vg.

Is there any way to improve it?

Could you look over the commands below?

Versions used: vg (variation graph tool) v1.25.0 "Apice" and toil-vg 1.6.2a1.

for i in $(seq 1 12);do vg construct -r ref.fa -v Chr$i.vcf.gz -S -R Chr$i -C -p -f -a -t 48 > Chr$i.vg;done
vg ids -j $(for i in $(seq 1 12); do echo Chr$i.vg; done)
vg index -t 48 -x all.xg $(for i in $(seq 1 12); do echo Chr$i.vg ;done) 
for i in $(seq 1 12); do vg prune -r Chr$i.vg -t 48 > Chr$i.pruned.vg; done
vg index -g all.gcsa $(for i in $(seq 1 12); do echo Chr$i.pruned.vg; done)
vg map -x all.xg -g all.gcsa -f sample1_1.fastq.gz -f sample1_2.fastq.gz >sample1.aln.gam
vg pack -x all.xg -g sample1.aln.gam  -Q 20 -t 48 -o sample1.pack
vg call all.xg -k sample1.pack -s sample1 -t 48 > sample1.vcf
toil-vg vcfeval ./jobStore . --vcfeval_baseline truth.vcf.gz --call_vcf sample1.vcf.gz --sveval --vcfeval_sample sample1 --realTimeLogging --realTimeStderr --min_sv_len 50 --ins_max_gap 1000

  Result

Recall_INS=0.5085
Recall_DEL=0.9496
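
As a sanity check alongside the recall figures above, a rough count of insertions and deletions in the truth and call VCFs can show whether insertions are absent from the calls altogether or are present but not matched during evaluation. The following is only a minimal sketch: `count_sv_types` is a hypothetical helper (not part of vg or toil-vg), it assumes an uncompressed VCF with explicit REF/ALT sequences, and it classifies alleles purely by length difference, which is cruder than what sveval does.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count insertions and deletions of at least MIN_LEN bp
# in an uncompressed VCF by comparing each ALT allele's length with REF.
# Usage: count_sv_types FILE.vcf [MIN_LEN]   (MIN_LEN defaults to 50)
count_sv_types() {
    local vcf="$1" min_len="${2:-50}"
    awk -v min="$min_len" '
        BEGIN { FS = "\t"; ins = 0; del = 0 }
        /^#/ { next }                          # skip header lines
        {
            n = split($5, alts, ",")           # handle multi-allelic records
            for (k = 1; k <= n; k++) {
                # symbolic (<INS>, <DEL>) and spanning (*) alleles carry no sequence
                if (alts[k] ~ /^</ || alts[k] == "*") continue
                d = length(alts[k]) - length($4)
                if (d >= min) ins++
                else if (-d >= min) del++
            }
        }
        END { printf "INS=%d DEL=%d\n", ins, del }
    ' "$vcf"
}
```

For example, `count_sv_types sample1.vcf 50` could be compared with `count_sv_types <(zcat truth.vcf.gz) 50`. If the call set contains far fewer insertions than the truth set, the loss is happening before evaluation (graph construction, mapping, or calling) rather than in the matching step.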

Thanks for your help.
Best wishes,

vg • 290 views