vg paired end short read mapping: Best practice?
20 months ago

I'm trying to map short reads to a genome graph constructed from multiple whole-genome alignments of A. thaliana. What is the best-practice method for producing the alignments in a reasonable time frame? I've been mapping a subset of just 5000 read pairs to a single-chromosome graph: vg map takes >24 h to complete and uses enormous resources (up to 500 GB of RAM on a single core). Mapping a full sequencing run to the full genome graph took over 7 days to complete. Any help would be appreciated. Thanks!

mapping short read genome graph vg • 716 views

Speed is still an issue with vg map. If your graph is complicated, mapping read pairs with vg map can be intractable. The reads can often still be mapped as single-ended reads, although at the expense of mapping rate. The alignment algorithm can also be swapped for a faster (but less accurate) one with --xdrop-alignment.
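As a rough sketch of the two fallbacks above (single-ended mapping and --xdrop-alignment), here is a small Python helper that assembles the vg map command line. The file names (chr1.xg, chr1.gcsa, reads_1.fq, reads_2.fq) and the helper name are placeholders I made up for illustration; check the flags against vg map --help for your vg version before running anything.

```python
import shlex

def vg_map_command(xg, gcsa, fastqs, xdrop=False, threads=1):
    """Assemble a `vg map` argument list (hypothetical helper, not part of vg).

    One FASTQ gives single-ended mapping; two FASTQs (one per mate) give
    paired-end mapping. vg map writes alignments (GAM) to stdout.
    """
    cmd = ["vg", "map", "-x", xg, "-g", gcsa, "-t", str(threads)]
    for fq in fastqs:
        cmd += ["-f", fq]
    if xdrop:
        # Faster but less accurate alignment, as described above.
        cmd.append("--xdrop-alignment")
    return cmd

# Placeholder index and read file names; substitute your own.
paired = vg_map_command("chr1.xg", "chr1.gcsa", ["reads_1.fq", "reads_2.fq"])
single = vg_map_command("chr1.xg", "chr1.gcsa", ["reads_1.fq"], xdrop=True)

# Print the commands for review; redirect stdout to a .gam file when running.
print(shlex.join(paired))
print(shlex.join(single))
```

To map both mates as single-ended reads you would run the single-ended command once per FASTQ file.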

These days you might have better luck with the vg giraffe mapping tool instead of vg map. vg giraffe tends to be much faster (closer to the speed of bwa mem). However, some of the indexes in vg giraffe can be difficult to build on very complicated graphs.
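For comparison, a paired-end vg giraffe call could be assembled along these lines. This is only a sketch: the index file names (graph.gbz, graph.min, graph.dist) and the helper name are placeholders, and the index flags have changed between vg releases, so confirm them with vg giraffe --help for the version you have installed.

```python
import shlex

def vg_giraffe_command(gbz, minimizer, dist, fastqs, threads=1):
    """Assemble a `vg giraffe` argument list (hypothetical helper, not part of vg).

    Assumes a GBZ graph (-Z), a minimizer index (-m) and a distance index (-d)
    have already been built; building them is the step that can be hard on
    very complicated graphs.
    """
    cmd = ["vg", "giraffe", "-Z", gbz, "-m", minimizer, "-d", dist,
           "-t", str(threads)]
    for fq in fastqs:  # two FASTQs, one per mate, for paired-end reads
        cmd += ["-f", fq]
    return cmd

# Placeholder file names; substitute your own indexes and reads.
giraffe = vg_giraffe_command(
    "graph.gbz", "graph.min", "graph.dist",
    ["reads_1.fq", "reads_2.fq"], threads=10,
)

# Print the command for review; redirect stdout to a .gam file when running.
print(shlex.join(giraffe))
```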

20 months ago
xwwang ▴ 20

In my experience, you can use multiple threads to speed things up via the -t option: -t $nthread, where $nthread is the number of threads, e.g. 10.

This makes mapping much faster. It took me one day to map 200 million paired-end reads to a large graph.
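To put the numbers in this thread side by side, here is a quick back-of-the-envelope calculation. It uses only the figures quoted above (200 million pairs in one day here, 5000 pairs in more than 24 hours in the question); the graphs, hardware and thread counts differ, so treat it as an order-of-magnitude comparison only.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figure quoted in this answer: 200 million read pairs in about one day.
reported_pairs = 200_000_000
reported_rate = reported_pairs / SECONDS_PER_DAY

# Figure quoted in the question: 5000 pairs took more than 24 hours,
# so this is an upper bound on the asker's throughput.
asker_pairs = 5_000
asker_rate_upper = asker_pairs / SECONDS_PER_DAY

ratio = reported_rate / asker_rate_upper

print(f"reported: about {reported_rate:.0f} pairs/s")
print(f"question: at most {asker_rate_upper:.3f} pairs/s")
print(f"ratio: at least {ratio:,.0f}x")
```

A gap of that size is probably more than extra threads alone would close, so the complexity of the graph is worth checking as well.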