Combining GCSA indexes for vg
4 months ago
Nicola ▴ 10

Hi,

I am working on combining multiple VG graphs into one big graph, but I am already running out of memory during GCSA indexing, even with my small graphs. Is there a way to combine GCSA indexes, just as VG graphs can be combined?

VG vgteam gcsa • 388 views
3 months ago

Have you used vg prune to simplify the graph before GCSA indexing? That is usually enough to keep memory consumption in check. Instructions for pruning can be found here: https://github.com/vgteam/vg/wiki/Index-Construction#complex-graph
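A minimal sketch of that prune-then-index workflow, written as a small Python helper that only assembles the command lines. The file names (`graph.vg`, `graph.pruned.vg`, `graph.gcsa`) are placeholders, and the flags shown (`vg prune -r` to restore embedded paths, `vg index -g` to write the GCSA index) reflect my understanding of the wiki page above; please check them against `vg prune --help` and `vg index --help` for your vg version.

```python
import shlex


def prune_cmd(graph, restore_paths=True):
    """Build the argv for `vg prune`; the pruned graph is written to stdout."""
    cmd = ["vg", "prune"]
    if restore_paths:
        # -r: restore the edges on embedded (reference) paths after pruning
        cmd.append("-r")
    cmd.append(graph)
    return cmd


def gcsa_index_cmd(pruned_graph, gcsa_out):
    """Build the argv for GCSA indexing of an already pruned graph."""
    return ["vg", "index", "-g", gcsa_out, pruned_graph]


if __name__ == "__main__":
    # Placeholder file names; nothing is executed, the commands are only printed.
    print(shlex.join(prune_cmd("graph.vg")) + " > graph.pruned.vg")
    print(shlex.join(gcsa_index_cmd("graph.pruned.vg", "graph.gcsa")))
```

Note that the pruned graph is only used for building the GCSA index; mapping still runs against the original, unpruned graph.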

0

Yes, I have used vg prune to simplify the graph, but I reached a point where I had to prune so aggressively that it seriously affected my mapping results. That's why I was wondering whether there is a way to combine GCSA indexes.

0

As far as I know, there isn't an easily accessible algorithm to merge GCSA2 indexes.

I'm surprised that pruning created such a problem for you, though. In my experience from working on the vg mapping algorithms, it is extremely rare for pruning to cause mapping errors. Could you share some details about how this graph was constructed? Also, what pruning parameters were you using?

0

I actually managed to work around the problem by using vg giraffe. :-)

But it would be interesting to hear some more experienced thoughts on the construction of my graph. At the moment I am building multiple graphs based on complete mitochondrial genomes (length: 15,000-20,000 bp) of different species belonging to one phylogenetic family. The individual graphs are rather small, and I ended up combining them with vg combine. I pruned the combined graph (5 graphs combined) with k = 16 and e = 5.
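For reference, the combine-and-prune steps described above would look roughly like the sketch below. It only assembles and prints the command lines. The per-species file names are hypothetical, and I am assuming that k and e correspond to the `-k` (k-mer length) and `-e` (maximum edges) options of `vg prune`.

```python
import shlex


def combine_cmd(graphs):
    """Build the argv for `vg combine`; the merged graph is written to stdout."""
    if not graphs:
        raise ValueError("need at least one input graph")
    return ["vg", "combine", *graphs]


def prune_cmd(graph, kmer_length, edge_max):
    """Build the argv for `vg prune` with explicit k-mer length and edge limit."""
    return ["vg", "prune", "-k", str(kmer_length), "-e", str(edge_max), graph]


if __name__ == "__main__":
    # Hypothetical inputs: one graph per species, five in total.
    species_graphs = [f"species{i}.vg" for i in range(1, 6)]
    print(shlex.join(combine_cmd(species_graphs)) + " > combined.vg")
    print(shlex.join(prune_cmd("combined.vg", 16, 5)) + " > combined.pruned.vg")
```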

Do you have any tips for constructing small but diverse graphs?