I am working on combining multiple VG graphs into one big graph, but I am already running into memory issues with my small graphs when using GCSA indexing. Is there a way to combine GCSA indexes, just like combining VG graphs?
Have you used vg prune to simplify the graph before doing GCSA indexing? That is usually enough to restrain the memory consumption. Instructions for pruning can be found here: https://github.com/vgteam/vg/wiki/Index-Construction#complex-graph
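As a rough sketch of that prune-then-index workflow, the snippet below only assembles the two command lines and prints them, without running vg. The file names are placeholders, the helper `prune_and_index_commands` is hypothetical, and the flags (`-r`, `-k`, `-e` for `vg prune`, `-g` for `vg index`) reflect my understanding of the tools, so please check them against `vg prune --help` and the wiki page for your vg version.

```python
import shlex


def prune_and_index_commands(graph="graph.vg",
                             pruned="graph.pruned.vg",
                             gcsa="graph.gcsa",
                             kmer_length=None,
                             edge_max=None):
    """Build (but do not run) the shell commands for pruning and GCSA indexing.

    All file names are placeholders. The flags are assumptions to verify
    against `vg prune --help` / `vg index --help` for your installed version.
    """
    # -r: restore edges on embedded (non-alt) paths after pruning (assumed flag).
    prune = ["vg", "prune", "-r"]
    if kmer_length is not None:
        # -k: k-mer length used when looking for complex regions (assumed flag).
        prune += ["-k", str(kmer_length)]
    if edge_max is not None:
        # -e: maximum number of edges allowed in a k-mer walk (assumed flag).
        prune += ["-e", str(edge_max)]
    prune.append(graph)

    # -g: write a GCSA index built from the pruned graph (assumed flag).
    index = ["vg", "index", "-g", gcsa, pruned]

    prune_cmd = " ".join(shlex.quote(a) for a in prune) + " > " + shlex.quote(pruned)
    index_cmd = " ".join(shlex.quote(a) for a in index)
    return [prune_cmd, index_cmd]


for cmd in prune_and_index_commands():
    print(cmd)
```

Passing `kmer_length` and `edge_max` adds the corresponding options to the prune command; leaving them as `None` keeps vg's defaults.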
Yes, I have used vg prune to simplify the graph, but I reached a point where I had to prune so much that it seriously affected my mapping results. That's why I was wondering if there is a way to combine GCSA indexes.
As far as I know, there isn't an easily accessible algorithm to merge GCSA2 indexes.
I'm surprised that pruning created such a problem for you though. In my experience from working on the vg mapping algorithms, it is extremely rare that pruning causes mapping errors. Could you share some details about how this graph was constructed? Also, what pruning parameters were you using?
I actually managed to work around the problem by using vg giraffe. :-)
But it would be interesting to hear some more experienced thoughts on the construction of my graph. At the moment I am building multiple different graphs based on complete mitochondrial genomes (length: 15,000-20,000 bp) of different species belonging to one phylogenetic family. The individual graphs are rather small, and I ended up combining them with vg combine. I pruned the combined graph (5 graphs combined) with k = 16 and e = 5.
Do you have any tips for constructing small but diverse graphs?