vg deconstruct with path sizes
3.8 years ago
egoltsman ▴ 10

Hi, I am wondering if there is a way to output snarls with path size information. Currently, if I run 'vg snarls' and then 'vg deconstruct', the VCF file contains only the variant sequences, so I have to parse those out and compute the string length of each one, which is not very efficient when you throw a whole pangenome at it. If this information is already available internally during snarl calling, is there a way to extract or output it?

Thanks!

vg • 1.1k views
3.8 years ago
glenn.hickey ▴ 520

If I understand correctly, you want the length of each allele stored in some kind of VCF Format field? I suppose this is possible, but as far as I know, most VCF parsers would be parsing the alleles into strings in memory anyway, which would allow you to get the size just as efficiently.
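To make the point about parsing concrete, here is a minimal sketch of getting allele lengths straight from the REF and ALT columns while streaming a VCF, using only the Python standard library. The function names and the sample record are made up for illustration; the only assumption about the 'vg deconstruct' output is that it is a regular tab-separated VCF.

```python
def allele_lengths(vcf_line):
    """Return (chrom, pos, id, [len(REF), len(ALT1), ...]) for one VCF data line."""
    fields = vcf_line.rstrip("\n").split("\t")
    chrom, pos, var_id, ref, alt = fields[:5]
    # ALT is a comma-separated list; "." means no alternate allele.
    alleles = [ref] + ([] if alt == "." else alt.split(","))
    return chrom, int(pos), var_id, [len(a) for a in alleles]


def iter_allele_lengths(lines):
    """Yield allele lengths for every data line, skipping headers and blank lines."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        yield allele_lengths(line)


# Made-up example input (not real vg output), just to show the call pattern.
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "x\t5\t>1>4\tA\tACGT,AT\t60\t.\t.\n",
]
for chrom, pos, var_id, lengths in iter_allele_lengths(example):
    print(chrom, pos, var_id, ",".join(map(str, lengths)))
```

Because this works one line at a time, the same generator can be fed an open file handle for a whole-pangenome VCF without holding the file in memory.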

As mentioned on GitHub, there should soon be an interface to get snarl traversals using a variety of algorithms (including the one used in deconstruct -e) in GAF format. Hopefully that will be more efficient for you to parse.
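For a rough idea of what parsing such output could look like, here is a sketch that reads one GAF record using the column layout from the GAF format description: column 6 is the path as oriented steps such as '>1>3<4', and column 7 is the path length. How vg would fill these columns for snarl traversals is not stated in this thread, so treat the meaning of the length column as an assumption to check against real output; the function name and the sample line are hypothetical.

```python
import re

# One oriented step: '>' (forward) or '<' (reverse) followed by a segment name.
STEP_RE = re.compile(r"([<>])([^<>]+)")


def parse_gaf_line(line):
    """Return (name, [(segment, is_reverse), ...], path_length) for one GAF line."""
    cols = line.rstrip("\n").split("\t")
    name = cols[0]
    path = cols[5]            # column 6: oriented path, e.g. ">1>3<4"
    path_length = int(cols[6])  # column 7: path length per the GAF description
    steps = [(m.group(2), m.group(1) == "<") for m in STEP_RE.finditer(path)]
    return name, steps, path_length


# Made-up record for illustration only (not real vg output).
sample = "trav1\t7\t0\t7\t+\t>1>3<4\t7\t0\t7\t7\t7\t60\n"
name, steps, path_length = parse_gaf_line(sample)
print(name, steps, path_length)
```

If the length column does carry the traversal size, it can be read directly without touching any sequence, which is the kind of saving the original question was after.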

3.8 years ago
egoltsman ▴ 10

That's great to know. Thanks!
