Question: Extract linear representation of paths in VG graph
Ian Fiddes (Santa Cruz) wrote, 10 months ago:

I am trying to use VG effectively as a genome compression tool for collections of highly related genomes. A key component of this is that I need to be able to randomly access subregions of an input genome, or path. How can this be accomplished with VG?

So far, as a test case, I have built a graph from a few sequences and converted it to a sorted XG file. I have found that I can use vg find -p to select a subset of the graph, but how can I convert that subgraph back to a linear sequence?

Jouni Sirén (UCSC Genomics Institute) wrote, 10 months ago:

You can use vg paths -F to extract entire paths in FASTA format. By default, this extracts all paths in the graph. You can use option -p FILE to specify a list of path names in a file and option -Q PREFIX to extract all paths with the given name prefix.
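
The three modes described above could be wrapped roughly as follows. This is only a sketch: the function name extract_paths and the file names graph.vg and names.txt are hypothetical, and the graph is assumed to be a vg file passed with -v (the option described further down in this answer).

```shell
# Extract paths from a vg graph as FASTA (written to stdout).
# usage: extract_paths GRAPH.vg                 # all paths
#        extract_paths GRAPH.vg list FILE       # path names listed in FILE
#        extract_paths GRAPH.vg prefix PREFIX   # path names starting with PREFIX
extract_paths() {
  case "${2:-all}" in
    all)    vg paths -v "$1" -F ;;
    list)   vg paths -v "$1" -F -p "$3" ;;
    prefix) vg paths -v "$1" -F -Q "$3" ;;
    *)      echo "unknown mode: $2" >&2; return 1 ;;
  esac
}
```

For example, extract_paths graph.vg list names.txt > selected.fa would write only the paths named in names.txt.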

In order to extract subpaths, you can combine vg paths with vg find. An example:

vg find -p 22:30000000-30100000 -x chr22.xg | vg paths -v - -F > output.fa

Option -v tells vg paths that the input is a vg file, and filename - means stdin.
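
If you need this for many regions, the pipeline above can be put in a small function. Again just a sketch: the name extract_region is made up, and the region string is assumed to use the same PATH:START-END form as in the example.

```shell
# Extract one region of one path from an XG index as FASTA (written to stdout).
# usage: extract_region INDEX.xg PATH:START-END > region.fa
extract_region() {
  # vg find -p selects the subgraph covering the region;
  # vg paths -v - -F reads that subgraph from stdin and writes FASTA.
  vg find -p "$2" -x "$1" | vg paths -v - -F
}
```

Calling extract_region chr22.xg 22:30000000-30100000 > output.fa is then equivalent to the command above.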
