Alignment to fastg assembly graph
5.0 years ago
grp2009 ▴ 60

Briefly, I am trying to automate the process of aligning a query sequence (from a FASTA file) to an assembly graph (a FASTG file from a SPAdes assembly). As output, I need the sequences of the paths in the assembly graph that correspond to the alignment(s).

More detail: I have used SPAdes to assemble the genome of a diploid yeast from short reads (WGS sequencing). Using the wonderful program Bandage, I am then able to BLAST a query sequence against the assembly graph (FASTG file). Because the genome is diploid, the result looks like this:

[image: image1-path]

There are two paths corresponding to this BLAST alignment. In Bandage I can select the nodes corresponding to a path, and then export that path's sequence to a FASTA file.

Doing this manually gives me exactly what I want (essentially, haplotypes derived from the assembled genome). However, I would very much like to automate this process. What tools should I be looking into?

Assembly alignment • 3.8k views
5.0 years ago
grp2009 ▴ 60

I didn't realize it when I posted the question, but Bandage has a command line mode, including a command querypaths that accomplishes precisely the task in question.

First, find out where the Bandage executable lives. On a Mac, select the Bandage application, choose "Show Package Contents", and the executable is at Contents/MacOS/Bandage. Below I will just call this executable Bandage.

Then you can run

Bandage querypaths assembly_graph.fastg query_sequence.fasta output_prefix

and it will produce output_prefix.tsv with exactly the paths desired! What a wonderful program. Note that it accepts both FASTG and GFA graphs as input.
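To take the automation one step further, here is a minimal Python sketch that runs the command above and turns the resulting TSV into a FASTA file of path sequences. It rests on assumptions I have not checked against every Bandage version: that the TSV starts with a header row, and that the columns holding the query name, the node path, and the path sequence are called "Query", "Path" and "Sequence". The column names are parameters so you can change them after looking at your own output file. The function names are made up for this example.

```python
import csv
import subprocess
from pathlib import Path


def build_querypaths_command(bandage, graph, queries, prefix):
    """Return the argument list for the `Bandage querypaths` call."""
    return [str(bandage), "querypaths", str(graph), str(queries), str(prefix)]


def run_querypaths(bandage, graph, queries, prefix):
    """Run Bandage and return the path of the TSV it is expected to write."""
    command = build_querypaths_command(bandage, graph, queries, prefix)
    subprocess.run(command, check=True)  # raises if Bandage exits with an error
    return Path(f"{prefix}.tsv")


def tsv_to_fasta(tsv_path, fasta_path,
                 query_col="Query", path_col="Path", seq_col="Sequence"):
    """Write one FASTA record per TSV row and return the number of records.

    The default column names are assumptions; check the header line of
    your own TSV and pass different names if they do not match.
    """
    count = 0
    with open(tsv_path, newline="") as tsv, open(fasta_path, "w") as fasta:
        for row in csv.DictReader(tsv, delimiter="\t"):
            count += 1
            # Header: query name plus a running index, then the node path.
            fasta.write(f">{row[query_col]}_path{count} {row[path_col]}\n")
            fasta.write(f"{row[seq_col]}\n")
    return count
```

Typical use would be `tsv = run_querypaths("Bandage", "assembly_graph.fastg", "query_sequence.fasta", "output_prefix")` followed by `tsv_to_fasta(tsv, "haplotypes.fasta")`, where haplotypes.fasta is just a placeholder name. For a diploid region like the one in the question, you would expect one record per path.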


Thank you for providing an answer/closure for the question. It will help someone in future.
