Question: Help needed using the information in a multiple FASTA alignment (MFA) file
3 months ago by
Ian5.5k
University of Manchester, UK
Ian5.5k wrote:

Hello.

I have aligned the 88 contigs of an E. coli de novo assembly against the closest reference genome using WGvista. The aim is to identify structural differences between the assembly and the reference in detail. The output of WGvista is a multiple FASTA alignment (MFA) file. The format gives pairwise alignments between the reference (one large sequence) and the contigs of the assembly.

An example of one of the alignments in the files is:

>NC_007946.1 NC_007946:228496-228726 (+)
AGTTTAATTCTTTGAGCATCAAACTTTTAAATTGAAGAGTTTGATCATGGCTCAGATTGA
ACGCTGGCGGCAGGCCTAACACATGCAAGTCGAACGGTAACAGGAAGCAGCTTGCTGCTT
TGCTGACGAGTGGCGGACGGGTGAGTAATGTCTGGGAAACTGCCCGATGGAGGGGGATAA
CTACTGGAAACGGTAGCTAATACCGCATAACGTCGCAAGACCAAAGAGGGG
>FW NODE_5_length_400106_cov_36.4682:399876-400106 (+)
AGTTTAATTCTTTGAGCATCAAACTTTTAAATTGAAGAGTTTGATCATGGCTCAGATTGA
ACGCTGGCGGCAGGCCTAACACATGCAAGTCGAACGGTAACAGGAAGCAGCTTGCTGCTT
TGCTGACGAGTGGCGGACGGGTGAGTAATGTCTGGGAAACTGCCCGATGGAGGGGGATAA
CTACTGGAAACGGTAGCTAATACCGCATAACGTCGCAAGACCAAAGAGGGG
= score = 231  type = M2  L1 = 5065741  L2 = 400106  AL1 = 231  AL2 = 231  P_ID = 100
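For anyone wanting to work with records like the one above programmatically, here is a minimal Python sketch of a parser. It is based only on the single example record shown, so the layout it assumes (two `>` headers of the form `LABEL SEQID:START-END (STRAND)`, followed by a summary line starting with `=` that holds `key = value` pairs) may not cover everything WGvista writes. The function names `parse_header` and `parse_pairwise_mfa` are made up for illustration, and the sequences in the embedded example are truncated to their first line.

```python
import re

def parse_header(line):
    """Split '>LABEL SEQID:START-END (STRAND)' into its parts (assumed layout)."""
    fields = line[1:].split()
    strand = fields[-1].strip("()")
    seqid, _, span = fields[-2].rpartition(":")
    start, end = (int(x) for x in span.split("-"))
    return {"label": " ".join(fields[:-2]), "seqid": seqid,
            "start": start, "end": end, "strand": strand}

def parse_pairwise_mfa(lines):
    """Group FASTA entries into records, each closed by a '=' summary line."""
    records = []
    entries = []  # entries collected for the record currently being read
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            entry = parse_header(line)
            entry["seq"] = ""
            entries.append(entry)
        elif line.startswith("="):
            # Summary line: pull out every 'key = value' pair as strings.
            stats = dict(re.findall(r"(\w+)\s*=\s*(\S+)", line))
            records.append({"entries": entries, "stats": stats})
            entries = []
        elif entries:
            entries[-1]["seq"] += line
    return records

# The record from the post, with each sequence truncated to its first line.
EXAMPLE = """\
>NC_007946.1 NC_007946:228496-228726 (+)
AGTTTAATTCTTTGAGCATCAAACTTTTAAATTGAAGAGTTTGATCATGGCTCAGATTGA
>FW NODE_5_length_400106_cov_36.4682:399876-400106 (+)
AGTTTAATTCTTTGAGCATCAAACTTTTAAATTGAAGAGTTTGATCATGGCTCAGATTGA
= score = 231  type = M2  L1 = 5065741  L2 = 400106  AL1 = 231  AL2 = 231  P_ID = 100
"""

records = parse_pairwise_mfa(EXAMPLE.splitlines())
for rec in records:
    ref, contig = rec["entries"]
    print(ref["seqid"], ref["start"], ref["end"],
          contig["seqid"], contig["start"], contig["end"],
          contig["strand"], rec["stats"]["P_ID"])
```

Each record keeps the reference and contig coordinates side by side, which is the information needed to place every pairwise alignment on the reference later.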

My problem is that I want to be able to view all of the alignments to the assembly laid out alongside the reference sequence. However, the MFA file treats each alignment as a separate comparison. This may be the wrong approach entirely, so any alternatives are welcome.
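One way to get such a view is to sort the pairwise blocks by their reference start and list the uncovered reference stretches between them. The Python sketch below does that under a few assumptions: coordinates are 1-based and inclusive (228726 - 228496 + 1 = 231, which matches AL1 in the example), L1 is the reference length, and the names `Block` and `layout_along_reference` are invented here. The first block is the one from the example record; the second is hypothetical and only there to show a gap.

```python
from typing import NamedTuple

class Block(NamedTuple):
    ref_start: int
    ref_end: int
    contig: str
    contig_start: int
    contig_end: int
    strand: str

def layout_along_reference(blocks, ref_length):
    """Return rows in reference order; uncovered reference stretches become GAP rows."""
    rows = []
    covered_to = 0  # last reference position covered so far (1-based, inclusive)
    for b in sorted(blocks, key=lambda b: (b.ref_start, b.ref_end)):
        if b.ref_start > covered_to + 1:
            rows.append(("GAP", covered_to + 1, b.ref_start - 1, None))
        rows.append(("ALN", b.ref_start, b.ref_end, b))
        covered_to = max(covered_to, b.ref_end)
    if covered_to < ref_length:
        rows.append(("GAP", covered_to + 1, ref_length, None))
    return rows

REF_LENGTH = 5065741  # L1 in the example summary line, assumed to be the reference length

blocks = [
    # Hypothetical block, not from the real data.
    Block(230000, 231000, "NODE_hypothetical", 1, 1001, "-"),
    # Block taken from the example record.
    Block(228496, 228726, "NODE_5_length_400106_cov_36.4682", 399876, 400106, "+"),
]

rows = layout_along_reference(blocks, REF_LENGTH)
for kind, start, end, b in rows:
    detail = "" if b is None else f"{b.contig}:{b.contig_start}-{b.contig_end} ({b.strand})"
    print(f"{kind}\t{start}\t{end}\t{detail}")
```

Reading down the resulting table, places where consecutive ALN rows switch contig or strand, or where a GAP row appears, are candidate regions to inspect for structural differences.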

alignment fasta • 117 views
modified 3 months ago by genomax74k • written 3 months ago by Ian5.5k

Mauve may be a better tool for this application. Have you tried it?

modified 3 months ago • written 3 months ago by genomax74k

Thanks. I have given Mauve a try, but I didn't get as close to what I wanted as I did with WGvista.

written 3 months ago by Ian5.5k

PhyloVISTA appears to be able to use MFA files and should show what you want.

written 3 months ago by genomax74k

Funnily enough, I had looked at it because it mentioned the use of MFA files. Unfortunately, it assumes that every sequence in the file is from a different species and that the file is a single multiple alignment, whereas my file contains many pairwise alignments. I reran Mauve, and its .alignment file is in an almost identical format to the output of WGvista. Thanks for your interest!

written 3 months ago by Ian5.5k