Question: Mummer To Viewable Alignment Format (Fasta Or Aln...)
Yannick Wurm (Queen Mary University London) wrote, 9.4 years ago:


I'm aligning pairs of genomic scaffolds, looking for insertions/deletions of around 1000 bp. It's easy to get a dotplot for an overview. It's also easy to find SNPs at a small scale. But I need to find putative insertions and subsequently confirm them in the lab. So I need something I can open in Jalview or a similar viewer, where it is easy to see the insertions and to copy-paste the relevant sequences into a primer-design tool.

I can get such alignments in a usable time with Clustal (sorry), but not if both sequences are 1 Mb or bigger. MUMmer is super fast, but has a weird output format. Is there anything else I could use that is fast but gives user-friendly output? (BLAST would be great, but it splits the aligning regions into multiple HSPs instead of producing one long contiguous alignment.)

Cheers, Yannick

alignment genomics • 5.5k views

Would the 'show-aligns' command in the MUMmer package help? I have used it for relatively small regions. By the way, from the MUMmer output (e.g. 'show-snps') you can get SNP/indel positions with flanking regions. I used that to grab sequences and pipe them into primer3 to get primers.

written 9.4 years ago by Vitis
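
The flank-grabbing step described in this comment can be sketched roughly as follows. This is only an illustration in Python, not the commenter's actual script: the function name `flanking_regions` and the `flank` parameter are made up here, and it assumes the scaffold is already loaded as a string and that positions are 1-based and inclusive, as MUMmer reports them.

```python
def flanking_regions(seq, start, end, flank=200):
    """Return (left, right) flanks around the 1-based inclusive interval [start, end].

    Hypothetical helper: the flanks are the sequences you would hand to a
    primer-design tool so that the primers sit on either side of the indel.
    """
    if start < 1 or end < start or end > len(seq):
        raise ValueError("interval out of range")
    # Convert to 0-based slicing; clip the left flank at the sequence start.
    left = seq[max(0, start - 1 - flank):start - 1]
    # Slicing past the end of a Python string is safe, so the right flank
    # is simply shorter near the end of the scaffold.
    right = seq[end:end + flank]
    return left, right


if __name__ == "__main__":
    scaffold = "AAAACCCCGGGGTTTT"
    # Interval 5..8 is the CCCC block; take 3 bp on each side.
    print(flanking_regions(scaffold, 5, 8, flank=3))
```

In practice the flank size would be chosen to leave room for primers of the desired product size.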

Hi Vitis, and thanks for the suggestion. Yes, it works for small inserts... but I'm interested in multi-kb insertions. Those are split into different "aligns" by MUMmer...

written 9.4 years ago by Yannick Wurm
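
One way to work with the split "aligns" is to look at the gaps between consecutive alignment blocks: a large gap in one scaffold paired with almost no gap in the other is a candidate insertion. The sketch below is Python and purely illustrative. The function name and both thresholds are hypothetical, and it assumes forward-strand, collinear blocks given as (S1, E1, S2, E2) tuples with 1-based inclusive coordinates; inversions, repeats and overlapping blocks would need extra handling.

```python
def candidate_insertions(alignments, min_size=1000, max_other_gap=50):
    """Report gaps between consecutive alignment blocks that look like insertions.

    alignments: iterable of (s1, e1, s2, e2) forward-strand blocks.
    Returns a list of (sequence_label, gap_start, gap_end) tuples.
    """
    hits = []
    blocks = sorted(alignments)  # order blocks by start position in sequence 1
    for (s1, e1, s2, e2), (n1, m1, n2, m2) in zip(blocks, blocks[1:]):
        gap1 = n1 - e1 - 1  # unaligned bases in sequence 1 between the blocks
        gap2 = n2 - e2 - 1  # unaligned bases in sequence 2 between the blocks
        if gap1 >= min_size and abs(gap2) <= max_other_gap:
            hits.append(("seq1", e1 + 1, n1 - 1))
        elif gap2 >= min_size and abs(gap1) <= max_other_gap:
            hits.append(("seq2", e2 + 1, n2 - 1))
    return hits


if __name__ == "__main__":
    blocks = [(1, 5000, 1, 5000), (7501, 12000, 5001, 9500)]
    # Sequence 1 has 2500 extra bases between the two blocks.
    print(candidate_insertions(blocks))
```

The reported intervals could then be checked by eye in a viewer before designing primers around them.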
ALchEmiXt (The Netherlands) wrote, 9.4 years ago:

You can parse the 'coords' file of MUMmer (which you can get using the --coords option) into a so-called BLAST crunch file. That file can be read directly into, for instance, the Artemis Comparison Tool (ACT, from Sanger; see their website).

We posted an example on the Galaxy platform a while ago, here, where you can just grab the Perl file.

Basically, all it does is calculate a score. For example:

# COORDS and OUT are filehandles opened elsewhere in the script
while (<COORDS>) {
    # skip header and separator lines: data rows start with a number
    next unless /^\s*\d/;
    # drop the '|' column separators so a plain split gives the fields
    s/\|//g;

    my @f = split;
          # create crude match score = ((length_of_match * %identity)-(length_of_match * (100 - %identity))) /20
    my $crude_plus_score  = $f[4] * $f[6];
    my $crude_minus_score = $f[4] * (100 - $f[6]);
    my $crude_score = int(($crude_plus_score - $crude_minus_score) / 20);
          # reorganise columns and print crunch format to OUT
          # score        %id   S1    E1    seq1  S2    E2    seq2  (description)
    print OUT " $crude_score $f[6] $f[0] $f[1] $f[7] $f[2] $f[3] $f[8] nucmer comparison coordinates\n";
}
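
To make the arithmetic concrete: for a 1000 bp match at 98.5% identity the crude score is (1000 × 98.5 − 1000 × 1.5) / 20 = 4850. Below is a small Python rendering of the same per-line conversion, offered only as a sketch for checking numbers. The function name is hypothetical, the example scaffold names are invented, and it assumes the default show-coords column layout (S1, E1, S2, E2, LEN 1, LEN 2, % IDY, then the two sequence tags).

```python
def coords_line_to_crunch(line):
    """Convert one data row of a MUMmer coords file into a crunch-format line.

    Returns None for header and separator lines. Field order after removing
    the '|' separators is assumed to be:
    S1 E1 S2 E2 LEN1 LEN2 %IDY ref_tag query_tag
    """
    f = line.replace("|", " ").split()
    if len(f) < 9 or not f[0].isdigit():
        return None
    length, identity = int(f[4]), float(f[6])
    # Crude score: matching bases minus mismatching bases, scaled down by 20.
    score = int((length * identity - length * (100 - identity)) / 20)
    return (f"{score} {f[6]} {f[0]} {f[1]} {f[7]} {f[2]} {f[3]} {f[8]} "
            "nucmer comparison coordinates")


if __name__ == "__main__":
    row = "    1001     2000  |     3001     4000  |     1000     1000  |    98.50  | scaffoldA scaffoldB"
    print(coords_line_to_crunch(row))
```

The printed line has the same column order as the Perl output above (score, %id, S1, E1, seq1, S2, E2, seq2, description), without the leading space.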

Thanks for the quick reply. Can you copy-paste nucleotides out of ACT?

written 9.4 years ago by Yannick Wurm

Yes. The ACT and Artemis suites are meant for annotation/curation down to the single-nucleotide level: Artemis is for a single sequence and the Artemis Comparison Tool (ACT) is for multiple sequences.

written 9.4 years ago by ALchEmiXt