How to generate an Artemis ACT comparison file for a multiple sequence alignment
8.9 years ago
rosies • 0

Hi there,

I was wondering how you would generate a comparison file for a single multiple sequence alignment file to be used with Artemis ACT.

Thanks,

Rosie

artemis act comparison file msa • 13k views
6.0 years ago
dadarotimi ▴ 10

It's a shame that the two known web-based resources (WebACT and DoubleACT) created for generating comparison files are no longer working. You can work around this if you have a computer that runs a Unix-like environment (a MacBook or Ubuntu Linux).

In a terminal, type the following commands:

sudo apt-get install ncbi-blast+ (to install BLAST+; apt-get applies to Debian/Ubuntu systems)

makeblastdb -in Genome1 -dbtype nucl

blastn -query Genome2 -db Genome1 -evalue 1 -task megablast -outfmt 6 > Genome1_Genome2.crunch

Make sure the assembled genomes that you want to compare are in your working directory.
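If you would rather drive the two BLAST+ steps above from a script, here is a minimal sketch in Python. It assumes BLAST+ is already installed and on your PATH. The function names `build_commands` and `make_comparison` are made up for illustration, and it writes the output with blastn's `-out` option instead of a shell redirect.

```python
import subprocess
from pathlib import Path


def build_commands(reference, query, out_path):
    """Return the two BLAST+ command lines as argument lists (nothing is run here)."""
    makeblastdb = ["makeblastdb", "-in", str(reference), "-dbtype", "nucl"]
    blastn = [
        "blastn",
        "-query", str(query),
        "-db", str(reference),
        "-evalue", "1",
        "-task", "megablast",
        "-outfmt", "6",          # tabular output, which ACT reads as a comparison file
        "-out", str(out_path),   # same effect as redirecting stdout with ">"
    ]
    return [makeblastdb, blastn]


def make_comparison(reference, query, out_path=None):
    """Run makeblastdb then blastn; raises CalledProcessError if either step fails."""
    if out_path is None:
        # Hypothetical naming scheme mirroring Genome1_Genome2.crunch
        out_path = f"{Path(reference).stem}_{Path(query).stem}.crunch"
    for command in build_commands(reference, query, out_path):
        subprocess.run(command, check=True)
    return out_path
```

Calling `make_comparison("Genome1", "Genome2")` from the directory holding both assemblies should leave `Genome1_Genome2.crunch` next to them.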

This link might also help: https://katholtlab.files.wordpress.com/2017/07/comparativegenomicstutorialv2.pdf

6.0 years ago
Joe 21k

I wrote a script to do this some time ago as all the webservers are defunct.

An Artemis Comparison Tool file is simply a tabular blast output file.

NB, contrary to the title and tags of this question, ACT doesn't handle multiple sequence alignments. It will only do pairwise alignments, though you can visualise multiple pairs (e.g. A vs B, B vs C), and each pair will need its own comparison file.
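Since the comparison file is just tabular BLAST output, it is easy to inspect before loading it into ACT. The sketch below assumes the file was written with `-outfmt 6` and the default twelve columns. The `Hit` type and the function names are illustrative only and are not part of ACT or BLAST.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One row of default BLAST tabular (-outfmt 6) output."""
    qseqid: str
    sseqid: str
    pident: float
    length: int
    mismatch: int
    gapopen: int
    qstart: int
    qend: int
    sstart: int
    send: int
    evalue: float
    bitscore: float


def parse_hit(line):
    """Split one tab-separated line into a typed Hit."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 12:
        raise ValueError(f"expected 12 tab-separated columns, got {len(fields)}")
    return Hit(
        fields[0], fields[1], float(fields[2]),
        int(fields[3]), int(fields[4]), int(fields[5]),
        int(fields[6]), int(fields[7]), int(fields[8]), int(fields[9]),
        float(fields[10]), float(fields[11]),
    )


def read_comparison(path):
    """Read every non-empty line of a comparison file into a list of Hit rows."""
    with open(path) as handle:
        return [parse_hit(line) for line in handle if line.strip()]
```

For three genomes you would build one such file per adjacent pair (A vs B, then B vs C) and load them into ACT in that order.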


Hi all,

I tried to use this script to compare two FASTA files that contain around 60 contigs each. When I load the result into ACT, I only seem to get similarity information for the first contig. Both FASTA files are from the same bacterial species, so they are expected to be pretty similar.

Do you have any advice on getting this to work? I just need to know if the two isolates are the same as they were isolated from the same person.

Thanks in advance!

Charlotte


I actually have an updated version of this script which can do all the necessary concatenation for you: https://github.com/jrjhealey/Oread

It's still pretty raw, so it's not the easiest thing in the world to install at the minute.


Thank you very much for your help! This makes so much sense now


Hi Joe, thanks very much for making the Oread program. I'm having the same issue as Charlotte (above), with only the first node in the FASTA files aligning properly. I downloaded Oread and my file alignment, but I'm still getting the same result. Is there another version of the program that does the concatenating of the nodes in my FASTA file? So sorry to bother you! This is my first go at analysing WGS files.
Thanks in advance for your time and any pointers you might send my way!


Oread itself should do concatenation if it detects that it's necessary. The program is still not very mature though, so it's quite possible there are bugs or similar.

In essence all the tool does is:

  1. Read FASTA 1, check if it's a multi-FASTA
    • If it is, make a concatenated temporary file
  2. Read FASTA 2, check if it's a multi-FASTA
    • If it is, make a concatenated temporary file
  3. Take the input files (if single) or the temporary files, and create a BLAST tabular output (which is the ACT file).
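To make steps 1 and 2 concrete, here is a rough sketch of the check-and-concatenate idea in plain Python. This is not Oread's actual code; the function names, the `concatenated` header and the 60-column line wrapping are all assumptions made for the example.

```python
def read_fasta(path):
    """Return a list of (header, sequence) tuples from a FASTA file."""
    records = []
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new header closes the previous record, if there was one
                if header is not None:
                    records.append((header, "".join(chunks)))
                header, chunks = line[1:], []
            else:
                chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def concatenate_if_multi(path, out_path, name="concatenated"):
    """Write a single-record FASTA if `path` holds several records.

    Returns the path that should be handed to BLAST: the original file when
    it is already a single sequence, otherwise the newly written file.
    """
    records = read_fasta(path)
    if len(records) <= 1:
        return path
    sequence = "".join(seq for _, seq in records)
    with open(out_path, "w") as out:
        out.write(f">{name}\n")
        for start in range(0, len(sequence), 60):
            out.write(sequence[start:start + 60] + "\n")
    return out_path
```

Contigs are joined in file order here, so any reordering against a reference has to happen before this step.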

If it still isn't cooperating, please feel free to open an issue on GitHub with your test data and I'll see if I can figure out what's happening.


Hi, yeah, this is a known issue and limitation of ACT. It cannot process 'multiblast files', i.e. at least one of the sequences needs to be contiguous.

IIRC, you can have one complete reference sequence, and the other can be a multi-FASTA, but they cannot both be multi-FASTAs.

The best way to get around this is to reorder your contigs relative to a known (ideally closed) reference (e.g. using progressiveMauve), then artificially concatenate the contigs together so there is only one FASTA header.

This won't affect the data/visualisation.
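Once the contigs are joined under one header, it can still be handy to know where each contig ended up. The bookkeeping below is only a sketch with made-up function names; it is not something ACT or progressiveMauve does for you. It assumes you know each contig's name and length in the order they were concatenated.

```python
def contig_spans(contigs):
    """Map contigs to their positions in the concatenated sequence.

    `contigs` is an iterable of (name, length) pairs in concatenation order.
    Returns a list of (name, start, end) with 1-based, inclusive coordinates.
    """
    spans = []
    position = 0
    for name, length in contigs:
        spans.append((name, position + 1, position + length))
        position += length
    return spans


def locate(spans, coordinate):
    """Translate a concatenated coordinate back to (contig name, position in contig)."""
    for name, start, end in spans:
        if start <= coordinate <= end:
            return name, coordinate - start + 1
    raise ValueError(f"coordinate {coordinate} is outside the concatenated sequence")
```

With this you can take a coordinate shown in ACT for the concatenated sequence and work out which original contig it came from.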
