How Can I Align A Sequence Against A Pre-Made Alignment?
5
3
13.3 years ago
Harry Palmer ▴ 90

Hi all, I am testing the likelihood of various possible trees using PHYLIP or PAML. Each time I do this, I align one relatively short sequence of interest against a large alignment of mitochondrial genomes. I have already aligned the mitochondrial genomes, so each time this process is run I just want to align my sequence against this pre-made alignment.

How can I do this? I think I've heard about this kind of thing being done, but searching now I can't find any reference to it whatsoever. Thanks, Harry

multiple clustalw • 7.0k views
5
13.3 years ago

Look for a 'profile alignment' mode in your MSA tool of choice. MUSCLE and Clustal certainly have this feature, as does T-Coffee.

HTH.
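
As a rough illustration of the profile mode mentioned above, here is a minimal Python sketch that builds a MUSCLE v3.8-style command line. It assumes MUSCLE 3.8 is installed as `muscle` (later MUSCLE releases may not offer this mode), that both inputs are aligned FASTA, and that a single new sequence is treated as a one-row profile. The file names and the helper `muscle_profile_cmd` are hypothetical.

```python
def muscle_profile_cmd(ref_alignment, query_fasta, out_path):
    """Build a MUSCLE v3.8-style profile-profile command (assumed install name: muscle)."""
    return [
        "muscle", "-profile",
        "-in1", ref_alignment,   # existing alignment, kept as one profile
        "-in2", query_fasta,     # new sequence (or pre-aligned set) as the second profile
        "-out", out_path,
    ]

# Hypothetical file names, for illustration only.
cmd = muscle_profile_cmd("mito_genomes.afa", "query.fa", "combined.afa")
print(" ".join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

If there are several unaligned new sequences, one option is to align them to each other first and then add that alignment as the second profile.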

0

+1 for muscle solution - this saved me today.

0

MUSCLE offers a profile-against-profile alignment, but I don't see any way of having a bunch of unaligned sequences all aligned against an existing reference. What am I doing wrong?

4
12.7 years ago
Andreas ★ 2.5k

Hi,

I know I'm too late so this is just for future reference:

You could try the new version of Clustal (Clustal Omega), which automatically turns already aligned sequences into an HMM, which is then used to guide the subsequent alignment process. This is especially useful if you want to align more than one sequence to your profile (i.e. your pre-aligned sequence set), but it might be overkill if you have just one sequence.

Andreas

4
12.7 years ago

One option is to use PAGAN, which will take your large mitochondrial alignment and, if you have one, its associated tree, as references where you will add new sequences. If you have multiple overlapping short sequences in your dataset, you can also build contigs from the clusters of sequences (reads) that fall into the same place in the reference alignment:

Basic reads alignment options:
  --ref-seqfile arg     reference alignment file (FASTA)
  --ref-treefile arg    reference tree file (NH/NHX)
  --readsfile arg       reads file (FASTA/FASTQ)
  --pair-end            connect paired reads
  --454                 correct homopolymer error
  --use-consensus       use consensus for read ancestors
  --build-contigs       build contigs of read clusters
  --test-every-node     test every node for each read
  --fast-placement      use Exonerate to quickly assign reads to nodes
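
A minimal Python sketch of how the options listed above might be combined into one call. The flag names are taken from that listing and may differ in other PAGAN versions; the executable name `pagan`, the file names, and the helper `pagan_cmd` are assumptions for illustration.

```python
def pagan_cmd(ref_alignment, reads, ref_tree=None, build_contigs=False):
    """Assemble a PAGAN call from the options listed above (executable name assumed)."""
    cmd = ["pagan", "--ref-seqfile", ref_alignment]
    if ref_tree is not None:
        # The reference tree is optional ("if you have one").
        cmd += ["--ref-treefile", ref_tree]
    cmd += ["--readsfile", reads]
    if build_contigs:
        cmd.append("--build-contigs")
    return cmd

# Hypothetical file names, for illustration only.
cmd = pagan_cmd("mito_genomes.fas", "query.fas", ref_tree="mito_genomes.nhx")
print(" ".join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```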
1
13.3 years ago

You could easily do this using HMMER. Simply use hmmbuild to construct a Hidden Markov Model from your pre-made alignment. You can then use hmmalign to align each of your short sequences against the HMM.
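
The two steps above could be sketched in Python roughly as follows, assuming HMMER 3 is on the PATH. The file names and the helper `hmmer_commands` are hypothetical. The optional `--mapali` flag is used here to merge the original alignment into the output, on the assumption that the file passed to it is exactly the alignment the model was built from.

```python
def hmmer_commands(ref_alignment, query_fasta,
                   hmm_path="ref.hmm", out_path="query_aligned.sto",
                   include_reference=True):
    """Return the hmmbuild and hmmalign command lines for the two steps."""
    # Step 1: build a profile HMM from the pre-made alignment.
    build = ["hmmbuild", hmm_path, ref_alignment]
    # Step 2: align the short sequence(s) to that HMM (Stockholm output by default).
    align = ["hmmalign", "-o", out_path]
    if include_reference:
        # Merge the original alignment into the result.
        align += ["--mapali", ref_alignment]
    align += [hmm_path, query_fasta]
    return build, align

# Hypothetical file names, for illustration only.
build, align = hmmer_commands("mito_genomes.sto", "query.fa")
print(" ".join(build))
print(" ".join(align))
# To actually run them, in order: subprocess.run(build, check=True), then align.
```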

0

From what I have heard, HMMER is not recommended for alignment. Simon Cockell's suggestion is better.

1
13.3 years ago
Will 4.5k

ClustalW lets you do this too. There is an option for adding sequences to an already calculated alignment.

1

I would not recommend using Clustal anymore. I think it is fair to say that it has been surpassed by MAFFT and MUSCLE, both in terms of speed and alignment accuracy.

0

Thanks, I've got Clustal and MAFFT working. As for which one is best, I don't know, but I notice they give completely different results, which is alarming. MAFFT doesn't seem to mind placing gaps all over the place on standard settings (tried the mafft, linsi, and einsi modes).
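
When two tools disagree like this, a quick sanity check on each output can help. The Python sketch below is only an illustration: it assumes both alignments are in FASTA format with `-` as the gap character, and the helper names (`read_fasta`, `check_added_alignment`, `gap_fraction`) are hypothetical. It checks that every row has the same length and that the reference sequences still contain the same residues after the new sequence was added, and it reports how gappy a row is.

```python
def read_fasta(text):
    """Parse FASTA text into {name: sequence}; the name is the first word of the header."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}

def ungapped(seq):
    """Remove gap characters so only the residues are compared."""
    return seq.replace("-", "").upper()

def gap_fraction(seq):
    """Fraction of alignment columns that are gaps in this row."""
    return seq.count("-") / len(seq) if seq else 0.0

def check_added_alignment(reference, combined):
    """Return a list of problems; an empty list means the basic checks passed."""
    problems = []
    if len({len(s) for s in combined.values()}) > 1:
        problems.append("rows have different lengths")
    for name, seq in reference.items():
        if name not in combined:
            problems.append(f"{name}: missing from combined alignment")
        elif ungapped(seq) != ungapped(combined[name]):
            problems.append(f"{name}: residues changed")
    return problems

# Toy data, for illustration only.
reference = read_fasta(">a\nACGT-A\n>b\nACGTTA\n")
combined = read_fasta(">a\nACGT-A\n>b\nACGTTA\n>q\n-CGT-A\n")
print(check_added_alignment(reference, combined))
print(round(gap_fraction(combined["q"]), 2))
```

Comparing the gap fraction of the added row across tools gives a crude first indication of which output is placing more gaps; it does not say which alignment is more accurate.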
