hmm without realigning
2.3 years ago
dbsseven • 0

Hi all, I am trying to generate plausible aligned sequences from an MSA.

My current idea is to use HMMER's hmmbuild to build an HMM, then run hmmemit followed by hmmalign to generate an aligned sequence.

However, hmmbuild seems to realign the sequences, so the final aligned sequence does not match my original MSA.

Does anyone have any suggestions? Thanks, David

Example

The starting MSA (using PF00313 from Pfam):

A0A1G7SQH8.1/3-67              -QGF.V..K...W..F.......NA...E......K....G...F..G............F......I........G..........P..........D...........D.........G..........G............E.......D..........V..F..VH..F....S..A.I......E...D..RG.................gF.R..S...L......D.....E.....G.A...R....V..E.....Y..E.ASP.........GQR....G...L...Q.A.D.RVTP-


Build HMM model:

hmmbuild -o test.log -O test_alignment.txt test.hmm PF00313.uniprot


Checking the alignment written by hmmbuild (test_alignment.txt) shows that the alignment has shifted:

A0A1G7SQH8.1/3-67
~QGF.V..K...W..F.......NA...E......K....G...F..G............F......I........G..........P..........D...........D.........G..........G............E.......D..........V..F..VH..F....S..A.I......E...D..RG.................gF.R..S...L......D.....E.....G.A...R....V..E.....Y..E.ASP.........GQR....G...L...Q.A.D.RVTP
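One way to check whether residues actually moved between the two rows, or whether only the gap symbols changed, is to compare the column index of each residue before and after. The following is a minimal sketch, not part of HMMER; the function names are hypothetical, and it assumes that `-`, `.` and `~` are all gap-like characters and that case differences between the rows can be ignored.

```python
# Assumption: '-', '.' and '~' are all treated as gap-like characters.
GAP_CHARS = frozenset("-.~")


def residue_columns(row):
    """Return (column_index, residue) for every non-gap character in a row."""
    return [(col, ch.upper()) for col, ch in enumerate(row) if ch not in GAP_CHARS]


def shifted_residues(before, after):
    """List (residue_number, column_before, column_after) for moved residues.

    Raises ValueError if the two rows do not spell the same ungapped sequence.
    """
    a = residue_columns(before)
    b = residue_columns(after)
    if [res for _, res in a] != [res for _, res in b]:
        raise ValueError("rows do not contain the same residues")
    return [
        (i, col_a, col_b)
        for i, ((col_a, _), (col_b, _)) in enumerate(zip(a, b))
        if col_a != col_b
    ]
```

Passing the original Pfam row and the row from test_alignment.txt returns an empty list when only the gap characters differ, and otherwise the positions of the residues that hmmbuild moved.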


Emitting a sequence gives a gapless sequence:

hmmemit test.hmm


CSD-sample1
IDGTMCTAAATSIFKKTFGFIHQHNLPEDSYKSCTYLVHSSTVEKFLQVVKPAELLCFDVEKVGPYPVGGANALQIRS

alignment hmm • 866 views

If you want hmmemit to sample an alignment, have you tried hmmemit -a?

This is in the help page and in the documentation:

Options controlling what to emit: -a : emit alignment


Could you provide the command line you are running and illustrate the issue with an example?


Added by editing the initial question.


What do you mean by plausible aligned sequences from the MSA?

If you have an MSA do you not already have aligned sequences to use?


Yes, I have lots of sequences, but I would like new sequences that are not necessarily in the MSA yet still fit the HMM. (In the same way that a trained HMM can identify sequences beyond those in the training MSA, but emitting sequences rather than searching.)
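To make the goal concrete, here is a deliberately simplified sketch of generating new rows that stay in the original column layout. It is not what HMMER does: it samples each column independently from the observed characters (gaps included), with no transition probabilities, sequence weighting or priors. The function name `sample_aligned` and its parameters are hypothetical.

```python
import random
from collections import Counter


def sample_aligned(rows, n=1, seed=None):
    """Sample n new aligned rows, one character per column.

    Each column is drawn independently, in proportion to the characters
    observed in that column of the input alignment (gap symbols included).
    """
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all rows must have the same length")
    rng = random.Random(seed)
    # One Counter of observed characters per alignment column.
    columns = [Counter(col) for col in zip(*rows)]
    samples = []
    for _ in range(n):
        chars = []
        for counts in columns:
            symbols = list(counts)
            weights = [counts[s] for s in symbols]
            chars.append(rng.choices(symbols, weights=weights, k=1)[0])
        samples.append("".join(chars))
    return samples
```

Because the columns are treated as independent, the output keeps the MSA's width and gap pattern statistics but loses any dependence between neighbouring positions, which a profile HMM captures through its insert and delete states.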

2.2 years ago
h.mon 32k

This is expected and explained in the UserGuide (see page 77):

-O <f> After each model is constructed, resave annotated, possibly modified source alignments to a file <f> in Stockholm format. The alignments are annotated with a reference annotation line indicating which columns were assigned as consensus, and sequences are annotated with what relative sequence weights were assigned. Some residues of the alignment may have been shifted to accommodate restrictions of the Plan7 profile architecture, which disallows transitions between insert and delete states.


Hi h.mon, I understood that this is expected, but I should have been clearer. I am looking for suggestions on an alternative method: either a way to re-align the emitted sequence against the MSA, or an alternative HMM builder that would include gaps.
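For the first option, hmmalign has a `--mapali <f>` option that merges the alignment the model was built from into its output, so emitted sequences can be aligned alongside the original rows. The sketch below only assembles the three command lines and runs them if HMMER is installed. The file names `emitted.fa` and `merged.sto` and the function names are hypothetical, and the merged alignment follows the profile's consensus columns, so it may not reproduce the original MSA column for column.

```python
import shutil
import subprocess


def build_pipeline(msa, hmm="test.hmm", emitted="emitted.fa",
                   out="merged.sto", n=5):
    """Return the HMMER command lines as argument lists (nothing is run)."""
    return [
        # Build the profile from the original alignment.
        ["hmmbuild", hmm, msa],
        # Emit n unaligned sequences sampled from the profile.
        ["hmmemit", "-N", str(n), "-o", emitted, hmm],
        # Align the emitted sequences and merge in the original alignment;
        # --mapali expects the same alignment that was given to hmmbuild.
        ["hmmalign", "--mapali", msa, "-o", out, hmm, emitted],
    ]


def run_pipeline(msa, **kwargs):
    """Run the commands in order; requires the HMMER binaries on PATH."""
    if shutil.which("hmmbuild") is None:
        raise RuntimeError("HMMER does not appear to be installed")
    for cmd in build_pipeline(msa, **kwargs):
        subprocess.run(cmd, check=True)
```

Calling `run_pipeline("PF00313.uniprot")` would write the merged Stockholm alignment to `merged.sto`, in which the emitted sequences appear as additional rows next to the original ones.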

6 months ago
lagartija ▴ 90

Hi, maybe a bit late, but this may help others. Just replace the gaps with "N"s and the HMM will keep the MSA structure :)