Question: Compare sequence with MSA consensus
4.0 years ago by
rickbeeloo130 wrote:

I have done a multiple sequence alignment for two proteins with similar sequences; however, there are some minor differences that discriminate a cytoplasmic form from a mitochondrial form. I downloaded the output of the MSA as a consensus sequence, and now I want to "blast" this against a genome to find which of the two forms is probably present in the genome.
Which method/tool can be used to compare a sequence with a set of other protein sequences (MSA)?
It's not an option to BLAST just one cytoplasmic version and one mitochondrial version and see which of these gives the best matches; an MSA is/was necessary.

blast alignment sequence • 1.5k views
4.0 years ago by
ddiez1.9k wrote:

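Basically, you want to construct a model or profile for each of your two types of proteins and use it to search for similar proteins. A consensus sequence is not very good for this task, because it keeps only one residue per column and discards the variation observed at each position.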

One option is to use HMMER. Basically, you use each MSA to construct an HMM profile. This profile captures which positions are conserved and which vary. Then, you can use this profile to search sequence databases for proteins matching it. For example, if you have the protein sequences of a target species, you can search with the two profiles (one for the mitochondrial form and the other for the cytoplasmic form). BLAST can perform a similar task with the PSI-BLAST tool.
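A rough sketch of that workflow in shell, assuming HMMER 3 is installed and that you have one alignment per form. The file names (cyto.sto, mito.sto, proteome.fasta) and the function names are hypothetical placeholders, not anything from the original post. The last function only compares the per-target tables written by hmmsearch --tblout, where column 1 is the target name and column 6 is the full-sequence bit score.

```shell
# Sketch only: all file and function names here are hypothetical.

# Build one profile HMM per alignment (usage: hmmbuild <hmmfile_out> <msafile>).
build_profiles() {
    hmmbuild cyto.hmm cyto.sto
    hmmbuild mito.hmm mito.sto
}

# Search the target proteome with each profile.
# --tblout writes one line per target sequence that was hit.
search_profiles() {
    hmmsearch --tblout cyto.tbl cyto.hmm proteome.fasta > cyto.out
    hmmsearch --tblout mito.tbl mito.hmm proteome.fasta > mito.out
}

# Compare two --tblout files: $1 = cytoplasmic table, $2 = mitochondrial table.
# Prints: target, cyto score, mito score, higher-scoring profile (tab-separated).
compare_hits() {
    awk '
        /^#/ { next }                                   # skip comment lines
        FILENAME == ARGV[1] { cyto[$1] = $6; seen[$1] = 1; next }
        { mito[$1] = $6; seen[$1] = 1 }
        END {
            for (t in seen) {
                c = (t in cyto) ? cyto[t] : "NA"        # NA = no hit with that profile
                m = (t in mito) ? mito[t] : "NA"
                if (c == "NA") best = "mito"
                else if (m == "NA") best = "cyto"
                else if (c + 0 > m + 0) best = "cyto"
                else if (m + 0 > c + 0) best = "mito"
                else best = "tie"
                print t "\t" c "\t" m "\t" best
            }
        }
    ' "$1" "$2" | sort
}

# Intended order of use:
#   build_profiles && search_profiles && compare_hits cyto.tbl mito.tbl
```

A higher bit score against one profile only suggests which form a protein resembles more; when the two scores are close, the hit is worth inspecting by hand.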


Thank you! I have one problem using HMMER. When I use this command:

hmmbuild --amino --informat "CLUSTAL" "hmmCytopolasm.hmm" "input.clustal"

I get the error:

Error: CLUSTAL is not a recognized input sequence file format

Any idea what I did wrong here?
I can convert it to Stockholm using this converter; however, I'm not sure how reliable this is.

modified 4.0 years ago • written 4.0 years ago by rickbeeloo130

What version of HMMER are you using? It might be that you need to specify "Clustal" in lower case. I would try leaving that parameter out (the default is to autodetect the format).

written 4.0 years ago by ddiez1.9k

I just tried with a small test file, and HMMER (3.1b2) correctly guesses the format without it being specified. I also didn't need to use --amino, as the alphabet is guessed from the input too. It also works with --informat "CLUSTAL". Your problem is that there is trailing whitespace after CLUSTAL, i.e. you have --informat "CLUSTAL ".
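To illustrate the whitespace point, here is a small shell sketch. The trim_ws helper is hypothetical (it is not part of HMMER); it just strips surrounding whitespace from a value, and the brackets in the printf calls make a stray space visible. The hmmbuild calls are left as comments and reuse the file names from the command above.

```shell
# Hypothetical helper: strip leading and trailing whitespace from a string.
trim_ws() {
    printf '%s' "$1" | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//'
}

fmt="CLUSTAL "               # note the trailing space, as in the failing command
printf '[%s]\n' "$fmt"       # prints [CLUSTAL ]
fmt=$(trim_ws "$fmt")
printf '[%s]\n' "$fmt"       # prints [CLUSTAL]

# With the cleaned value, or with no --informat at all (format is autodetected):
#   hmmbuild --informat "$fmt" hmmCytopolasm.hmm input.clustal
#   hmmbuild hmmCytopolasm.hmm input.clustal
```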

written 4.0 years ago by ddiez1.9k