Question: Forced alignment of multiple proteins against a reference?
alex.rubinsteyn (United States) wrote 5.7 years ago:


I'm interested in aligning a population of related protein sequences against a canonical reference. This doesn't seem to fit into either of the major categories of alignment packages: it's neither pairwise alignment nor multiple sequence alignment. The main problem is that I don't want to allow the aligner to "delete" residues in the reference: every aligned sequence should be the same length as the reference. Perhaps this can be done within existing MSA programs (like Clustal and MAFFT), but it's not obvious to me how to do it. Can someone help out?



protein sequence alignment • 2.4k views
Brice Sarver (United States) wrote 5.5 years ago:

An alignment is, first and foremost, an inference of homology. If the sequences are not homologous (for example, if they are different proteins, or even products of the same gene but from different transcripts), it does not make sense to align them.

For the sake of argument, let's assume that they are. If every sequence is the same length as the reference, why do you need to align them? Wouldn't a more appropriate workflow be to confirm that everything is actually the same length and then simply write out the file? In that case, every site is already homologous across all samples.
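The length check described above can be sketched in a few lines of Python. This is only an illustration: the function name `check_same_length` and the toy sequences are made up for the example, and real input would normally be read from a FASTA file.

```python
def check_same_length(reference, sequences):
    """Return {name: length} for every sequence whose length differs from the reference."""
    ref_len = len(reference)
    return {
        name: len(seq)
        for name, seq in sequences.items()
        if len(seq) != ref_len
    }


# Hypothetical toy data: a 10-residue reference and three samples.
reference = "MKTAYIAKQR"
population = {
    "sample_a": "MKTAYIAKQR",  # identical to the reference
    "sample_b": "MKTGYIAKQR",  # one substitution, same length
    "sample_c": "MKTAYIAKQ",   # one residue shorter
}

mismatched = check_same_length(reference, population)
print(mismatched)  # {'sample_c': 9}
```

If the returned dictionary is empty, the sequences can be written out as they are. Anything listed in it contains an insertion or deletion relative to the reference and would need a real alignment.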

If a sequence has residues that are not present in the reference, that means there was an insertion in that sequence or a deletion in the reference. Either way, the alignment introduces gaps into the reference and lengthens it.
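To make the point about gaps concrete, here is a minimal sketch of one possible workaround, not something proposed in this thread: take a pairwise alignment that was computed elsewhere and drop every column where the reference has a gap. The result has the reference's length, but the inserted residues are thrown away. The function name `project_onto_reference` and the aligned strings are hypothetical.

```python
def project_onto_reference(aligned_ref, aligned_query, gap="-"):
    """Keep only alignment columns where the reference has a residue.

    Both inputs are rows of one pairwise alignment, so they must have
    equal length. Columns where the reference holds a gap (insertions
    in the query) are discarded, so the result is exactly as long as
    the ungapped reference.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned rows must have the same length")
    kept = [q for r, q in zip(aligned_ref, aligned_query) if r != gap]
    return "".join(kept)


# Hypothetical alignment: the query has an inserted G (gap in the
# reference) and is missing the reference's I (gap in the query).
aligned_ref = "MKT-AYIAKQR"
aligned_query = "MKTGAY-AKQR"

projected = project_onto_reference(aligned_ref, aligned_query)
print(projected)  # MKTAY-AKQR
```

Deletions in the query survive as gap characters, while insertions relative to the reference are lost. Whether that loss is acceptable depends on what the aligned sequences are used for.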

If you could clarify your question, I'll be happy to weigh in - I've used just about every alignment program (and sometimes mappers) to accomplish various tasks. However, it seems to me that you're trying to align things that perhaps shouldn't be aligned in the first place.
