Question

How can I identify regions that are exclusive to some groups in a protein alignment?

2

4 months ago

andre.arrudalima ▴ 60

Hi,

I made a protein alignment and then a phylogenetic tree, which separated the sequences into two groups. Is there a program that reports which regions are exclusive to each group, in other words, which regions are responsible for separating the proteins into the two groups?

Thank you

protein alignment • 681 views

ADD COMMENT • link 4 months ago by andre.arrudalima ▴ 60

0

Basically, I want to be able to see the regions that the alignment and tree-building algorithms use to group the protein/DNA sequences.

ADD REPLY • link 4 months ago by andre.arrudalima ▴ 60

0

I'm not sure exactly what you are looking for. The homologous regions will be different for each alignment, and they should become apparent once you do the MSA. You need to make sure that the sequences you are trying to align share some homology to get meaningful results. The idea here is that the clades you see shared a common ancestor in the past and have diverged over time into the extant sequences you see in the alignment now.

ADD REPLY • link 4 months ago by GenoMax 142k

0

Thank you for your reply. There are conserved sequences, and they share homology, with roughly 70% similarity. I'm analysing the kinase domain of 10 plant receptors. They are grouped into two major clades. I would like a method or program that identifies the main region of the kinase domain responsible for the separation of these two groups, something more objective than me going through the sequences by eye trying to find it. Do you have any ideas?

ADD REPLY • link 4 months ago by andre.arrudalima ▴ 60

score 0 · Answer 1 · 2024-01-02

0

4 months ago

Michael 54k

First off, there might not be any inserted sequences that are present in only one group. Such sequences would create gaps in the alignment and might have been ignored when the tree was built. If you just want to look for conserved sequences, open the alignment in a viewer like Jalview or MEGA and sort the rows by the tree. Then visually inspect the alignment for conserved domains. Another possibility is to calculate the consensus sequence for each of the two groups and compare them. You could also try to extract the variable sites, those columns that are not identical across all taxa, since only these can carry information about the grouping.
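
The per-group comparison can be done with a few lines of Python. This is only a minimal sketch, assuming the alignment has already been split into two lists of equal-length aligned sequences according to the tree; the function name `group_specific_sites` and the toy sequences are made up for illustration. It reports every column where the two groups share no character at all.

```python
def group_specific_sites(group_a, group_b):
    """Return (column, residues_a, residues_b) for every alignment column
    in which the two groups have no character in common.

    Columns are numbered from 1. The gap character '-' is treated like any
    other character, so a column that is all gaps in one group and residues
    in the other is reported too (a candidate group-specific insertion).
    """
    length = len(group_a[0])
    if any(len(seq) != length for seq in group_a + group_b):
        raise ValueError("all sequences must be aligned to the same length")

    sites = []
    for col in range(length):
        res_a = {seq[col] for seq in group_a}
        res_b = {seq[col] for seq in group_b}
        if res_a.isdisjoint(res_b):
            sites.append((col + 1, "".join(sorted(res_a)), "".join(sorted(res_b))))
    return sites


# Toy aligned sequences (hypothetical, not real kinase domains)
clade_1 = ["MKTLLVA", "MKTLIVA", "MKTLLVA"]
clade_2 = ["MRTLLGA", "MRTLIGA", "MRSLLGA"]

result = group_specific_sites(clade_1, clade_2)
for column, in_clade_1, in_clade_2 in result:
    print(column, in_clade_1, in_clade_2)
```

With only ten sequences, some columns will separate the groups purely by chance, so treat the output as a list of candidates rather than proof that a site drives the split.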

Another, possibly more viable, approach is to apply ancestral sequence reconstruction to the tree with FastML and then compare the reconstructed ancestral sequences at the root of each of the two clades.
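
Once you have the two ancestral sequences, comparing them is straightforward. Here is a rough Python sketch, assuming you have copied the two reconstructed sequences out of the reconstruction output by hand as aligned strings of equal length; the function `differing_regions`, its `max_gap` parameter, and the example strings are hypothetical. It merges nearby differing positions into regions, which is closer to "the main region" than a list of single sites.

```python
def differing_regions(seq_a, seq_b, max_gap=1):
    """Return (start, end) pairs, numbered from 1, covering runs of positions
    where the two aligned sequences differ.

    Two differing positions are merged into the same region when at most
    `max_gap` identical positions lie between them.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    regions = []
    for pos, (a, b) in enumerate(zip(seq_a, seq_b), start=1):
        if a == b:
            continue
        if regions and pos - regions[-1][1] - 1 <= max_gap:
            regions[-1][1] = pos          # extend the current region
        else:
            regions.append([pos, pos])    # start a new region
    return [tuple(region) for region in regions]


# Hypothetical ancestral sequences for the two clade roots
ancestor_1 = "MKTLLVAGSPEKR"
ancestor_2 = "MRTLLGAASPDKR"

regions = differing_regions(ancestor_1, ancestor_2, max_gap=1)
print(regions)
```

Increasing `max_gap` gives fewer, longer regions; which value makes sense depends on how dense the differences are in your kinase domain.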

ADD COMMENT • link 4 months ago by Michael 54k

0

Thank you for your reply. There are conserved sequences. I'm analysing the kinase domain of 10 plant receptors. They are roughly 70% similar and are grouped into two major clades. I would like a method or program that identifies the main region of the kinase domain responsible for the separation of these two groups, something more objective than me going through the sequences by eye trying to find it. The site that you recommended is down, but thank you anyway.

ADD REPLY • link 4 months ago by andre.arrudalima ▴ 60