Question: How to prove that one set of proteins is more conserved than another set?
465186528 wrote (9 months ago):

Hi guys~ I have two sets of proteins: set A (100 A genes) and set B (100 B genes). I want to show that the genes in set A are more conserved and less divergent than those in set B. I built phylogenetic trees, and the branch lengths for set A are much longer than those for set B, but that does not seem to be a good way to compare. I also tried a network analysis based on protein sequence identity: at the same cutoff value, most genes from set A form one big network, whereas genes from set B form multiple networks.

Does anyone know a better, more quantitative way to compare them?

alignment sequence • 274 views
written 9 months ago by 465186528 (0)

Generate all-versus-all pairwise global alignments within set A and calculate the mean percent identity with its standard deviation. Do the same for the sequences within set B. Then, depending on the distribution of identities, apply a t-test or a Mann-Whitney test to see whether the difference between the two sets is statistically significant.
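
A minimal Python sketch of this approach, using only the standard library. The scoring scheme (match +1, mismatch -1, gap -2), the toy sequences, and all function names are assumptions for illustration; real protein work would normally use a substitution matrix such as BLOSUM62 and an established aligner. Percent identity is taken here as identical columns divided by alignment length including gaps, which is only one of several common definitions. The Mann-Whitney p-value uses the normal approximation with a tie correction and no continuity correction.

```python
from itertools import combinations
from math import erfc, sqrt
from statistics import mean, stdev


def global_identity(a, b, match=1, mismatch=-1, gap=-2):
    """Percent identity from a Needleman-Wunsch global alignment (toy scoring)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back one optimal path, counting identical columns and total columns.
    i, j, same, length = n, m, 0, 0
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            same += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
        length += 1
    return 100.0 * same / length


def pairwise_identities(seqs):
    """All-versus-all percent identities within one set."""
    return [global_identity(a, b) for a, b in combinations(seqs, 2)]


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected)."""
    pooled = sorted((v, g) for g, vals in enumerate((x, y)) for v in vals)
    n1, n2, n = len(x), len(y), len(x) + len(y)
    rank_sum_x, tie_term, i = 0.0, 0.0, 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of 1-based ranks i+1 .. j
        tie_term += (j - i) ** 3 - (j - i)
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0
    z = (u - n1 * n2 / 2.0) / sqrt(var_u)
    return u, erfc(abs(z) / sqrt(2))


# Hypothetical toy sequences standing in for the two protein sets.
set_a = ["MKTAYIAKQR", "MKTAYIAKQK", "MKTAYLAKQR", "MKSAYIAKQR"]
set_b = ["MKTAYIAKQR", "MRSGWLPHQE", "QKTLYVNKDR", "MWTCFIAGHR"]

ids_a = pairwise_identities(set_a)
ids_b = pairwise_identities(set_b)
u, p = mann_whitney_u(ids_a, ids_b)
print(f"A: mean {mean(ids_a):.1f}% (sd {stdev(ids_a):.1f})")
print(f"B: mean {mean(ids_b):.1f}% (sd {stdev(ids_b):.1f})")
print(f"Mann-Whitney U = {u:.1f}, two-sided p = {p:.4f}")
```

One caveat: pairwise identities within a set are not independent, because each sequence contributes to many pairs, so the p-value is best read as a rough guide.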

modified 9 months ago • written 9 months ago by a.zielezinski (8.6k)

Thanks a lot. This is a feasible way.

written 9 months ago by 465186528 (0)

Are all members of A orthologues of one another, and likewise for B?

If you got long branch lengths, that means either that A is the less conserved set or that your alignment isn't very good.

You could try dN/dS analyses.

written 9 months ago by jrj.healey (12k)

Thanks for your reply. Set A and set B come from two different Pfam protein families. Within each set, the proteins are similar to each other. On branch length, I agree with you: a longer branch does not directly imply conservation. dN/dS (or Ka/Ks) is used to show the balance of selection, so I guess it cannot help compare the degree of divergence of two different sets of proteins.

written 9 months ago by 465186528 (0)

dN/dS would tell you if one group is subject to more drift than the other, which implies less conservation, but I agree it isn't a direct measure.

What I mean by the branch length is (assuming your alignments are OK), you already have your answer - that A is more divergent than B, but it sounds like you are looking for data to confirm a hypothesis you've already decided the answer to...

I don't know why you think that isn't a good comparator?

written 9 months ago by jrj.healey (12k)

Thanks for your reply. I am sorry that I did not make it clear. When I say that the branch length of set A is longer than that of set B, I mean the scale bar of each tree. Branch lengths would be comparable if I could find a way to compare them; do you have any ideas about that?

Also, regarding your comment that 'you are looking for data to confirm a hypothesis you've already decided the answer to': yes, I am looking for a way to support an answer I already have. Most proteins in set A share over 50% identity with each other, which is not the case in set B. Thus, I think set A is more conserved. I then looked for an approach to prove it and asked this question on Biostar.
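
On comparing branch lengths rather than scale bars: the scale bar is only a drawing legend, while the branch lengths themselves (typically expected substitutions per site) are stored in the tree file. A minimal sketch, assuming the trees are available as plain Newick strings without quoted labels; the two tree strings and the function names below are hypothetical. The totals are only meaningful to compare if both trees were inferred with the same kind of model and units.

```python
import re
from statistics import mean

# Matches the number that follows each ':' in a plain Newick string.
# Assumption: no quoted labels containing ':' characters.
_BRANCH = re.compile(r":\s*([0-9]*\.?[0-9]+(?:[eE][+-]?[0-9]+)?)")


def branch_lengths(newick):
    """Return all branch lengths found in a Newick string as floats."""
    return [float(x) for x in _BRANCH.findall(newick)]


def tree_summary(newick):
    """Total tree length, mean branch length, and number of branches."""
    lengths = branch_lengths(newick)
    return {"total": sum(lengths), "mean": mean(lengths), "n": len(lengths)}


# Hypothetical trees standing in for set A and set B.
tree_a = "((a1:0.10,a2:0.12):0.05,(a3:0.08,a4:0.11):0.04);"
tree_b = "((b1:0.45,b2:0.52):0.20,(b3:0.61,b4:0.38):0.25);"

for name, tree in (("A", tree_a), ("B", tree_b)):
    s = tree_summary(tree)
    print(f"set {name}: {s['n']} branches, "
          f"total length {s['total']:.2f}, mean branch {s['mean']:.3f}")
```

The per-branch values from each tree could then be compared with the same kind of two-sample test suggested for the pairwise identities.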

written 9 months ago by 465186528 (0)

But if you know, through some means, that A are over 50% identical and B are not, and the scale bar on your tree is larger (which means your branch lengths should be too), then why not use the technique you’ve apparently already used, which has already given you the answer?

To say it another way, how do you already know A is more conserved than B before you test it?

written 9 months ago by jrj.healey (12k)

As you can see, the over-50% identity and the longer branches are preliminary observations. But I am looking for a quantitative way to show the difference clearly. A difference in scale bars alone, for example, is not strong proof. Reviewers, and even I, would have questions, such as whether the difference passes a statistical test. I now have one possible way to do it, as shown by @a.zielezinski.

written 9 months ago by 465186528 (0)