Consistency of group memberships across variables
6.8 years ago
mforde84 ★ 1.4k

Hi,

I ran similarity network fusion on mRNA and miRNA data, and I'm generating a variety of candidate clusterings, each with between 2 and 5 clusters. I'm interested in testing for membership similarity between multiple categorical variables, in particular those clusterings that predict the same optimal number of clusters.

For instance, let's say that two categorical variables with 3 levels have the following membership:

Group1: 1, 1, 2, 2, 3, 3

Group2: 2, 2, 1, 1, 3, 3

I want to test how consistently samples group together across these variables. The labels (1, 2, 3) are arbitrary and strictly qualitative. In this instance it would be a perfect match, because label 1 in Group1 maps one-to-one onto label 2 in Group2 (and 2 onto 1), while 3 maps onto 3.

Is there a statistical test that I can apply here? I had read that chi-square might be appropriate, but I'm still a little fuzzy on how to interpret it in my application, since I don't think it accounts for the semantic equivalence between 1 and 2 in the different groups.

Any suggestions?

membership • 1.5k views

Well? Does anyone have any suggestions? I mean, come now, this isn't Stack Exchange, guys.


The simplest thing to do is to make a stacked bar plot of your data and look at the grouped distribution. You should recode your samples to avoid semantic issues. I don't think any statistic will 'help' you in this matter. At this point your data seem to be based purely on frequencies across a small number of groups and a small number of samples...


Sounds reasonable. If I could recode them properly, I could even do a contingency table.
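For what it's worth, the contingency table may not need any recoding at all, since it only counts which label pairs co-occur. Below is a minimal sketch using the toy Group1/Group2 vectors from the question; the helper names `contingency` and `is_perfect_match` are made up for illustration. A perfect match, in the sense described in the question, would show up as each Group1 label pairing with exactly one Group2 label and vice versa.

```python
from collections import Counter

def contingency(a, b):
    """Count how many samples fall in each (label in a, label in b) pair."""
    if len(a) != len(b):
        raise ValueError("both label vectors must cover the same samples")
    return Counter(zip(a, b))

def is_perfect_match(a, b):
    """True when every label in a pairs with exactly one label in b and vice versa."""
    pairs = set(contingency(a, b))
    labels_a = {x for x, _ in pairs}
    labels_b = {y for _, y in pairs}
    return len(labels_a) == len(pairs) == len(labels_b)

# Toy data from the question; label names are arbitrary.
group1 = [1, 1, 2, 2, 3, 3]
group2 = [2, 2, 1, 1, 3, 3]

table = contingency(group1, group2)
for (g1, g2), n in sorted(table.items()):
    print(f"Group1={g1} Group2={g2} n={n}")
print("perfect match:", is_perfect_match(group1, group2))
```

Only the nonzero cells are stored, so the label values themselves never need to agree between the two variables.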

6.8 years ago

Your problem amounts to measuring similarity between sets. There are plenty of similarity measures for sets (e.g. Jaccard index), and you can get a p-value for the overlap between two sets using the hypergeometric distribution.
As for the semantic relationship, only you can tell how to account for it since we have no information on this. The standard way of dealing with semantic relations is through ontologies.
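As a rough sketch of the set-based idea, each cluster can be treated as a set of sample indices, and any cluster from one variable can be compared with any cluster from the other. The function names (`members`, `jaccard`, `hypergeom_sf`) are invented for this example, the data are the toy vectors from the question, and the p-value is the upper tail of the hypergeometric distribution computed with the standard library only. With six samples this is purely illustrative.

```python
from math import comb

def members(labels, level):
    """Set of sample indices assigned to the given cluster label."""
    return {i for i, lab in enumerate(labels) if lab == level}

def jaccard(s, t):
    """Jaccard index |s & t| / |s | t| (defined as 1.0 for two empty sets)."""
    union = s | t
    return len(s & t) / len(union) if union else 1.0

def hypergeom_sf(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(population N, K marked items, n draws)."""
    upper = min(K, n)
    total = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return total / comb(N, n)

# Toy data from the question; label names are arbitrary.
group1 = [1, 1, 2, 2, 3, 3]
group2 = [2, 2, 1, 1, 3, 3]

s = members(group1, 1)   # samples in cluster 1 of Group1
t = members(group2, 2)   # samples in cluster 2 of Group2
overlap = len(s & t)

j = jaccard(s, t)
p = hypergeom_sf(overlap, len(group1), len(s), len(t))
print(f"Jaccard = {j:.2f}, overlap = {overlap}, p = {p:.4f}")
```

Running this over every pair of clusters gives a matrix of overlaps, and because the comparison is between sets of samples, the label names play no role.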
