Question: ConsensusClusterPlus for small sample
15 months ago by
archie90
India
archie90 wrote:

Hello everyone

I want to identify subgroups in a cancer dataset. Before using ConsensusClusterPlus, I have a few questions:

1) What is the minimum sample size required to run ConsensusClusterPlus? I have data for 19 samples. Should I use it for clustering, or should I go for another method (e.g. PCA)?

2) While going through the manual, I found that it takes expression values as input (normalised or unnormalised?). I also have z-scores (derived from the expression data) obtained from another analysis of the same 19 samples. Can I use the z-scores directly and perform clustering with ConsensusClusterPlus?

I will appreciate all suggestions.

Thanks

consensusclusterplus • 514 views
15 months ago by
Ahill (1.8k)
United States
Ahill wrote:

You can use consensus clustering on 19 samples; there is no intrinsic minimum sample size. Typically, for this and other clustering methods, the results depend heavily on how you select informative genes. See the ConsensusClusterPlus manual for one gene selection approach: picking the most variable genes by MAD (median absolute deviation). Data should be normalized. Z-scores might be OK, but that depends on how they were computed (sample-wise or gene-wise?). If you are using a typical expression readout (such as normalized read counts from RNA-Seq or normalized array intensities), then using those normalized expression levels (not the Z-scores) for an informative subset of the genes, with a Pearson correlation distance measure, is probably a good place to start looking for sample groupings.
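
To make the two ideas above concrete, here is a minimal sketch in plain Python (not the ConsensusClusterPlus API, which is an R package): rank genes by MAD, keep the top ones, and compute a Pearson correlation distance (1 - r) between samples. The function names and the toy expression values are assumptions made up for illustration; real input would be your normalized expression matrix.

```python
from math import sqrt
from statistics import mean, median


def mad(values):
    """Unscaled median absolute deviation of one gene across samples.

    R's mad() multiplies by 1.4826 by default; that constant does not
    change the ranking of genes, so it is omitted here.
    """
    m = median(values)
    return median(abs(v - m) for v in values)


def top_mad_genes(expr, n_top):
    """expr maps gene name -> list of normalized values, one per sample."""
    ranked = sorted(expr, key=lambda g: mad(expr[g]), reverse=True)
    return ranked[:n_top]


def pearson_distance(x, y):
    """1 - Pearson correlation between two equal-length profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)


def sample_distances(expr, genes):
    """Pairwise Pearson distance between samples over the chosen genes."""
    n_samples = len(expr[genes[0]])
    profiles = [[expr[g][s] for g in genes] for s in range(n_samples)]
    return [[pearson_distance(p, q) for q in profiles] for p in profiles]


# Toy data, invented for illustration: 4 genes x 4 samples.
toy_expr = {
    "geneA": [1.0, 1.1, 5.0, 5.2],
    "geneB": [2.0, 2.0, 2.1, 2.0],
    "geneC": [8.0, 7.5, 1.0, 1.5],
    "geneD": [3.0, 3.1, 3.0, 3.2],
}

selected = top_mad_genes(toy_expr, 3)        # most variable genes first
dist = sample_distances(toy_expr, selected)  # 4 x 4 sample distance matrix
```

The distance matrix is what a hierarchical or consensus clustering step would then consume. With only 19 samples, how many genes you keep is a judgment call, and it is worth checking that the groupings are stable across a few choices.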


Hi Ahill

Thanks for your valuable answer.

I will go with the MAD approach for variable selection. The z-scores were computed sample-wise. First, I will start with the normalised intensities and try to get results. Then I will try to get subgroups with the z-scores, and later see how much the results from the two approaches agree. I have a bit of a problem understanding the plots and results. I just ran the sample data through ConsensusClusterPlus and got the following results.

k  cluster  clusterConsensus
2  1        0.90794831578128
2  2        0.758432628514517
3  1        0.624620046443652
3  2        0.911135863955618
3  3        0.986412256470072
4  1        0.890835574988102
4  2        0.886960582630877
4  3        0.666394932640416
4  4        0.98295225849986
5  1        0.86123474251129
5  2        0.884872156152216
5  3        0.556828374192177
5  4        0.839098318290865
5  5        1
6  1        0.825649752799388
6  2        0.937773728911312
6  3        0.649644539921365
6  4        0.726792776419238
6  5        0.698201730147844
6  6        1
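
One way to read a cluster-consensus table like the one above is to summarize it per k. The sketch below, in plain Python, computes the mean and the minimum cluster consensus for each k from the posted values. Using these two summaries to compare k is an assumption made for illustration, not a rule from the ConsensusClusterPlus documentation, and `summarize_by_k` is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import mean

# (k, cluster, clusterConsensus) rows copied from the table above.
cluster_consensus = [
    (2, 1, 0.90794831578128), (2, 2, 0.758432628514517),
    (3, 1, 0.624620046443652), (3, 2, 0.911135863955618),
    (3, 3, 0.986412256470072),
    (4, 1, 0.890835574988102), (4, 2, 0.886960582630877),
    (4, 3, 0.666394932640416), (4, 4, 0.98295225849986),
    (5, 1, 0.86123474251129), (5, 2, 0.884872156152216),
    (5, 3, 0.556828374192177), (5, 4, 0.839098318290865),
    (5, 5, 1.0),
    (6, 1, 0.825649752799388), (6, 2, 0.937773728911312),
    (6, 3, 0.649644539921365), (6, 4, 0.726792776419238),
    (6, 5, 0.698201730147844), (6, 6, 1.0),
]


def summarize_by_k(rows):
    """Group cluster-consensus values by k and report mean and minimum."""
    by_k = defaultdict(list)
    for k, _cluster, value in rows:
        by_k[k].append(value)
    return {k: {"mean": mean(v), "min": min(v)}
            for k, v in sorted(by_k.items())}


summary = summarize_by_k(cluster_consensus)
for k, stats in summary.items():
    print(f"k={k}: mean={stats['mean']:.3f}  min={stats['min']:.3f}")

# Two simple heuristics (assumptions, not package rules):
best_by_mean = max(summary, key=lambda k: summary[k]["mean"])
best_by_min = max(summary, key=lambda k: summary[k]["min"])
```

On these values the highest mean is at k=4 and the highest minimum is at k=2, so the two heuristics disagree. That is a hint that no single threshold on these numbers settles the choice of k; the CDF and delta-area plots that ConsensusClusterPlus produces are worth reading alongside them.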

How do I decide on k and the sample membership based on the clusterConsensus values? Is there a need to fix a threshold and then choose a specific k?

Similarly, how do I decide the item membership based on these results?

    k  cluster  item   itemConsensus
1   2  1        28031  0.5002183
2   2  1        28003  0.4185504
3   2  1        28042  0.4727976
4   2  1        43012  0.5462791
5   2  1        LAL5   0.4682668
6   2  1        08018  0.5090733
7   2  1        57001  0.5897417
8   2  1        22010  0.5834408
9   2  1        01007  0.2090324
10  2  1        01003  0.2036311
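
For item membership, one simple reading of an item-consensus table is to give each item to the cluster where its itemConsensus is highest at the chosen k. The sketch below does that in plain Python; `best_cluster_per_item` is a hypothetical helper, and picking the maximum is an assumption for illustration, not the package's own assignment rule. Only the ten cluster-1 rows shown above are used, so the result here is necessarily cluster 1 for every item; the full table would also carry rows for cluster 2.

```python
# (k, cluster, item, itemConsensus) rows copied from the table above.
# Only the cluster-1 rows at k=2 were posted.
item_consensus = [
    (2, 1, "28031", 0.5002183),
    (2, 1, "28003", 0.4185504),
    (2, 1, "28042", 0.4727976),
    (2, 1, "43012", 0.5462791),
    (2, 1, "LAL5", 0.4682668),
    (2, 1, "08018", 0.5090733),
    (2, 1, "57001", 0.5897417),
    (2, 1, "22010", 0.5834408),
    (2, 1, "01007", 0.2090324),
    (2, 1, "01003", 0.2036311),
]


def best_cluster_per_item(rows, k):
    """For one k, map each item to (cluster, itemConsensus) at its maximum."""
    best = {}
    for row_k, cluster, item, value in rows:
        if row_k != k:
            continue
        if item not in best or value > best[item][1]:
            best[item] = (cluster, value)
    return best


assignments = best_cluster_per_item(item_consensus, k=2)

# List items from weakest to strongest consensus with their best cluster.
for item, (cluster, value) in sorted(assignments.items(),
                                     key=lambda kv: kv[1][1]):
    print(f"{item}: cluster {cluster}, itemConsensus {value:.3f}")
```

Items that sit at the weak end of this list are the ones whose membership deserves a second look once the rows for the other cluster are included.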

Thanks in advance A

12 months ago by
chris86 (330)
United Kingdom, London
chris86 wrote:

I wouldn't bother with consensus clustering at a sample size that small, because your statistical power is poor. IMO it is better to use standard hierarchical clustering with aheatmap or ComplexHeatmap.

I usually use consensus clustering with larger sample sizes, at least 60 or so. I think the sample-resampling approach the Monti algorithm uses isn't going to behave well until you have a larger number of samples.
