Question: Genes for clustering
16 months ago, lirongrossmann wrote:

Hi everyone, I have a group of samples that is supposed to be biologically homogeneous. I want to cluster the genes to see which are highly expressed and which are lowly expressed across all samples. I tried hierarchical clustering, but it got stuck because there are so many genes. I don't want to use PCA because I want to capture the genes that are uniformly expressed across the samples, not the ones that are most variable. Any suggestions on how to choose the genes to cluster for this purpose? Thanks

written 16 months ago by lirongrossmann
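One way to make the stated goal concrete, without clustering at all, is to compute a mean and a standard deviation per gene on the log scale and label the genes that vary little. The sketch below is only an illustration under assumptions that are not from the post: the input is a dict of already normalized expression values (for example CPM or TPM) with at least two samples per gene, and the names `label_uniform_genes`, `max_sd` and `high_cutoff`, as well as their default values, are hypothetical and would need tuning on real data.

```python
import math
import statistics


def gene_summary(values, pseudocount=1.0):
    """Return (mean, sd) of log2(value + pseudocount) for one gene."""
    logged = [math.log2(v + pseudocount) for v in values]
    return statistics.mean(logged), statistics.stdev(logged)


def label_uniform_genes(expr, max_sd=0.5, high_cutoff=5.0):
    """Label each gene as 'high', 'low' or 'variable'.

    expr: dict mapping gene name -> list of normalized expression values.
    max_sd and high_cutoff are hypothetical thresholds on the log2 scale.
    """
    labels = {}
    for gene, values in expr.items():
        mean, sd = gene_summary(values)
        if sd > max_sd:
            labels[gene] = "variable"   # not uniform across samples
        elif mean >= high_cutoff:
            labels[gene] = "high"       # uniformly high
        else:
            labels[gene] = "low"        # uniformly low
    return labels


# Toy data, not real measurements.
example = {
    "geneA": [1, 1, 1, 1],
    "geneB": [100, 100, 100, 100],
    "geneC": [1, 1000, 1, 1000],
}
print(label_uniform_genes(example))
```

This scales linearly with the number of genes, so 30,000 genes are not a problem; for real data a vectorized library would likely be more convenient than plain Python.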

What is the data? What do you mean by "hierarchical clustering got stuck"?

written 16 months ago by Jean-Karim Heriche

The data is RNA-seq. There were too many genes, and the program is still running. It runs fine with fewer genes (I have 30,000). Thanks

written 16 months ago by lirongrossmann

It should not take that much time. However, you could filter out the non-variable genes and hope that this reduces the number enough.

written 16 months ago by Puli Chandramouli Reddy
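The filtering suggested above could look roughly like the following. This is a minimal sketch, assuming log-scale expression values stored in a dict with at least two samples per gene; the function name `top_variable_genes` and the `keep_fraction` parameter are made up for illustration. Note that it keeps the most variable genes, which reduces the problem size but is the opposite of the uniformly expressed genes the original question is after.

```python
import statistics


def top_variable_genes(log_expr, keep_fraction=0.25):
    """Return the names of the most variable genes, most variable first.

    log_expr: dict mapping gene name -> list of log-scale expression values.
    keep_fraction: hypothetical fraction of genes to keep (at least one gene).
    """
    # Sample variance of each gene across samples.
    variances = {gene: statistics.variance(v) for gene, v in log_expr.items()}
    n_keep = max(1, round(keep_fraction * len(variances)))
    ranked = sorted(variances, key=variances.get, reverse=True)
    return ranked[:n_keep]


# Toy data, not real measurements.
example = {
    "g1": [1, 1, 1],
    "g2": [0, 5, 10],
    "g3": [2, 2, 3],
    "g4": [0, 1, 2],
}
print(top_variable_genes(example, keep_fraction=0.5))
```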

What is the size of the data, how much RAM does your computer have, and which algorithm and implementation are you using? I presume the data is a 30000 x p matrix. What is p? Even for large p, this shouldn't take long unless your computer is underpowered (i.e. not enough RAM) and/or you are using an inefficient implementation of the algorithm.

written 16 months ago by Jean-Karim Heriche
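To put a number on the RAM question: many hierarchical clustering implementations start from all pairwise distances between the items being clustered, and that size depends on the number of genes, not on p. The back-of-the-envelope sketch below assumes a condensed (upper-triangle) distance matrix stored as 8-byte floats; both are assumptions, and a given implementation may store more (a full square matrix) or less.

```python
def condensed_distance_bytes(n_items, bytes_per_value=8):
    """Bytes needed to store one distance per unordered pair of items."""
    n_pairs = n_items * (n_items - 1) // 2
    return n_pairs * bytes_per_value


n_genes = 30_000
size_gb = condensed_distance_bytes(n_genes) / 1e9
print(f"{n_genes} genes -> about {size_gb:.1f} GB for pairwise distances")
# prints: 30000 genes -> about 3.6 GB for pairwise distances
```

On a machine with only a few GB of free memory, this alone could explain a run that appears to hang while the system swaps.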