I have 25 samples from TCGA, containing RNA sequencing expression data from 25 different clinical tumor biopsies. I want to cluster them by expression similarity. The problem is that there are no conditions or replicates from which to build an experimental design to feed into DESeq or edgeR. I also tried things like Perl's KMeans library and R's built-in kmeans(). The problem there is that each sample has 25K expression values (one per gene, 25K genes), so the feature vectors are very large, and I don't think I am getting anything useful.
Does anyone have any advice on clustering data sets with very large feature vectors and/or clustering expression data without biological conditions?