Hierarchical clustering tool clustering only rows.
8.2 years ago
orzech_mag ▴ 230

Hi guys!

I would like to analyse my data, which consist of grouped patients in columns and genes with their expression values in rows. I therefore need a tool that does not cluster the columns, so that the order of the patients is maintained. In addition, the tool should give me only the best results, not all of them, because the spreadsheet I use is very big. Any suggestions, please?

Best regards!

RNA-Seq software clustering • 3.0k views

Why don't you just transpose the matrix you have and do clustering?

8.2 years ago

Most clustering algorithms cluster along only one dimension and produce only one solution per run, usually the best one according to the algorithm's criteria. Depending on the implementation, they cluster either rows or columns, so you may need to transpose your data matrix first. As a starting point for exploratory data analysis, I usually do hierarchical clustering with average and/or complete linkage. This is available in R through the hclust function, which is part of the stats package.
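
To make the "one dimension only" point concrete, here is a minimal sketch in plain Python of average-linkage agglomerative clustering applied to the rows of a matrix. The function name `cluster_rows` and the toy matrix are made up for illustration, and Euclidean distance is an assumption. The columns (patients) are never reordered; only row indices are grouped. This naive version is far too slow for 20500 rows, so treat it as an explanation of the idea rather than a tool to run on real data.

```python
from itertools import combinations
from math import dist


def cluster_rows(matrix, n_clusters):
    """Average-linkage agglomerative clustering of the rows of `matrix`.

    Returns a list of clusters, each a sorted list of row indices.
    Column order is never changed.
    """
    # Start with every row in its own cluster.
    clusters = [[i] for i in range(len(matrix))]

    def average_distance(a, b):
        # Mean Euclidean distance over all pairs of rows taken from a and b.
        total = sum(dist(matrix[i], matrix[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the two closest clusters (x < y always holds here) and merge them.
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_distance(clusters[p[0]], clusters[p[1]]),
        )
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]


# Hypothetical toy data: 4 genes (rows) x 3 patients (columns).
expression = [
    [1.0, 1.1, 0.9],
    [5.0, 5.2, 4.8],
    [1.2, 1.0, 1.1],
    [5.1, 4.9, 5.0],
]
groups = cluster_rows(expression, n_clusters=2)
print(groups)
```

On this toy matrix the two low-expression genes (rows 0 and 2) end up together and the two high-expression genes (rows 1 and 3) end up together, while the patient columns keep their original order.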

My spreadsheet is 600 columns x 20500 rows. I would like to avoid the transposition, because I am interested only in the best and most interesting shots. I used SparseHierarchicalClustering @ GenePattern (with complete linkage). It returned only about 20-50 of the best shots; however, the columns were sorted, so my groups of patients were mixed. The HierarchicalClustering module @ GP, in turn, returned all of the rows clustered, so it is hard to see anything within 20500 rows. I do not mind using R, but preferably I would like to avoid the transposition.

I don't understand what you call "best shots". Sparse hierarchical clustering does feature selection to use only a subset of features for the clustering. Assuming that the implementation treats columns as features, you've clustered the genes using only a subset of the patients which are then ranked based on their contribution to the clustering. If you want to cluster the patients, then transpose the matrix as venu suggested in their comment.
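
If the aim is simply to reduce the 20500 genes to a short list before clustering the rows, one simple option is a plain variance filter. The sketch below is not sparse hierarchical clustering (which selects features as part of the clustering itself); it is a much simpler stand-in, and the function name `top_variable_rows`, the cutoff `k`, and the toy matrix are all hypothetical. It keeps the rows that vary most across patients and leaves the column order alone.

```python
from statistics import pvariance


def top_variable_rows(matrix, k):
    """Return the indices of the k rows with the highest variance across columns."""
    ranked = sorted(
        range(len(matrix)),
        key=lambda i: pvariance(matrix[i]),
        reverse=True,
    )
    # Sort the kept indices so the rows stay in their original relative order.
    return sorted(ranked[:k])


# Hypothetical toy data: 4 genes (rows) x 4 patients (columns, two groups of two).
expression = [
    [2.0, 2.0, 2.1, 2.0],  # nearly flat across patients
    [1.0, 1.2, 6.0, 6.3],  # differs between the two patient groups
    [3.0, 3.1, 3.0, 2.9],  # nearly flat across patients
    [7.0, 6.8, 1.0, 1.1],  # differs between the two patient groups
]
keep = top_variable_rows(expression, k=2)
filtered = [expression[i] for i in keep]  # columns are untouched
print(keep)
```

The filtered matrix can then be passed to any row-wise clustering routine, so the result stays small enough to inspect.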

8.2 years ago

If you have our Bionumerics software, you can limit the genes to only the polymorphic characters and then perform a clustering on the characters only. This will give you a heatmap according to your groups of patients.
