About Package ‘clValid’
9.7 years ago
mhorie25 ▴ 10

Hi

I am trying to run the self-organizing tree algorithm (SOTA) using the package 'clValid'.

I ran it on the sample data (mouse), and it worked.

However, I cannot extract the genes in each cluster (for example, the 45 genes in cluster 2 in the example below).

If anybody is familiar with this package, please show me how to do it.

Thank you in advance!

Regards,

library(clValid)
data(mouse)
express <- mouse[,c("M1","M2","M3","NC1","NC2","NC3")]
rownames(express) <- mouse$ID
sotaCl <- sota(as.matrix(express), 4)
names(sotaCl)
sotaCl
plot(sotaCl)
plot(sotaCl, cl=2)
rna-seq R gene • 2.8k views
9.7 years ago
dario.garvan ▴ 530

You should read the documentation for the function sota.

?sota

The result, sotaCl, has an element, clust, that holds the cluster number of each row of the input matrix (here, each gene).

sotaCl[["clust"]]
  [1] 1 2 2 1 1 2 2 1 2 2 5 2 1 4 2 1 4 5 2 1 2 1
 [23] 2 2 2 2 4 2 2 2 1 4 2 1 1 2 2 2 1 2 1 5 4 2
 [45] 2 5 2 2 2 2 1 2 5 4 2 1 2 2 1 2 2 4 1 2 2 2
 [67] 1 5 1 3 3 1 3 1 1 5 3 3 1 5 1 4 3 1 1 3 5 1
 [89] 3 1 3 3 3 3 3 3 3 3 3 1 3 3 3 3 3 3 5 3 3 3
[111] 3 3 1 3 1 3 5 5 3 3 1 1 4 3 1 3 3 3 1 1 1 1
[133] 1 3 3 3 3 3 3 1 1 3 3 5 1 3 3
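
A quick way to see how many genes landed in each cluster is to tabulate that vector. This is only a sketch: `clust_toy` is a made-up stand-in so the snippet runs without clValid. With the real result you would call `table(sotaCl[["clust"]])` instead.

```r
# Made-up cluster assignments (hypothetical; stands in for sotaCl[["clust"]])
clust_toy <- c(1, 2, 2, 1, 3, 2)

# Number of rows (genes) assigned to each cluster
cluster_sizes <- table(clust_toy)
print(cluster_sizes)

# Size of one cluster, looked up by its label
n_cluster2 <- cluster_sizes[["2"]]
print(n_cluster2)
```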

To get the row numbers of the genes in each cluster:

split(1:length(sotaCl[["clust"]]), sotaCl[["clust"]])
$`1`
 [1]   1   4   5   8  13  16  20  22  31  34  35
[12]  39  41  51  56  59  63  67  69  72  74  75
[23]  79  81  84  85  88  90 100 113 115 121 122
[34] 125 129 130 131 132 133 140 141 145

$`2`
 [1]  2  3  6  7  9 10 12 15 19 21 23 24 25 26 28
[16] 29 30 33 36 37 38 40 44 45 47 48 49 50 52 55
[31] 57 58 60 61 64 65 66

$`3`
 [1]  70  71  73  77  78  83  86  89  91  92  93
[12]  94  95  96  97  98  99 101 102 103 104 105
[23] 106 108 109 110 111 112 114 116 119 120 124
[34] 126 127 128 134 135 136 137 138 139 142 143
[45] 146 147

$`4`
[1]  14  17  27  32  43  54  62  82 123

$`5`
 [1]  11  18  42  46  53  68  76  80  87 107 117
[12] 118 144
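
The output above gives row positions. If gene names are wanted instead, one option is to split the row names rather than the positions. This is a minimal sketch, assuming the cluster vector is in the same order as the rows of the matrix that was clustered. `express_toy` and `clust_toy` are made-up stand-ins so the snippet runs without clValid. With the real objects you would use `express` and `sotaCl[["clust"]]`.

```r
# Toy stand-ins (hypothetical data, not taken from the mouse data set)
express_toy <- matrix(
  c(1.0, 1.2, 5.1, 5.3, 9.8,
    0.9, 1.1, 5.0, 5.4, 9.9),
  ncol = 2,
  dimnames = list(c("geneA", "geneB", "geneC", "geneD", "geneE"),
                  c("S1", "S2"))
)
clust_toy <- c(1, 1, 2, 2, 3)   # stands in for sotaCl[["clust"]]

# Gene names grouped by cluster: a list with one character vector per cluster
genes_by_cluster <- split(rownames(express_toy), clust_toy)

# Gene names in cluster 2
cluster2_genes <- genes_by_cluster[["2"]]

# Expression values for the same genes; drop = FALSE keeps a matrix
# even if the cluster contains a single gene
cluster2_expr <- express_toy[clust_toy == 2, , drop = FALSE]

print(cluster2_genes)
print(cluster2_expr)
```

Indexing the list by the cluster label ("2") rather than by position avoids picking the wrong cluster if the labels are not exactly 1, 2, 3, and so on.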

Dear Dario Garvan,

Thank you very much!

5.0 years ago
malina • 0

Some expansion on the code above, to get the gene names as well as the row numbers. Here raw.counts_merged_rep is my own expression data frame, whose rows are the ones that were clustered.

library(dplyr)

# Row numbers of the genes in each cluster, as a list with one element per cluster
gene.numbers <- split(seq_along(sotaCl[["clust"]]), sotaCl[["clust"]])

# Row numbers for an individual cluster (for example, cluster 4)
cluster4 <- gene.numbers[["4"]]

# Add the gene names (row names) to the expression data frame as an ordinary column
raw.counts_merged_rep$names <- rownames(raw.counts_merged_rep)

# Table with both expression values and gene names for cluster 4
cl4_table <- filter(raw.counts_merged_rep, row_number() %in% cluster4)

Can you clarify how this specifically relates to the original question and answer? This is a pretty old thread.


Hi genomax, I hope not. I was working with clValid and SOTA clustering on Friday. I found this answer very helpful and expanded on it so that it can be used directly to get not only the row numbers of the genes in each cluster but also the gene names.


Great, thanks for the clarification.
