Question: How to reproduce MATLAB tsne('algorithm', 'exact') result in R?
mk90 wrote (11 months ago):

Given:

  1. a 25x25 matrix of integers
  2. an initial 25x3 embedding, generated by PCA
  3. perplexity 1
  4. random seed set to 1
  5. target embedding dimension 3

Run exact t-SNE using MATLAB's tsne('Algorithm', 'exact') and R's Rtsne(theta = 0).

Here is the data "A":

0 0 0 0 0 97 0 0 0 0 0 0 0 93 67 0 0 0 24 63 0 81 69 0 63
0 35 18 12 0 36 0 89 0 15 23 69 0 54 56 36 0 0 0 90 0 0 37 0 12
0 31 17 17 64 80 0 0 0 0 0 0 23 0 0 0 0 69 83 78 94 0 0 93 40
0 0 0 0 0 0 0 0 74 76 0 0 70 31 12 60 92 0 99 16 53 19 0 3 0
0 0 7 85 84 0 0 0 0 0 0 0 0 0 0 0 0 0 64 0 0 0 0 4 33
35 0 77 0 52 0 0 0 0 0 64 70 0 5 0 0 48 93 0 0 92 0 0 2 0
0 0 0 0 0 25 58 0 0 0 46 0 0 88 0 79 0 60 0 23 0 0 81 33 62
5 91 65 0 0 0 0 0 38 72 0 0 75 0 0 0 0 21 48 0 0 32 0 0 0
14 0 40 0 35 0 0 81 94 51 21 55 43 0 0 30 0 77 56 0 0 0 85 0 4
9 0 0 0 0 0 0 0 8 0 86 36 98 0 0 64 0 87 0 0 32 0 88 7 23
0 96 100 0 0 55 0 0 0 0 73 0 0 0 0 0 68 51 0 0 81 0 92 0 0
59 0 93 0 75 12 0 22 0 0 0 0 13 67 0 0 0 67 0 0 71 82 0 0 0
0 0 79 16 35 0 0 0 99 84 89 0 0 26 0 99 0 8 65 81 77 97 0 13 0
0 0 7 97 0 0 0 63 51 29 0 0 0 0 39 38 44 0 0 0 23 0 18 79 76
0 50 0 9 31 0 0 0 57 0 0 61 0 0 0 0 0 95 0 82 35 0 38 85 0
12 0 0 0 0 0 64 0 38 80 0 43 0 26 0 0 0 0 0 0 73 17 39 7 93
25 0 29 0 0 0 31 0 0 73 0 0 0 0 0 8 0 98 0 0 66 0 0 0 61
89 0 29 33 65 72 0 0 18 60 0 0 0 0 63 0 36 0 0 0 0 0 0 0 61
0 0 0 0 22 91 0 0 0 0 49 0 0 0 0 54 7 0 0 0 0 0 0 50 0
0 0 82 51 3 0 0 74 0 0 0 100 57 0 83 0 0 0 0 0 0 0 94 89 0
0 0 65 0 23 33 0 13 5 90 0 0 0 0 0 0 0 81 0 10 0 0 0 5 5
0 0 0 0 56 0 0 0 30 0 0 98 0 78 0 63 0 0 12 42 11 0 0 0 0
0 0 0 72 41 0 0 0 0 53 0 0 19 1 0 0 63 0 0 0 0 11 0 15 0
0 0 4 0 0 31 14 0 0 0 0 85 0 100 0 0 0 70 16 30 98 0 31 0 0
0 0 0 38 0 71 0 85 0 0 53 0 87 0 0 51 59 0 0 0 0 0 0 0 24

Here is the initial embedding "Y":

0.254 -0.440 -0.402
0.131 0.095 0.282
0.331 -0.166 0.242
-0.661 -0.449 -0.110
0.256 -0.522 -0.204
-0.126 0.079 -0.333
0.233 0.314 -0.400
-0.653 0.153 0.194
0.099 -0.149 0.602
0.192 -0.492 0.206
0.178 0.286 0.478
-0.092 0.541 0.012
-0.312 -0.013 0.587
0.252 0.505 -0.313
-0.603 0.079 -0.297
-0.059 0.255 0.493
-0.211 -0.412 0.274
0.569 0.253 0.011
0.158 -0.291 0.471
0.146 0.386 0.058
0.684 0.052 0.038
0.374 -0.120 0.110
-0.217 0.691 0.078
-0.367 0.142 -0.042
-0.155 -0.025 -0.597

Now generate a t-SNE embedding using R's Rtsne(). Notice that "theta = 0" gives the exact algorithm rather than Barnes-Hut:

M = Rtsne(A, perplexity = 1, Y_init = Y, theta = 0, max_iter = 1000, dims = 3, pca = FALSE)

18.3306291   -6.2506041 -62.3268525
15.9386942   -3.1369803 -60.5240895
-55.0534150 -176.3234448 -26.9564002
2.8059395  -26.5137681  60.3847045
36.8003669  116.2724964 -44.7099789
-41.8265624   82.2428220   2.4331879
25.7039137  -31.0979402  47.7070508
6.7846280  -24.5332260  59.7495021
17.6887619  -26.3297157  53.9600569
23.9006623  -30.0315023  49.1130745
-43.6656649   83.6832572   5.8279631
-40.6721409   81.3381834   0.2965082
-0.7597657  -28.2349362  60.9845080
32.4991289  120.2894942 -47.4589986
-57.5001125 -176.7817773 -26.9090314
14.4104082  -15.9172413  61.2646084
12.6062537  -19.6219765  59.9587861
39.8651878  119.5524683 -49.4049012
-52.8444181  -65.8360785   8.1726297
12.0636781   -2.5291677 -62.3328198
11.2964251  -21.9339896  59.2367094
17.6280749   -0.2864139 -56.4086955
36.4835826  118.1396018 -46.6151879
18.4594930    1.1163553 -54.3799843
-50.9437484  -67.2759159   8.9376503

Now we generate an embedding using MATLAB:

M = tsne(A,'Algorithm','exact','NumPCAComponents',0, 'Perplexity', 1, 'InitialY', Y, 'NumDimensions', 3)

1.0e+03 *

0.1524    0.3834   -0.1735
0.1637    0.3731   -0.1781
-0.8293    0.3517    0.1845
0.3783   -0.0525    0.3495
-0.0068   -0.2998   -0.6433
-0.3656   -0.2723    0.1051
0.4675   -0.0788    0.3194
0.3858   -0.0658    0.3431
0.4267   -0.0795    0.3288
0.4579   -0.0790    0.3216
-0.3665   -0.2741    0.0900
-0.3651   -0.2712    0.1151
0.3713   -0.0407    0.3552
-0.0123   -0.2780   -0.6524
-0.8293    0.3517    0.1942
0.3862   -0.1062    0.3291
0.3909   -0.0915    0.3330
0.0137   -0.2877   -0.6476
-1.0500   -0.2379    0.0832
0.1678    0.3622   -0.1670
0.3933   -0.0817    0.3359
0.1718    0.3737   -0.1953
-0.0029   -0.2909   -0.6468
0.1761    0.3740   -0.2044
-1.0499   -0.2378    0.0930

Clearly these embeddings are not equivalent. Given an initial embedding, t-SNE should be repeatable. What am I overlooking?
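
One way to check "equivalent" more carefully than comparing raw coordinates: the t-SNE cost depends only on pairwise distances between embedded points, so two embeddings that differ by a rotation, reflection, or shift are equally good solutions. Below is a minimal base-R sketch of that comparison. The helper name compare_embeddings is made up for illustration, the Spearman rank correlation is just one possible choice of score, and the toy data is random rather than the embeddings from this post.

```r
# Compare two embeddings through their pairwise distances, which do not
# change when the coordinates are rotated, reflected, or shifted.
# compare_embeddings is a hypothetical helper, not part of any package.
compare_embeddings <- function(Y1, Y2) {
  d1 <- as.vector(dist(Y1))  # all pairwise Euclidean distances in Y1
  d2 <- as.vector(dist(Y2))  # same pairs, in the same order, for Y2
  cor(d1, d2, method = "spearman")
}

# Toy check on random data: rotate a 25 x 3 embedding about the z axis
# and shift it, then compare it with the original.
set.seed(1)
Y_toy <- matrix(rnorm(75), nrow = 25, ncol = 3)
angle <- pi / 3
R_z <- matrix(c(cos(angle), -sin(angle), 0,
                sin(angle),  cos(angle), 0,
                0,           0,          1), nrow = 3, byrow = TRUE)
Y_moved <- sweep(Y_toy %*% R_z, 2, c(10, -5, 2), "+")

score <- compare_embeddings(Y_toy, Y_moved)
print(score)
```

For the toy pair the score should come out at or very near 1, even though the coordinates look nothing alike. Running the same helper on the R and MATLAB outputs would show whether they differ only by such a transformation or in their actual neighborhood structure.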

Tags: rtsne, R, matlab, pca, tsne

t-SNE (t-distributed Stochastic Neighbor Embedding) starts from a random initialization by default, which is why each run of the algorithm gives slightly different results. To make the results reproducible in R, call set.seed(x) before running it, where x is a numeric value. I think it is really hard to get identical results across different software packages.
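
A small base-R sketch of what set.seed() buys you, assuming the starting embedding is drawn with rnorm(). The helper random_init and the standard deviation of 1e-4 are illustrative choices only, not taken from Rtsne or MATLAB.

```r
# Draw a random n x d starting embedding after fixing the seed.
# random_init is a hypothetical helper; sd = 1e-4 is an arbitrary small value.
random_init <- function(n, d, seed) {
  set.seed(seed)
  matrix(rnorm(n * d, sd = 1e-4), nrow = n, ncol = d)
}

Y_a <- random_init(25, 3, seed = 1)
Y_b <- random_init(25, 3, seed = 1)  # same seed as Y_a
Y_c <- random_init(25, 3, seed = 2)  # different seed

same_seed_match <- identical(Y_a, Y_b)
diff_seed_match <- identical(Y_a, Y_c)
print(c(same_seed_match, diff_seed_match))
```

The same seed reproduces the starting point exactly and a different seed does not. A matrix like this could be passed as Y_init in the question's Rtsne() call. Note that fixing the seed in R says nothing about what random numbers MATLAB draws, so it only helps with run-to-run repeatability inside R.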

written 11 months ago by arta540