8 weeks ago by

EMBL Heidelberg, Germany

What you refer to as FDG is not a dimensionality reduction method, it's a group of graph layout algorithms, i.e. algorithms to draw graphs that are based on considering edges as spring-like and applying forces to nodes along the edges, typically repulsive forces are applied to all nodes and attractive forces applied to adjacent nodes. Variants result from different choices of forces, e.g. based on electrical, magnetic or inertia properties. Typically this is used when the data is encoded in a similarity matrix that one can view as the adjacency matrix of a graph. In these algorithms, edge weights are taken as the strength of the relation between nodes so the more similar two nodes are, the closer they will be to each other in the drawing which can reveal some cluster structure.

Dimensionality reduction methods do what their name implies, i.e. they try to project the data into a lower dimensional space, the difference essentially lies with the objective function they try to minimize. PCA tries to maximize variance, multi-dimensional scaling tries to preserve point-to-point Euclidean distances, diffusion map preserves the diffusion distance (by simulating heat diffusion on the graph associated with the similarity matrix). For the objective functions behind t-SNE and UMAP, check this UMAP for t-SNE post. Dimensionality reduction aims at getting rid of noisy dimensions in the data so can be useful for revealing clusters. However, applying clustering in the new space should be done with caution as the chosen dimensionality reduction algorithm may not always preserve relevant structures (for example, one can imagine two clusters that could be separated along a variable with low variance, then PCA would probably miss them).