
10 months ago

leranwangcs

Hi,

I'm trying to figure out how each step of Seurat works, but I don't have a deep math background, so I have trouble understanding what exactly FindNeighbors() and FindClusters() do. Can anyone provide a simplified explanation, please? For example, after FindNeighbors(), what scores are produced, and how are they then used by FindClusters()?

Thanks so much! Leran

`FindNeighbors()` and `FindClusters()` are the two Seurat functions that together perform unsupervised, graph-based clustering of single cell data.

FindNeighbors(): `FindNeighbors()` finds the nearest neighbors of each cell in your dataset. Given a cell, it identifies the `k.param` closest cells based on a similarity metric such as Euclidean distance or cosine similarity. It then calculates the neighborhood overlap (Jaccard index) between every cell and its `k.param` nearest neighbors. That Jaccard index is the score you asked about: the number of neighbors two cells share, divided by the total number of distinct neighbors the two cells have between them. It ranges from 0 (no shared neighbors) to 1 (identical neighborhoods). The result is a shared nearest neighbor (SNN) graph in which cells are nodes and the Jaccard values are edge weights. Nearest-neighbor search in general is also employed in applications such as anomaly detection and dimensionality reduction, because identifying similar points helps with making predictions and with understanding the distribution of the data.

FindClusters(): `FindClusters()` groups cells into clusters based on their similarity. It uses a graph-based clustering approach: it takes the graph built by `FindNeighbors()` and runs a modularity optimization algorithm on it (the Louvain algorithm by default), which looks for groups of cells that are more strongly connected to each other than to the rest of the graph. Clustering is an unsupervised learning technique, so similar cells are grouped together without any predefined labels. The goal is to find patterns and structure in your data. The number of clusters is not fixed in advance; it depends on the data and on the `resolution` parameter, where higher values give more clusters. Other common clustering algorithms include K-means, hierarchical clustering, and DBSCAN, but these are not what `FindClusters()` uses.

Relationship and Working: `FindNeighbors()` and `FindClusters()` are used in conjunction in single cell data analysis. Before `FindClusters()`, `FindNeighbors()` is used to establish the similarity between cells, which guides the clustering algorithm's decisions. The clusters returned by `FindClusters()` can then be used in downstream analysis, as well as for identifying potential outliers or anomalies.

In summary, while `FindNeighbors()` focuses on finding the nearest neighbors of each individual cell, `FindClusters()` deals with grouping multiple cells into clusters based on their similarities. The two steps complement each other: run `FindNeighbors()` first to build the graph, then `FindClusters()` to partition it.
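To make the two steps more concrete, here is a minimal toy sketch in plain Python. It is not Seurat's code (Seurat is an R package), and the helper names `knn_sets`, `jaccard_graph`, and `modularity` are hypothetical, made up for this illustration. It assumes Euclidean distance, counts each cell as its own nearest neighbor, and compares every pair of cells. The first two functions mimic the idea behind `FindNeighbors()` (k nearest neighbors, then Jaccard overlap as the edge weight), and the third computes modularity, the quantity that a Louvain-style algorithm tries to maximize when it searches for clusters.

```python
import math
from collections import defaultdict
from itertools import combinations


def knn_sets(points, k):
    """Return, for each point, the set of indices of its k nearest points.

    Toy assumption: Euclidean distance, and each point counts as its own
    nearest neighbour (distance 0).
    """
    n = len(points)
    return [
        set(sorted(range(n), key=lambda j: math.dist(points[i], points[j]))[:k])
        for i in range(n)
    ]


def jaccard_graph(neighbours, prune=0.0):
    """Weight each pair of cells by the Jaccard overlap of their neighbour sets.

    Pairs whose weight is <= prune get no edge.
    """
    edges = {}
    for i, j in combinations(range(len(neighbours)), 2):
        shared = len(neighbours[i] & neighbours[j])
        weight = shared / len(neighbours[i] | neighbours[j])
        if weight > prune:
            edges[(i, j)] = weight
    return edges


def modularity(edges, labels):
    """Weighted modularity: high when edge weight stays inside clusters."""
    total = sum(edges.values())
    within = sum(w for (i, j), w in edges.items() if labels[i] == labels[j])
    strength = defaultdict(float)  # summed edge weight touching each cluster
    for (i, j), w in edges.items():
        strength[labels[i]] += w
        strength[labels[j]] += w
    return within / total - sum((s / (2 * total)) ** 2 for s in strength.values())


# Eight toy "cells" in 2D forming two well-separated groups of four.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]

neighbours = knn_sets(points, k=3)
edges = jaccard_graph(neighbours)

good = [0, 0, 0, 0, 1, 1, 1, 1]  # one cluster per group
bad = [0, 1, 0, 1, 0, 1, 0, 1]   # clusters that cut across the groups

print(edges[(0, 1)])             # Jaccard weight between cells 0 and 1
print(modularity(edges, good))   # score of the sensible partition
print(modularity(edges, bad))    # score of the scrambled partition
```

In this toy data the partition that matches the two groups gets the higher modularity, which is why an optimizer that searches over partitions ends up recovering them. A real Louvain implementation does that search efficiently by repeatedly moving cells between clusters and merging clusters, which this sketch deliberately leaves out.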