How can I use machine learning to understand cell heterogeneity from single-cell genomic data? What type of data analysis and prediction/association do I need to perform in order to extract important information about cell heterogeneity?
For supervised ML, one needs labeled data. That means we know ahead of time how each data point is classified (for example, which cell type each cell belongs to), and the computer learns patterns from that dataset. Do you have labeled data, and a good number of samples?
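To make the supervised case concrete, here is a minimal sketch using scikit-learn on simulated data. Everything in it is an assumption for illustration: the two cell types, the Poisson toy counts, and the choice of logistic regression are hypothetical stand-ins, not a recommendation for your data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_per_type, n_genes = 100, 50

# Hypothetical toy data: type B over-expresses the first 10 genes.
rates_a = np.full(n_genes, 2.0)
rates_b = rates_a.copy()
rates_b[:10] = 6.0
counts = np.vstack([
    rng.poisson(rates_a, (n_per_type, n_genes)),
    rng.poisson(rates_b, (n_per_type, n_genes)),
])

X = np.log1p(counts)              # simple log transform of the counts
y = np.repeat([0, 1], n_per_type)  # the labels we "know ahead of time"

# Hold out 30% of the cells to check how well the learned pattern generalizes.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

The point is only that the labels `y` must exist before training; without them this approach is not available.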
If the answer to my question is no, then you are presumably looking to do unsupervised ML, which is most often the case with scRNA-seq analyses. We can think of UMAP and t-SNE analyses as unsupervised ML: the dimensionality of the original data is reduced, and heterogeneity is then inferred from the clusters that appear in the 2D/3D plots (in practice, this is usually paired with an explicit clustering step on the reduced data). Unless you have labeled data and you are trying to teach a computer how to recognize a specific cell pattern, you will most likely end up doing what everyone else does with scRNA-seq data: dimensionality reduction followed by clustering.
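As a rough sketch of the unsupervised route, the snippet below normalizes a count matrix, reduces it with PCA, and clusters the cells with k-means. The simulated data, the function names `simulate_counts` and `cluster_cells`, and the use of PCA plus k-means are all assumptions made for illustration; real scRNA-seq workflows typically use dedicated tools and graph-based clustering, with UMAP or t-SNE for the final 2D plot.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import adjusted_rand_score


def simulate_counts(n_cells_per_type=100, n_genes=200, n_types=3, seed=0):
    """Toy counts: each hypothetical cell type over-expresses its own gene block."""
    rng = np.random.default_rng(seed)
    block = n_genes // n_types
    counts, labels = [], []
    for t in range(n_types):
        rates = np.full(n_genes, 1.0)
        rates[t * block:(t + 1) * block] = 8.0
        counts.append(rng.poisson(rates, size=(n_cells_per_type, n_genes)))
        labels += [t] * n_cells_per_type
    return np.vstack(counts), np.array(labels)


def cluster_cells(counts, n_clusters=3, n_components=10, seed=0):
    """Normalize, reduce dimensionality, then cluster without using any labels."""
    library_size = counts.sum(axis=1, keepdims=True)
    normalized = np.log1p(counts / library_size * 1e4)
    embedding = PCA(n_components=n_components, random_state=seed).fit_transform(normalized)
    clusters = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(embedding)
    return embedding, clusters


counts, true_types = simulate_counts()
embedding, clusters = cluster_cells(counts)

# The true types are used only to score the result, never to fit the model.
ari = adjusted_rand_score(true_types, clusters)
print(f"adjusted Rand index vs. simulated types: {ari:.2f}")
```

With real data there is no `true_types` to compare against, so the clusters have to be interpreted afterwards, for example by looking at which genes distinguish them.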