I read the paper from the Quake lab about using single-cell RNA-seq to find new cell lineage markers in lung development. Their method uses PCA (principal component analysis) to select genes for unsupervised hierarchical clustering (HC). They write: "Genes with highest loadings in the first four components were analysed by unsupervised hierarchical clustering as well as PCA". As I understand it, a loading is equivalent to an entry of an eigenvector, i.e. the weight of one gene in one component. So for this analysis they would have an m×4 loading matrix (m = number of genes, one column per component). My question is: how do we choose the genes with the highest loadings?
(1) select the genes with the largest sum of weights (I mean sum each row to get an m×1 vector, then rank the genes by it), or
(2) select the genes that have one of the largest weights in any of the four columns?
Is the answer (1) or (2)? Or have I misunderstood the concept of PCA?
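To make the two options concrete, here is a rough NumPy sketch of what I mean. It runs on random toy data, not the paper's data, and the function names (`pca_loadings`, `top_genes_by_sum`, `top_genes_per_component`) are my own. I also take absolute values before ranking, since loadings can be negative; that is my assumption and not something the paper states.

```python
import numpy as np

def pca_loadings(X, n_components=4):
    """X is a cells x genes matrix; returns a genes x n_components loading matrix."""
    Xc = X - X.mean(axis=0)  # center each gene
    # Rows of Vt are the principal axes (eigenvectors of the covariance matrix).
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:n_components].T

def top_genes_by_sum(loadings, k):
    # Option (1): rank genes by the row sum of absolute loadings.
    score = np.abs(loadings).sum(axis=1)
    return np.argsort(score)[::-1][:k]

def top_genes_per_component(loadings, k):
    # Option (2): take the k largest absolute loadings in each column,
    # then return the union of those gene indices.
    order = np.argsort(np.abs(loadings), axis=0)[::-1][:k]
    return np.unique(order)

# Toy data: 50 cells x 200 genes (hypothetical, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 200))
L = pca_loadings(X)
genes_sum = top_genes_by_sum(L, 10)
genes_union = top_genes_per_component(L, 10)
print(len(genes_sum), len(genes_union))
```

With option (2) the number of selected genes depends on how much the per-component top lists overlap, so it can be anywhere between k and 4k.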
I also found the post "Gene Lists Using Principal Component Analysis In Microarray Gene Expression", but I think it describes an n×1 loading matrix.
BTW, is there another way to infer new cell lineages or to classify groups of cells? Is there an evaluation or comparison of those methods? TIA