Question: How to check if first two components of PCA are separated without visualisation?
fernardo (Italy) wrote, 5 weeks ago:

Hi Everyone,

In the two examples below, the second one gives a better result: its groups are more clearly separated.

Is it possible to detect this separation from the PCA result matrix itself, e.g. with some kind of score such as the mean or median of the components, or in any other way?

PCA_example 1: [figure: PCA1]

PCA_example 2: [figure: PCA2]

The code used:

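library(ggplot2)
# PCA on the scaled data matrix (rows = samples)
pca <- prcomp(dataMatrix, scale. = TRUE)
scores <- data.frame(Groups, pca$x[, 1:3])
# Plot PC1 against PC2, coloured by group
pc1.2 <- qplot(x = PC1, y = PC2, data = scores, colour = factor(Groups)) + theme(legend.position = "right")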

Thanks in advance

Tags: python, machine learning, pca, ngs, R
raunakms (Vancouver, BC, Canada) wrote, 5 weeks ago:

First get the PCA scores, i.e. the sample coordinates on the first two principal components (PC1 and PC2), using pca$x[,1:2]. Then calculate the in-class distance (the pairwise distance between samples belonging to the same class) and the out-class distance (the pairwise distance between a sample belonging to one class and each of the samples in the other class). If the average out-class distance is greater than the average in-class distance, you are likely to see distinct clusters of the sample groups.
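
A minimal sketch of that comparison in R, in case it helps. The function name class_separation, its arguments, and the toy data are made up for illustration; with real data you would pass pca$x[, 1:2] and Groups. The ratio it returns is only a rough heuristic, and no particular cutoff is implied.

```r
# Sketch: mean in-class vs. out-class Euclidean distance on PC1/PC2.
# `coords`: numeric matrix of PCA scores (rows = samples), e.g. pca$x[, 1:2]
# `groups`: class label of each row. Both names are illustrative.
class_separation <- function(coords, groups) {
  groups <- as.character(groups)
  d <- as.matrix(dist(coords))          # full pairwise Euclidean distance matrix
  same <- outer(groups, groups, "==")   # TRUE where both samples share a class
  upper <- upper.tri(d)                 # each unordered pair once, diagonal skipped
  in_class <- mean(d[upper & same])
  out_class <- mean(d[upper & !same])
  c(in_class = in_class, out_class = out_class, ratio = out_class / in_class)
}

# Toy coordinates standing in for real PCA scores (not real data)
set.seed(1)
toy <- rbind(matrix(rnorm(20, mean = 0), ncol = 2),
             matrix(rnorm(20, mean = 5), ncol = 2))
toy_groups <- rep(c("A", "B"), each = 10)
res <- class_separation(toy, toy_groups)
print(res)
```

A ratio well above 1 suggests the groups sit further from each other than from their own members. Comparing the ratio between two PCA results (as in the two examples in the question) is probably more informative than reading a single value on its own.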


Thanks a lot. That looks like a solution to me, and I am going to try it. To make sure I fully understand your point:

1. In-class distance: do you mean calculating the pairwise correlation between PC1 and PC2 within one condition (class)?

2. Out-class distance: this is related to the first point, and I didn't actually understand it.

Thanks

Comment written 5 weeks ago by fernardo

Think of the PC1 and PC2 scores as the x and y coordinates defining each dot in the plot above. The dots coloured red and blue in the plot are your two sample classes. (1) In-class distance: the pairwise Euclidean distance between dots of the same colour (red with red, or blue with blue). (2) Out-class distance: the pairwise Euclidean distance between a red dot and every blue dot. Repeat this for all dots in the red group, and follow the same procedure for each dot in the blue group against the dots in the red group.
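
To make the pairing concrete, here is a tiny worked example with four hypothetical points (two red, two blue). The coordinates and the helper euclid are invented for illustration and are not taken from the plots above.

```r
# Four hypothetical dots: PC1 and PC2 scores used as x and y coordinates
pts <- data.frame(PC1 = c(0, 0, 3, 3),
                  PC2 = c(0, 1, 0, 1),
                  Groups = c("Red", "Red", "Blue", "Blue"))

# Euclidean distance between two coordinate vectors
euclid <- function(a, b) sqrt(sum((a - b)^2))

red  <- as.matrix(pts[pts$Groups == "Red",  c("PC1", "PC2")])
blue <- as.matrix(pts[pts$Groups == "Blue", c("PC1", "PC2")])

# (1) In-class: the single pair within each colour
in_red  <- euclid(red[1, ], red[2, ])
in_blue <- euclid(blue[1, ], blue[2, ])

# (2) Out-class: each red dot against every blue dot
out <- c()
for (i in seq_len(nrow(red))) {
  for (j in seq_len(nrow(blue))) {
    out <- c(out, euclid(red[i, ], blue[j, ]))
  }
}

mean_in  <- mean(c(in_red, in_blue))   # 1
mean_out <- mean(out)                  # (3 + sqrt(10) + sqrt(10) + 3) / 4
```

Because Euclidean distance is symmetric, looping over blue dots against red dots would produce the same four distances, so one pass over the red-blue pairs already covers both directions.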

Reply written 4 weeks ago by raunakms