I have done a PCA using R but I am getting extreme values for the first principal component.
The steps that I performed were:
I had a genotype file coded as: 0, 1, 2 or NA.
I replaced the NAs with 1 (heterozygous), because I then recoded the matrix to -1, 0 and 1, so the NAs would become zero.
I created the G matrix (VanRaden method) and applied the following command to it:
mypca = prcomp(G, center=TRUE)
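A minimal sketch of the steps described above, run on simulated data so that it is reproducible. The object names (`geno`, `M`, `P`, `Z`), the simulated genotypes, and the choice of VanRaden's first method with allele frequencies estimated after imputation are assumptions for illustration; the real script may differ.

```r
# Simulated stand-in for the genotype file (assumption: individuals in rows,
# markers in columns, coded 0/1/2 with some NAs).
set.seed(1)
n_ind <- 20
n_snp <- 200
geno <- matrix(rbinom(n_ind * n_snp, size = 2, prob = 0.3), nrow = n_ind)
geno[sample(length(geno), 50)] <- NA

# Replace NAs with 1 (heterozygous), then recode 0/1/2 to -1/0/1,
# so that the imputed values become 0.
geno[is.na(geno)] <- 1
M <- geno - 1

# VanRaden (method 1, assumed): centre each marker by 2 * (p - 0.5),
# where p is the frequency of the allele counted in the 0/1/2 coding.
p <- colMeans(geno) / 2
P <- 2 * (p - 0.5)
Z <- sweep(M, 2, P)                      # subtract P[j] from column j
G <- tcrossprod(Z) / (2 * sum(p * (1 - p)))

# PCA on G, as in the command above.
mypca <- prcomp(G, center = TRUE)

# Spread of the scores on the first three components.
apply(mypca$x[, 1:3], 2, range)
```

Comparing the ranges of the first three components on the real data shows how far PC1 departs from PC2 and PC3.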
When I plotted the first and second principal components, I noticed huge values for PC1. When I plotted PC2 and PC3, I observed what I was expecting.
Do I need to scale the G matrix? What could be causing those huge values for PC1? Could the NA genotypes that I replaced cause such a big effect?
Any help would be very much appreciated. Thanks. Paula.