Hi, I'm trying to use principal component scores to adjust for population stratification in a GWAS. I'm new to principal component analysis with Eigenstrat and have several (basic) questions. I ran convertf, smartpca and smarteigenstrat, in that order.
In the results, I found a "statistical significance of differences between populations" section in the log file. What is the usual p-value threshold for concluding that two populations are different? Is it 0.05? With three populations, I get three p-values, one for each possible pair. Suppose the p-value for one pair is <0.05 and the other two are >0.05. Can I still say there is evidence of population differences and that I should adjust for principal component scores? Am I understanding this correctly?
smartpca also gives me a ".pca" file, which contains (by default, 10) principal component scores per sample. Are these the PC scores that should be used to adjust for population stratification (by including them as covariates)? If so, how do I decide how many PC scores to include in the adjustment? Are there any statistics I should look at?
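To make this question concrete, here is a minimal Python sketch of what I mean by "using the first k PC scores as covariates". The sample IDs and score values are made up for illustration, `first_k_pcs` is a hypothetical helper of my own (not part of EIGENSOFT), and I'm assuming the scores have already been read into a dictionary. How to choose `k` is exactly what I'm unsure about.

```python
def first_k_pcs(scores, k):
    """Return {sample_id: first k PC scores} to use as GWAS covariates.

    `scores` maps a sample ID to its list of PC scores (PC1 first).
    """
    covariates = {}
    for sample_id, pcs in scores.items():
        if k > len(pcs):
            raise ValueError(f"{sample_id} has only {len(pcs)} PCs, asked for {k}")
        covariates[sample_id] = pcs[:k]
    return covariates


# Made-up scores for three hypothetical samples (3 PCs each).
pc_scores = {
    "S1": [0.012, -0.034, 0.005],
    "S2": [-0.021, 0.017, -0.009],
    "S3": [0.008, 0.002, 0.031],
}

# Keep only PC1 and PC2 as covariates (k=2 is an arbitrary choice here).
pc_covariates = first_k_pcs(pc_scores, k=2)
print(pc_covariates["S1"])
```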
Finally, some outliers were excluded from the analysis, and all of their PC scores are just zeros. Should I remove these individuals from the GWAS, or include them in the GWAS with zero values for the covariates?
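In case it helps, this is roughly how I would implement the first option (dropping them) in Python. It is only a sketch under my own assumptions: the sample IDs and values are made up, `split_outliers` is a hypothetical helper, and I'm treating "all PC scores exactly zero" as the marker for an excluded outlier, which is what I see in my output but may not hold in general.

```python
def split_outliers(scores):
    """Separate samples whose PC scores are all zero from the rest.

    Returns (kept, outlier_ids): `kept` maps sample ID to PC scores,
    `outlier_ids` lists the samples with all-zero scores.
    """
    kept = {}
    outlier_ids = []
    for sample_id, pcs in scores.items():
        if all(value == 0.0 for value in pcs):
            # Assumption: all-zero scores mean the sample was excluded as an outlier.
            outlier_ids.append(sample_id)
        else:
            kept[sample_id] = pcs
    return kept, outlier_ids


# Made-up scores; "S2" mimics an excluded outlier.
all_scores = {
    "S1": [0.012, -0.034],
    "S2": [0.0, 0.0],
    "S3": [0.008, 0.002],
}

kept_scores, dropped_ids = split_outliers(all_scores)
print(dropped_ids)
```

The alternative would be to skip this step and pass `all_scores` through unchanged, so the outliers enter the GWAS with zeros as covariates.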
Thanks for your help!