I am doing QC for a GWAS. I used PC-AiR and PC-Relate (two methods from the Bioconductor package GENESIS) to assess relatedness and population substructure in my dataset. I compared my samples to 1000 Genomes data and have a plot of the first two PCs. In general, what is the best practice for excluding subjects from a study after visually inspecting the PC plot? Is there a specific method (e.g., an R package) that is considered best practice, or do I arbitrarily decide, based on the graph, which subjects to include?
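To make the question concrete, a rule-based alternative to eyeballing the plot might look something like the sketch below. It is only an illustration in plain Python, not an established method or anything taken from the packages I used: the function name `flag_pc_outliers`, the toy PC values, and the 6-SD cutoff are all assumptions made up for the example. The idea is to flag study samples that fall too many standard deviations from a chosen reference cluster (for example, the 1000 Genomes group matching the expected ancestry) on any PC.

```python
from statistics import mean, stdev


def flag_pc_outliers(ref_pcs, study_pcs, n_sd=6.0):
    """Flag study samples far from a reference cluster in PC space.

    ref_pcs   : list of (PC1, PC2, ...) tuples for the reference samples
    study_pcs : dict mapping sample ID -> (PC1, PC2, ...) tuple
    n_sd      : cutoff in reference SDs (6.0 is an assumed placeholder)
    """
    # Per-PC mean and standard deviation of the reference cluster.
    centers = [mean(col) for col in zip(*ref_pcs)]
    spreads = [stdev(col) for col in zip(*ref_pcs)]

    flagged = []
    for sample_id, pcs in study_pcs.items():
        # Absolute distance from the reference mean, in reference SDs, per PC.
        z = [abs(x - c) / s for x, c, s in zip(pcs, centers, spreads)]
        if max(z) > n_sd:
            flagged.append(sample_id)
    return flagged


# Toy values, invented purely for illustration.
ref = [(0.00, 0.00), (0.01, -0.01), (-0.01, 0.01), (0.02, 0.00), (-0.02, 0.00)]
study = {"S1": (0.005, 0.002), "S2": (0.30, 0.01), "S3": (-0.01, -0.25)}

print(flag_pc_outliers(ref, study))  # ['S2', 'S3']
```

Even with a rule like this, the choice of reference cluster, number of PCs, and cutoff is still a judgment call, which is really what I am asking about.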
Thanks in advance for your thoughts.