I have WGS data for 118 samples, 72 of which come from one country and represent the 11 breeds under consideration. I used the other 46 to distinguish them and as controls. After building an NJ tree from the called SNP data, I ran a PCA, which separates the 72 from the 46. The 46 cluster clearly into 4 groups, but the 72 form only one scattered cluster. In the next step I therefore took only these 72 samples and ran another PCA, which gives three clusters in total (1 breed, 1 partial hybrid breed, and the rest a scattered mush).
The MAF threshold was calculated as 1/(2n). The following commands were used to produce the PCA:
plink --bfile out.all --keep keep --maf 0.00423 --make-bed --chr-set 29 --out out
plink --bfile ./out.all --indep-pairwise 50 5 0.2 --chr-set 29 --out out
plink --bfile ./out.all --extract out.all.prune.in --make-bed --chr-set 29
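For reference, here is a minimal sketch of the 1/(2n) arithmetic behind the MAF threshold. The helper name maf_threshold is hypothetical and not part of plink; it only assumes awk is available. With n = 118 it gives roughly the 0.00423 passed to --maf above (up to rounding); n = 72 is shown purely for comparison, on the assumption that one might recompute the threshold for the retained subset.

```shell
# Hypothetical helper (not a plink feature): MAF threshold equal to a single
# minor-allele copy among n diploid samples, i.e. 1/(2n).
maf_threshold() {
  awk -v n="$1" 'BEGIN { printf "%.5f\n", 1 / (2 * n) }'
}

maf_threshold 118   # all samples
maf_threshold 72    # only the samples kept with --keep (assumed subset size)
```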
Any help is very much appreciated. Awaiting.