My colleague intends to perform a GWAS (genome-wide association study) on a group of oil palms to find genetic markers associated with variation in some economically important traits, mainly oil yield. The results can then be used for genomic selection.

Here is the current tentative plan:

Out of 192 palms (all progeny of just two parents, one from the Pisifera variety and the other from Dura), two extreme groups (the 24 palms with the highest oil yields and the 24 with the lowest) will go through genotyping-by-sequencing (GBS) to obtain genotypic data for a few thousand SNPs.

The 48 genotyped palms will then be used as the training population for the GWAS.
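
To make the selection step concrete, here is a minimal Python sketch of how the two tails would be picked from the 192 palms. The palm IDs, the simulated yield values, and the function name `select_extremes` are placeholders I made up for illustration; the real phenotype records would replace them.

```python
import random


def select_extremes(yields, n_tail=24):
    """Return (low_ids, high_ids): the n_tail palms with the lowest
    and the n_tail palms with the highest recorded oil yield."""
    if 2 * n_tail > len(yields):
        raise ValueError("the two tails would overlap")
    # Sort palm IDs by their yield, lowest first.
    ranked = sorted(yields, key=yields.get)
    return ranked[:n_tail], ranked[-n_tail:]


# Placeholder phenotypes: 192 simulated yields, NOT real data.
rng = random.Random(0)
oil_yield = {f"palm_{i:03d}": rng.gauss(30.0, 5.0) for i in range(1, 193)}

low_ids, high_ids = select_extremes(oil_yield, n_tail=24)
training_ids = low_ids + high_ids  # the 48 palms that would be genotyped
print(len(training_ids))
```

The remaining 144 palms in the middle of the distribution would stay ungenotyped under this plan.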

My colleague is worried that there may not be enough genetic variation in the study population for the GWAS to be meaningful. The reason my colleague does not intend to use all 192 palms is that genotyping all of them will be very costly. However, if it turns out that he needs to use all 192 palms for the GWAS to work, he will do so.

Should he go through with it? What are the problems or errors that he may encounter?

Any info on how to determine the required size for a GWAS training population will also be highly appreciated.

Thank you very much in advance!

