I am new to the field and have some questions about calculating the GRM with the GCTA software. Let's say I have genome data for a population of N=480000+, with both the original genotyped SNPs and the imputed SNPs. I want to calculate the GRM for my subsequent analysis, so my questions are:
- Should I use only a subset of the SNPs, i.e., the original genotyped SNP calls from UKBB, or the full set of imputed SNPs?
- Should I use the whole UKBB sample (N=480000+), or is the subsample that I am actually interested in (N=20000+) enough?
- Should I include the X and XY chromosomes, or only use the 22 autosomes?