I have 90 individuals, each genotyped in 3 replicates (genotyping by sequencing), so my VCF file contains 270 samples. Phenotype data were collected for the 90 individuals only. When I run the mixed linear model (MLM) in TASSEL, each of the 3 genotype replicates of an individual gets the same phenotype value. As I understand it, this is not OK, because the phenotypic variance becomes artificially small.
My current solution looks as follows:
- I run permutations: I randomly reassign the phenotype data for each of the 3 genotype replicates, 500 times, and run the MLM on each random combination to find significant associations.
- Then I calculate a p-value as follows: I count the iterations in which the number of significant SNPs is equal to or greater than the number obtained with the original phenotype-genotype combination, and divide that count by the total number of iterations.
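
To make the two steps above concrete, here is a minimal Python sketch. It is only an illustration under assumptions: the MLM itself still runs in TASSEL, `permute_phenotypes` and `empirical_p_value` are hypothetical helper names, and shuffling the values across individuals is one possible reading of "random combinations".

```python
import random


def permute_phenotypes(phenotypes, rng):
    """Return a new individual -> phenotype mapping with the values shuffled.

    Assumption: one phenotype value per individual, shuffled across individuals.
    """
    ids = list(phenotypes)
    values = [phenotypes[i] for i in ids]
    rng.shuffle(values)  # in-place shuffle of the copied values
    return dict(zip(ids, values))


def empirical_p_value(observed_count, permuted_counts):
    """Fraction of permutations with at least as many significant SNPs
    as the run on the original phenotype-genotype combination."""
    if not permuted_counts:
        raise ValueError("need at least one permutation")
    hits = sum(1 for count in permuted_counts if count >= observed_count)
    return hits / len(permuted_counts)


# Usage with made-up numbers: each entry of permuted_counts would be the
# number of significant SNPs reported by one MLM run on permuted phenotypes.
rng = random.Random(42)
shuffled = permute_phenotypes({"ind1": 1.2, "ind2": 3.4, "ind3": 2.8}, rng)
p_value = empirical_p_value(10, [1, 10, 12, 3, 0])
```

With 500 iterations, a p-value of about 0.02 corresponds to roughly 10 permutations that reached or exceeded the observed count.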
I obtain a fairly small p-value, around 0.02. I also run a binomial test to check that this number of significant SNPs is unlikely to arise by random chance.
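
For reference, one way such a binomial test could look is sketched below. This is an assumption about the setup, not a definitive recipe: it treats each of `n` tested SNPs as independently significant with probability `alpha` under the null, and `binomial_tail` is a hypothetical helper name. SNPs in linkage disequilibrium are not independent, so the result is only a rough check.

```python
import math


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), with terms computed in log space
    so that large n (many SNPs) does not overflow."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if k <= 0:
        return 1.0
    if k > n:
        return 0.0
    log_p = math.log(p)
    log_q = math.log1p(-p)
    total = 0.0
    for i in range(k, n + 1):
        # log of C(n, i) * p^i * (1 - p)^(n - i)
        log_term = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * log_p + (n - i) * log_q
        )
        total += math.exp(log_term)
    return min(total, 1.0)


# Usage with made-up numbers: 12 significant SNPs out of 1000 tested
# at a per-SNP significance threshold of 0.001.
tail_probability = binomial_tail(12, 1000, 0.001)
```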
If you have ever been in a similar situation, please share some advice. I would also be grateful for your opinion on this approach.