Hi,
I am quite new to rare variant analysis, so I apologize if I have missed something obvious. I have a genotype dataframe containing sample IDs and genotype information (0/1/2) for several variants, and a second dataframe containing the same kind of sample IDs along with phenotype/covariate information.
How do I make sure that each row in the genotype dataframe is matched to the row with the same sample ID in the phenotype dataframe when I test for association between variant burden and a particular phenotype?
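To make the question concrete, here is a minimal sketch of the kind of alignment I think I need. It is in plain Python rather than R, purely for illustration, and everything in it is an assumption: the toy sample IDs, the `y` and `age` columns, and the helper `align_by_id` are hypothetical names of mine, not part of SKAT. The idea is to keep only samples present in both tables and to put the genotype rows in the same order as the phenotype rows.

```python
# Hypothetical toy data: (sample ID, genotype counts 0/1/2 for three variants).
geno_rows = [
    ("S3", [0, 1, 0]),
    ("S1", [2, 0, 0]),
    ("S2", [0, 0, 1]),
    ("S5", [1, 0, 0]),  # genotyped, but no phenotype record
]

# Hypothetical toy data: (sample ID, phenotype and covariate values).
pheno_rows = [
    ("S1", {"y": 1, "age": 54}),
    ("S2", {"y": 0, "age": 61}),
    ("S3", {"y": 1, "age": 47}),
    ("S4", {"y": 0, "age": 58}),  # phenotyped, but no genotype record
]


def align_by_id(geno_rows, pheno_rows):
    """Return IDs, genotype rows, phenotypes, and ages in one shared order.

    Only samples present in both inputs are kept, in phenotype-table order.
    """
    geno = dict(geno_rows)
    pheno = dict(pheno_rows)

    # Duplicate IDs would make the match ambiguous, so fail early.
    if len(geno) != len(geno_rows):
        raise ValueError("duplicate sample IDs in genotype data")
    if len(pheno) != len(pheno_rows):
        raise ValueError("duplicate sample IDs in phenotype data")

    # Walk the phenotype table and keep IDs that also have genotypes.
    shared = [sid for sid, _ in pheno_rows if sid in geno]

    Z = [geno[sid] for sid in shared]            # genotype matrix, row per sample
    y = [pheno[sid]["y"] for sid in shared]      # phenotype vector
    age = [pheno[sid]["age"] for sid in shared]  # one covariate
    return shared, Z, y, age


ids, Z, y, age = align_by_id(geno_rows, pheno_rows)
for sid, z_row, y_val, age_val in zip(ids, Z, y, age):
    print(sid, z_row, y_val, age_val)
```

If that is the right idea, row i of the genotype matrix and element i of the phenotype and covariate vectors would all refer to the same sample before anything is passed to the association test, and samples missing from either table (S4 and S5 here) are dropped.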
The SKAT example data never seems to deal with unique IDs; it just uses a new row for each new sample. Any help would be amazing.