Rare variant association analysis (SKAT-O) - Power calculation
Hello everyone,

I was wondering if it is statistically acceptable to perform SKAT-O analysis on a single gene.

To elaborate, I am looking at a miRNA gene (80 nt long) and I have found a small number of rare variants in my case group, but none in my control group. Running SKAT-O with the small-sample adjustment (cases + controls = 1700, i.e. < 2000), I get a p-value < 0.05.

I was wondering, though: is this realistic? I have been trying to run a power calculation in the P-SKAT framework in R, but I have been running into issues, which I think are due to SubRegion.Length.

library(SKAT)
out.b <- Power_Logistic(SubRegion.Length = 100, Causal.Percent = 40, alpha = 0.05, N.Sim = 100, MaxOR = 3, Negative.Percent = 20)


On the one hand, since I am not getting any errors or warnings and I reach a significant p-value, I am thinking that I should not worry. On the other hand, I am still a bit sceptical.

Any feedback welcome, including any alterations in the code I am using for power calculation (maybe I should eventually include matrices with Haplotypes and SNP.Location?).

Hi Alex, I don't really get the question. Yes, it is possible to apply a rare-variant association (RVAS) analysis to a single gene, and yes, you can get a p-value < 0.05 even with only a handful of cases and controls. It all depends on the effect size (the difference in carrier proportions): if 100% of your cases carry damaging variants in this gene and 0% of your controls do, you don't need thousands of samples. What exactly is confusing you?
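
To make the effect-size argument concrete, here is a minimal sketch in base R. It uses Fisher's exact test on carrier counts, which is a simpler stand-in and not SKAT-O itself, and every count in it is hypothetical. The helper name carrier_test is made up for this illustration.

```r
# Hypothetical counts, chosen only to illustrate the effect-size argument.
# Fisher's exact test on carrier status; this is NOT SKAT-O.
carrier_test <- function(case_carriers, n_case, ctrl_carriers, n_ctrl) {
  tab <- matrix(
    c(case_carriers, n_case - case_carriers,
      ctrl_carriers, n_ctrl - ctrl_carriers),
    nrow = 2,
    dimnames = list(c("carrier", "non_carrier"), c("case", "control"))
  )
  fisher.test(tab)$p.value
}

# Every case carries a variant, no control does: a very large effect.
p_extreme <- carrier_test(10, 10, 0, 10)

# Same sample size, but a much smaller difference in carrier proportions.
p_modest <- carrier_test(3, 10, 1, 10)

print(c(extreme = p_extreme, modest = p_modest))
```

With 20 samples in both scenarios, only the one with the large difference in proportions gives a small p-value, which is the point about effect size versus sample size.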

Hello M.Demidov,

I am only troubled because I am looking at an 80 nt miRNA gene, and we are talking about 3 ultra-rare (MAF < 0.01%) variants in 6 patients (no variants in controls). I am not that experienced, so I was wondering whether I am missing something important without knowing it. I am aware that having no variants in controls is a really "strong" observation.
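
As a rough sanity check on power for numbers like these, one option is a small simulation in base R. This is only a sketch under stated assumptions: it uses a carrier-count Fisher test rather than SKAT-O, it assumes an equal split of 850 cases and 850 controls (the thread only gives the total of 1700), and the carrier frequencies are illustrative guesses. The function name carrier_power is hypothetical.

```r
# Simulation-based power for a carrier-count Fisher test (not SKAT-O).
# ASSUMPTIONS: 850 cases / 850 controls and the carrier frequencies below
# are illustrative; they are not taken from the actual study.
set.seed(1)

carrier_power <- function(n_case, n_ctrl, p_case, p_ctrl,
                          alpha = 0.05, n_sim = 1000) {
  hits <- replicate(n_sim, {
    a <- rbinom(1, n_case, p_case)  # simulated carriers among cases
    b <- rbinom(1, n_ctrl, p_ctrl)  # simulated carriers among controls
    if (a + b == 0) {
      FALSE                         # no carriers at all: nothing to detect
    } else {
      tab <- matrix(c(a, n_case - a, b, n_ctrl - b), nrow = 2)
      fisher.test(tab, conf.int = FALSE)$p.value < alpha
    }
  })
  mean(hits)                        # fraction of simulations reaching alpha
}

# Alternative: carriers at about 6 per 850 cases, none in controls.
power_alt <- carrier_power(850, 850, p_case = 6 / 850, p_ctrl = 0)

# Null: the same very low carrier frequency in both groups.
power_null <- carrier_power(850, 850, p_case = 1e-4, p_ctrl = 1e-4)

print(c(alternative = power_alt, null = power_null))
```

Under the assumed equal split and with zero carriers in controls, the two-sided Fisher test needs at least six case carriers to drop below 0.05, so a result resting on exactly six carriers sits right at the edge of detectability.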

There is a huge body of theory behind all this...

Gene discovery is not an easy thing to do. Check how many variants have been detected in this gene in gnomAD: https://gnomad.broadinstitute.org/ . Also, the mere presence of these variants does not mean that they, or this gene, are causal (I assume this has not been shown before). They have to segregate with the disease; for example, if the disease manifests only in a child and the parents don't carry the mutation, that is a strong piece of evidence. Figure 1 of https://www.nature.com/articles/gim201530 gives a good idea of what kind of evidence you need to collect. But there is more: if I understood correctly, this is a new gene discovery, so the evidence collected should be overwhelming. Otherwise no one will trust the causal role of this gene. A p-value < 0.05 alone does not prove anything; this evidence is too weak.

Hello M. Demidov,

Thanks a lot for your explanation! The article is really rich in information, thanks a lot! I am checking for segregation as we speak, but unfortunately the pedigrees are not big. Anyhow, thanks again!