GWAS - Permutation Testing?
10.0 years ago

Hi,

I have conducted a search for statistical epistasis using a simple dosage model:

Y ~ A + B + AB


where Y is the phenotype (in this case, gene expression values) and A and B are vectors of genotype information for ~500 samples. I wish to determine a significance threshold using permutation testing in order to correct for multiple testing.

To date, I have recalculated the p-values for the interaction term (AB) for 100 permutations (I permuted the phenotype values) and am unsure how to proceed in order to derive a false discovery rate (FDR).
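For reference, the setup described above can be sketched roughly as follows. This is only a minimal illustration in plain Python on simulated data: the sample size, the effect sizes, the helper names `solve` and `interaction_t`, and the choice of the t-statistic of the AB term as the test statistic are all assumptions, not the actual pipeline. Note that permuting Y breaks the main effects along with the interaction, so the permutation null here is "no association at all".

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x

def interaction_t(y, a, b):
    """t-statistic of the AB term in the least-squares fit Y ~ 1 + A + B + AB."""
    X = [[1.0, ai, bi, ai * bi] for ai, bi in zip(a, b)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    resid = [yi - sum(bj * xj for bj, xj in zip(beta, r)) for r, yi in zip(X, y)]
    sigma2 = sum(e * e for e in resid) / (len(y) - k)
    # Last column of (X'X)^-1; its last entry scales the variance of the AB coefficient.
    inv_last = solve(xtx, [0.0, 0.0, 0.0, 1.0])[3]
    return beta[3] / (sigma2 * inv_last) ** 0.5

# Simulated data (assumption): genotypes coded 0/1/2, with a true interaction effect.
random.seed(1)
n = 200
a = [float(random.choice([0, 1, 2])) for _ in range(n)]
b = [float(random.choice([0, 1, 2])) for _ in range(n)]
y = [0.5 * ai + 0.5 * bi + 1.0 * ai * bi + random.gauss(0, 1) for ai, bi in zip(a, b)]

t_obs = abs(interaction_t(y, a, b))

# Permute the phenotype, refit, and count statistics at least as extreme as observed.
n_perm = 199
exceed = 0
for _ in range(n_perm):
    y_perm = random.sample(y, len(y))  # shuffled copy of the phenotype
    if abs(interaction_t(y_perm, a, b)) >= t_obs:
        exceed += 1

p_emp = (exceed + 1) / (n_perm + 1)
print(f"|t| for AB = {t_obs:.2f}, empirical p = {p_emp:.4f}")
```

In a real scan the same permuted phenotype vector would be reused across all SNP pairs within one permutation, so that the dependence between tests is preserved.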

Any suggestions?

Thanks, D.

gwas snp multiple • 6.8k views

Couldn't you just use the FDR method of Benjamini & Hochberg (1995) in R: p.adjust(p, method="fdr")? I think that should also be valid for permutation p-values. Any concerns?
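For anyone not working in R, the Benjamini-Hochberg adjustment itself is short enough to write by hand. The sketch below is plain Python and, as far as I can tell, follows the same step-up logic as R's "fdr"/"BH" option; the function name `bh_adjust` and the toy p-values are made up for illustration.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Walk from the largest p-value down, keeping a running minimum so the
    # adjusted values stay monotone in the rank.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for steps_from_top, idx in enumerate(order):
        rank = m - steps_from_top  # rank 1 = smallest p-value
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

toy_p = [0.01, 0.04, 0.03, 0.005]  # toy values, not real results
toy_q = bh_adjust(toy_p)
print(toy_q)
```

Tests with an adjusted value at or below the chosen FDR level (say 0.05) would be called significant.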


@Michael: For an additive genetic model with genotypes AA, AB, BB, one assumes that each A or B allele has an incremental effect on the phenotype, such that AA > AB > BB. This is intuitively similar to treating each A or B allele as a drug with increasing dosage. In this case the genotypes are ordinal, not categorical. For a test of epistasis, you are looking for deviation from an additive model and trying to fit an interaction term; I think it's standard to check only the interaction term.


Some things I don't quite understand: How did you compute your p-values? Genotypes are categorical data, so how does a dosage model apply? Why did you compute p-values only on the interaction term? And why so few permutations? Given the 500! possible permutations of 500 samples, I would have expected more to get a reliable estimate.
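To put a number on the permutation count: with the usual (r + 1)/(B + 1) estimate, where r is the number of permuted statistics at least as extreme as the observed one and B is the number of permutations, 100 permutations cannot give a p-value below about 0.0099. A tiny sketch in plain Python (the function name `empirical_p` is just illustrative):

```python
def empirical_p(observed_stat, permuted_stats):
    """Empirical p-value (r + 1) / (B + 1) for a 'larger is more extreme' statistic."""
    exceed = sum(1 for s in permuted_stats if s >= observed_stat)
    return (exceed + 1) / (len(permuted_stats) + 1)

# Smallest p-value that B permutations can ever produce (when r = 0).
smallest = {n_perm: 1 / (n_perm + 1) for n_perm in (100, 1000, 10000)}
for n_perm, p_min in smallest.items():
    print(n_perm, p_min)
```

So the resolution of the per-test p-value is set by B, not by how strong the signal is.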

10.0 years ago

A recent study in PLoS Genetics (Liu 2011) may provide some guidance, at least as a warning about how tricky this analysis really is. A large chunk of that paper describes the various nasty sources of false positives the authors discovered. They used a Bonferroni correction.

Your results are going to have complicated covariance, since INTERACT(A,B) and INTERACT(A,C) will be dependent. Have a look at a recent paper from Wing Wong's group (Ma et al. Genetic Epidemiology 2010) which addresses this issue and proposes an adaptive permutation method. I haven't tried to implement their method myself.
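Because the tests are dependent, one standard permutation approach is to keep only the smallest p-value from each permutation and take a low quantile of those minima as the family-wise threshold. To be clear, this is the plain single-step min-p idea, not the adaptive method from Ma et al. A minimal sketch in plain Python, with the function name `minp_threshold` and the toy numbers invented for illustration:

```python
import math

def minp_threshold(permuted_p, alpha=0.05):
    """Family-wise threshold from per-permutation minimum p-values.

    permuted_p: one list of p-values (all tests) per permutation.
    Returns the k-th smallest minimum, with k = floor(alpha * B).
    """
    minima = sorted(min(perm) for perm in permuted_p)
    k = math.floor(alpha * len(minima) + 1e-9)  # small epsilon guards float rounding
    if k < 1:
        raise ValueError("need at least 1/alpha permutations")
    return minima[k - 1]

# Toy input (assumption): 20 permutations of 3 tests each.
permuted_p = [[0.01 * i, 0.5, 0.9] for i in range(1, 21)]
threshold = minp_threshold(permuted_p)
print(threshold)
```

An observed p-value at or below `threshold` would then be declared significant at roughly the `alpha` family-wise level. With only 100 permutations the threshold rests on the 5 smallest minima, so it will be noisy.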

4.1 years ago
hellocita ▴ 30

Hi Darren, a paper might help you: enter link description here. It explains how the FDR is calculated from permutations :)
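One common way to get an FDR from permutations (not necessarily the method in the paper mentioned above) is a plug-in estimate: for a p-value threshold t, divide the average number of permuted p-values at or below t per permutation by the number of observed p-values at or below t. A minimal sketch in plain Python, where `permutation_fdr` and all the numbers are hypothetical:

```python
def permutation_fdr(observed_p, permuted_p, threshold):
    """Plug-in FDR estimate at a given p-value threshold.

    observed_p: p-values from the real data (one per test).
    permuted_p: one list of p-values (same tests) per permutation.
    """
    n_called = sum(1 for p in observed_p if p <= threshold)
    if n_called == 0:
        return 0.0  # convention used here: nothing called, so no false discoveries
    false_per_perm = [sum(1 for p in perm if p <= threshold) for perm in permuted_p]
    expected_false = sum(false_per_perm) / len(permuted_p)
    return min(1.0, expected_false / n_called)

# Toy numbers (assumption), 5 tests and 2 permutations.
observed = [0.001, 0.002, 0.2, 0.5, 0.8]
permuted = [
    [0.3, 0.6, 0.9, 0.04, 0.7],
    [0.5, 0.2, 0.008, 0.95, 0.4],
]
fdr = permutation_fdr(observed, permuted, threshold=0.01)
print(fdr)
```

Scanning `threshold` over the observed p-values and picking the largest one whose estimated FDR stays under the target (e.g. 0.05) gives a permutation-based cutoff.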

4.0 years ago
Bioaln ▴ 350

Once a p-value vector is obtained, you can compute the FDR using, e.g., Python's statsmodels:

http://www.statsmodels.org/devel/stats.html#multiple-tests-and-multiple-comparison-procedures

Hope that helps.