**280**wrote:

Hi,

I found 2,302 significant SNPs in genomic sequences representing a few hundred genes sequenced for 2 populations (12 samples / population). Based on the allele counts, I calculated the allele frequencies at each locus. Then, I calculated de allelic frequency divergence (AFD), i.e :

AFD = freq[a1,pop1] - freq[a1,pop2] (allele-freq for pop1 - allele-freq for pop2)

With that, I want to discover which SNPs show a significant difference in allele frequency between both populations in order to identify putative divergent genes (I also performed an outlier test with Bayescan and I also have some problems with that, but it's not the topic of this question!).

I used the Python package "fisher-0.1.4" (http://pypi.python.org/pypi/fisher/) to compute the p-values for each AFD (i.e for each SNP) and I counter-verified these values with R, building a 2x2 matrix for each SNP and performing this test:

`fisher.test(mat, alternative="less")`

I find that R computes the same p-values than Python. Based on these p-values, among the 2,302 SNPs, 56 SNPs show a AFD > 0.5 and a p-value < 0.05. Now, I want to see if those SNPs are also significant when corrected for multiple hypotheses (FDR). I tried to perform a FDR test to compute q-values based on the distribution of p-values (for the 2,302 SNPs). This is where things get complicated...

I tried two R packages: "qvalue" AND "fdrtool". With both of them, I get an error message, which makes me think that there is something wrong with my dataset of p-values. Here is what I did for the package "qvalue" :

```
> p <- scan("~/Desktop/pvalues.txt")
Read 2302 items
>qobj <- qvalue(p, fdr.level=0.05, pi0.method="bootstrap")
[1] "ERROR: The estimated pi0 <= 0. Check that you have valid p-values or use another lambda method."
```

And even when I try to change some parameters in the test, I ALWAYS get this error message.

Then with the package "fdrtool", this is what I did:

```
> fdr <- fdrtool(p, statistic="pvalue")
Step 1... determine cutoff point
Step 2... estimate parameters of null distribution and eta0
Step 3... compute p-values and estimate empirical PDF/CDF
Step 4... compute q-values and local fdr
Step 5... prepare for plotting
Warning message :
Censored sample for null model estimation has only size 2 !
```

With that second package (fdrtool), I can try to visualize the q-values computed with the command:

```
fdr$qval
```

But ALL the q-values are < 0.05, which is quite strange. How can I have p-values > 0.05 that become significant (i.e < 0.05) when I correct for FDR? And how is it possible that ALL the q-values (n=2,302) are significant? I decided to include in my question a graph showing the distribution of my p-values. Maybe this can help...

http://www.freeimagehosting.net/5v76e

**UPDATE 02-06-2012** : max p-value = 0.758 and min p-value = 2.73E-10.

If someone could help me, it would be tremendously appreciated!

Cheers ! FO

**46k**• written 7.1 years ago by Francois Olivier Hébert •

**280**

can you post the min, max of your p-values here? The error seems to indicate that your p-values are < 0.

2.3kSure I can do that ! I just updated my question by giving the min and max p-values. I have some p-values that are really small, but they are all above 0.

280Okay. So I got the error part wrong. Would it be possible for you to upload the p-values somewhere and link here? Or if you're willing to try a multiple-correction using R and multtest package, I can provide you the snippet to accomplish this and see how it turns out..

2.3kProblem solved! Thanks for your time. I think my dataset wasn't of super good quality... so I changed a few things and it works now. Since your first answer, I was concerned with the range of my p-values and it turned out that it was the core of the problem. So thanks again!

280