Question: Compute confidence intervals for lambda genomic factor statistics
1
gravatar for jamespower
14 months ago by
jamespower30
jamespower30 wrote:

Hi,

I am computing the lambda statistics in R to check whether the distribution of my p-values follow a null distribution using:

pvalue <- runif(1000, min=0, max=1)
chisq <- qchisq(1 - pvalue, 1)
lambda <- median(chisq) / qchisq(0.5, 1)

Is there a procedure for constructing the confidence interval for this lambda genomic factor in R?

Thanks

chi confidence intervals R • 841 views
ADD COMMENTlink modified 14 months ago • written 14 months ago by jamespower30

Thank you Kevin! I had no idea about the 1.253 to have a median from a mean. I am wondering if this is what is normally done in GWAS studies (or whether maybe Bootstrapping is more appropriate)? It seems reasonable to be able to use this approximation since we would expect a large sample size as you have (1000000).
I will look up the stats as suggested. If there are other opinions about what is normally done for GWAS please share.

Thank you

ADD REPLYlink modified 14 months ago • written 14 months ago by jamespower30

Please use ADD REPLY/ADD COMMENT when responding to existing posts to keep threads logically organized. This comment ideally belongs under @Kevin's answer.

ADD REPLYlink written 14 months ago by genomax65k

There is no standard way of doing this, from what I could / can see. Hence the disclaimers at the start and end of my post. I presume that you were asked to do this by a reviewer (possibly a statistician) for a manuscript? Bootstrapping it to obtain bootstrapped CIs would be a good idea. That can be as easy as random sampling the p values and then repeating 250x.

ADD REPLYlink written 14 months ago by Kevin Blighe41k
4
gravatar for Kevin Blighe
14 months ago by
Kevin Blighe41k
London, England
Kevin Blighe41k wrote:

I encourage you to follow up on my points here with a statistician in your department. Always better to get more brains looking at the same thing. There are various ways to calculate confidence intervals for series of data.

You could start with a typical formula for the standard error of the mean (SE mean), which is the standard deviation divided by the square root of the sample n:

σ / sqrt (n)

Lambda is a median value, so, we have to convert the SE mean to SE median:

1.253 * (σ / sqrt (n))

(various articles on the WWW validate this)

As you are dealing with a Chi-squared quantile distribution, in order to calculate the confidence intervals, I believe you can do this with the following:

pvalue <- runif(1000000, min=0, max=1)
chisq <- qchisq(1 - pvalue, 1)
lambda <- median(chisq) / qchisq(0.5, 1)

lambda 
[1] 0.9963626

SE.median <- qchisq(0.975, 1) * (1.253 * ( sd(chisq) / sqrt( length(chisq) ) ) )

SE.median
[1] 0.008914307

lambda + (SE.median / qchisq(0.5, 1))
[1] 1.015957

lambda - (SE.median / qchisq(0.5, 1))
[1] 0.976768

I increased the number of random p-values to 1 million just to make the confidence intervals tighter. At 1000 (your original post), they were pretty wide.

Take this as a first step into this. I encourage you to look up all functions / formulae that I have used.

Kevin

ADD COMMENTlink written 14 months ago by Kevin Blighe41k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1485 users visited in the last hour