GWAS - approximate odds ratio and standard error
3.5 years ago

Hello! I'm using ImpG-Summary (Pasaniuc et al, 2014) to perform genotype imputation from summary statistics. However, for each imputed SNP, I only get a z-score, while odds ratio and standard error are missing. The output file has - for each SNP - six columns: name of SNP, position of SNP, Ref allele, Alt allele, Z-score, r2pred; the higher this latter value is, the more confident one is about the result. Can I approximate Odds Ratio, effect size, and standard error somehow? Please let me know. Thanks in advance!

GWAS imputation • 5.1k views

Hey, you will have to provide more information, as one can only hypothesise about the specifics of your analysis based on the current information that you've provided. For example, which imputation program have you used and in which format is your data currently?

If you are interested in obtaining odds ratios and standard errors (between 2 conditions of interest, I assume?), then the basic association test of PLINK is a good starting point: http://zzz.bwh.harvard.edu/plink/anal.shtml

This is an interesting question, but I do not believe that you have enough information to calculate what you want. There was a similar question posted ~7 months ago: "convert GWAS Zscore back to OR"

If you had summary statistics prior to using ImpG-Summary, then what did these summary statistics contain?

I had summary statistics containing ref allele, alt allele, MAF in cases, MAF in controls, chisq, p-value, OR, L95, U95, ln(OR), SE of ln(OR)

I would contact the authors. Their emails are at the beginning of the manual: http://bogdan.bioinformatics.ucla.edu/wp-content/uploads/sites/3/2013/07/ImpG_v1.0_User_Manual_31July13.pdf

From the Z-score alone, it may be difficult to calculate the OR - there is too much information missing. The formula is something like:

Z-score = ln(OR) / SE(ln(OR))

ln is the natural log

SE(ln(OR)) is the standard error of the log odds ratio, which can be recovered from the 95% confidence interval of the OR (L95, U95) as:

(ln(U95) - ln(OR)) / 1.96


I may try to work through this formula later on a scrap of paper.
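
For anyone who wants to check these relationships numerically, here is a minimal Python sketch. It assumes the usual Wald-type setup, where Z = ln(OR) / SE(ln(OR)) and the 95% interval is symmetric on the log scale. The function names are made up for illustration and are not part of ImpG or PLINK.

```python
import math

def se_from_ci(lower95, upper95):
    """Approximate SE of ln(OR) from the 95% CI limits of the OR.

    Assumes the interval is ln(OR) +/- 1.96 * SE on the log scale.
    """
    return (math.log(upper95) - math.log(lower95)) / (2 * 1.96)

def z_from_or(odds_ratio, se_log_or):
    """Z-score implied by an odds ratio and the SE of its log."""
    return math.log(odds_ratio) / se_log_or

def or_from_z(z, se_log_or):
    """Invert the relationship: OR = exp(Z * SE(ln(OR)))."""
    return math.exp(z * se_log_or)
```

The inversion only works if SE(ln(OR)) is known or can be approximated for the imputed SNP, which is exactly the missing piece here.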

One of the authors actually states that the imputed Z-score divided by the square root of the sample size yields the effect size (beta) on the standardized scale (i.e. when both phenotype and genotype are standardized to have mean 0 and variance 1). I already contacted them through their GitHub repository (https://github.com/huwenboshi/ImpG/issues, issue #5). The same relationship is described in Pasaniuc & Price, 2017 (Nature Reviews Genetics).
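
As a rough sketch of that relationship in Python: beta on the standardized scale is Z / sqrt(N), and since Z = beta / SE, the implied standard error is about 1 / sqrt(N). The function name is hypothetical, and treating N as the total sample size is an assumption on my part.

```python
import math

def standardized_beta(z, n):
    """Effect size on the standardized scale (phenotype and genotype
    both scaled to mean 0, variance 1), with its implied SE.

    z: imputed Z-score; n: sample size (assumed to be the total N).
    """
    beta = z / math.sqrt(n)
    se = 1.0 / math.sqrt(n)  # follows from se = beta / z
    return beta, se
```

Note that this gives a standardized effect, not a log odds ratio, so it is not directly comparable to the ln(OR) column of the original summary statistics.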

Okay, let me know what they say. I'd be interested.

It looks like we could calculate the ORs from the Z-scores but we would be making a few assumptions along the way without direct evidence.

One thing I noticed is that in my dataset, where the sample size is the same for all SNPs, SE of ln(OR) and allele frequency are well correlated, so it is easy to approximate SE within narrow intervals of allele frequency. For example, all SNPs with allele frequency between 0.40 and 0.41 tend to have the same SE. I can thus estimate SE for each allele frequency range from the typed SNPs and assign it to the imputed SNPs. ln(OR) is then easy to compute: z-score * SE. I guess SE is mainly a function of sample size and allele frequency.
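
The binning idea above could be sketched in Python roughly like this. It assumes the typed SNPs come as (allele frequency, SE of ln(OR)) pairs and that an allele frequency is available for each imputed SNP (for example from a reference panel, since the ImpG output does not include one). The function names, the 0.01 bin width, and the use of the median are illustrative choices, not something taken from ImpG.

```python
import math
from collections import defaultdict
from statistics import median

def freq_bin(freq, width=0.01):
    """Index of the allele-frequency bin (small epsilon guards float error)."""
    return int(math.floor(freq / width + 1e-9))

def build_se_table(typed_snps, width=0.01):
    """Median SE of ln(OR) per frequency bin, from (freq, se) pairs."""
    bins = defaultdict(list)
    for freq, se in typed_snps:
        bins[freq_bin(freq, width)].append(se)
    return {b: median(values) for b, values in bins.items()}

def approx_log_or(z, freq, se_table, width=0.01):
    """Return (ln(OR), SE, OR) for an imputed SNP, or None if no typed
    SNP falls in the same frequency bin."""
    se = se_table.get(freq_bin(freq, width))
    if se is None:
        return None
    log_or = z * se
    return log_or, se, math.exp(log_or)
```

Bins with few typed SNPs will give noisy SE estimates, so it may be worth checking how many SNPs support each bin before trusting the result.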

There is a paper that describes converting z-statistics back to effect sizes: https://www.ncbi.nlm.nih.gov/pubmed/27019110. See the Online Methods section. Basically, beta = z / sqrt(2p(1-p)(n + z^2)), where p is the allele frequency and n is the sample size, so it requires both of these in addition to the z-score. Hopefully this helps.
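
A small Python sketch of that conversion, using the formula as quoted above. The matching standard error follows from SE = beta / z. The function name is hypothetical, and how well this approximates ln(OR) for a case-control trait is not something this thread establishes, so treat it as an approximation.

```python
import math

def beta_se_from_z(z, p, n):
    """Approximate effect size and SE from a z-score.

    z: z-statistic; p: allele frequency (0 < p < 1); n: sample size.
    beta = z / sqrt(2p(1-p)(n + z^2)); SE = beta / z.
    """
    denom = math.sqrt(2 * p * (1 - p) * (n + z ** 2))
    return z / denom, 1.0 / denom
```

For example, `beta_se_from_z(2.0, 0.5, 9996)` returns a beta and SE whose ratio is the original z-score of 2.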

Sure it helps! Thanks a lot!