Question: Some 'NA's in plink logistic regression results
Mel (United Kingdom) wrote, 6.6 years ago:

Hi All,

When I perform logistic regression on my data using PLINK, some SNPs (but not all) have NA in the OR, SE, L95, U95, STAT and P columns.

What does it mean for that SNP? Is it associated?

Thanks in advance

chrchang523 (United States) wrote, 6.6 years ago:

This can happen if all your genotypes at a locus are identical (no association analysis is possible then); have you checked whether that's the case?
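
As a rough illustration of that check, the sketch below tests whether every non-missing genotype call at a locus is identical. It is not PLINK code: the function name `is_monomorphic`, the 0/1/2 additive coding, and the use of `None` for missing calls are all assumptions made for this example.

```python
def is_monomorphic(genotypes):
    """Return True if all non-missing genotype calls at a locus are identical.

    `genotypes` is assumed to hold additive codings (0, 1 or 2 copies of the
    minor allele), with None marking a missing call. This coding is an
    assumption for the sketch, not PLINK's internal representation.
    """
    observed = {g for g in genotypes if g is not None}
    # Zero or one distinct value means there is no variation to regress on.
    return len(observed) <= 1


# Hypothetical example loci.
snp_a = [0, 0, 0, None, 0]   # every called genotype is the same
snp_b = [0, 1, 0, 2, None]   # genotypes vary across samples

print(is_monomorphic(snp_a))
print(is_monomorphic(snp_b))
```

A locus flagged this way carries no information about the phenotype, so an NA result for it would be expected rather than a sign of a problem.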


Hi chrchang523,

Thanks for the reply!

Yes, I've checked this. I'm looking at one particular group of SNPs. When I run a straightforward --assoc test, all of these SNPs have a nonzero minor allele frequency in cases, controls, or both. For SNPs with a minor allele frequency of 0 in either cases or controls, I'd expect the logistic regression to be NA (see SNP 1 in the table). But some SNPs have a nonzero MAF in both cases and controls, and yet the logistic regression is still NA (see SNPs 2 and 3 in the table).

snpName    MAFcase    MAFcontrol    LogisticRegressionPvalue    OR        L95       U95
1          4.014      0             NA                          NA        NA        NA
2          4.119      0.6536        NA                          NA        NA        NA
3          0.1144     2.288         NA                          NA        NA        NA
4          17.73      30.59         1.71E-02                    0.5943    0.3875    0.9115

Could it be that the MAF is too small for SNPs 2 and 3, in the same way that a chi-squared test needs a minimum of 4 observations to work?

Reply written 6.6 years ago by Mel (modified 12 months ago by _r_am)

The other two most likely issues are:

  1. Multicollinearity. If you have a covariate (or a linear combination of covariates) that behaves almost identically to the genotype, the regression doesn't converge to a unique solution.
  2. Random convergence failure, even though mathematically there shouldn't be a problem; the logistic regression algorithm employed by PLINK isn't perfect. It was recently updated, though; you might want to check whether PLINK 1.9 also gives you NAs.
Reply written 6.6 years ago by chrchang523 (modified 12 months ago by _r_am)
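
One quick way to screen for the multicollinearity case is to correlate the genotype vector with each covariate. The sketch below does this in plain Python. The names `pearson_r` and `looks_collinear`, the example data, and the 0.99 threshold are assumptions chosen for illustration, and the check only covers a single covariate at a time: a linear combination of several covariates that mimics the genotype would need a fuller diagnostic such as a variance inflation factor.

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Returns NaN when either sequence has zero variance.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def looks_collinear(genotype, covariate, threshold=0.99):
    """Flag a covariate whose |r| with the genotype reaches the threshold.

    The threshold is an arbitrary value picked for this sketch.
    """
    r = pearson_r(genotype, covariate)
    return not math.isnan(r) and abs(r) >= threshold


# Hypothetical data: additive genotype codings and two covariates.
genotype = [0, 1, 2, 1, 0, 2]
shifted = [1.0, 2.0, 3.0, 2.0, 1.0, 3.0]    # genotype + 1, perfectly collinear
unrelated = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]  # only weakly correlated

print(looks_collinear(genotype, shifted))
print(looks_collinear(genotype, unrelated))
```

If a covariate is flagged for an NA SNP, dropping or recoding that covariate and rerunning the regression is a reasonable next step before suspecting a convergence failure.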