Hi All.
I'm using lmekin (http://cran.r-project.org/web/packages/coxme/coxme.pdf) to evaluate a linear mixed model that takes into account the samples' kinship -- in fact my dataset includes related individuals. The model I'm using is quite simple:
Transcript_level ~ covariate + (1|famID)
and I'm passing the kinship matrix, created with kinship
(R package kinship2), using the varlist
parameter. Specifically, the code I'm using is:
model <- lmekin( Transcript_level ~ covariate + (1|famID), data=df, varlist=kinship.matrix))
Unfortunately the evaluation fails because the Cholesky factorisation can't be performed:
Warning message:
In .local(x, ...) :
Cholmod warning 'not positive definite' at file ../Cholesky/t_cholmod_rowfac.c, line 431
Error in t(solve(chol(as(vmat, "dsCMatrix"), pivot = FALSE))) :
error in evaluating the argument 'x' in selecting a method for function 't': Error in solve(chol(as(vmat, "dsCMatrix"), pivot = FALSE)) :
error in evaluating the argument 'a' in selecting a method for function 'solve': Error in .local(x, ...) :
internal_chm_factor: Cholesky factorization failed
I have tried to evaluate the Cholesky decomposition of my kinship matrix using
chol(as(kinship.matrix, "dsCMatrix"), pivot = FALSE)
as highlighted in the code and it works. However, when using the function Matrix::Cholesky
, I obtain:
Error in Cholesky(kinship.matrix) :
internal_chm_factor: Cholesky factorization failed
In addition: Warning message:
In Cholesky(mykin) :
Cholmod warning 'not positive definite' at file ../Cholesky/t_cholmod_rowfac.c, line 431
that it seems to me the same error raised by lmekin. Does anyone has any suggestions?
Thanks in advance for your help,
Alessia
Providing a small example of my data is a bit tricky, but, if necessary, I can try.
> sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-apple-darwin10.8.0 (64-bit)
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] splines grDevices utils datasets graphics stats methods
[8] base
other attached packages:
[1] coxme_2.2-3 nlme_3.1-118 bdsmatrix_1.3-2 survival_2.37-7
[5] Matrix_1.1-4 Rdym_0.1.0 reshape_0.8.5 ggplot2_1.0.0
loaded via a namespace (and not attached):
[1] colorspace_1.2-4 digest_0.6.4 grid_3.1.1 gtable_0.1.2
[5] lattice_0.20-29 MASS_7.3-35 munsell_0.4.2 plyr_1.8.1
[9] proto_0.3-10 Rcpp_0.11.3 reshape2_1.4 scales_0.2.4
[13] stringr_0.6.2 tools_3.1.1
My kinship matrix is a real symmetric non positive-definite matrix. Nevertheless,
chol(kinship.matrix)
works -- this is something I have never understood, so if someone can help me also with this I will really appreciate!Summarising
kinship.matrix
is real, symmetric, normal and hermitian, but it is not positive-definite, it has negative eigenvalue and its determinant is zero (that makes it also non invertible). The error is raised by the Cholesky decomposition within lmekin, not in general.As far as I know, failing in this situation is the correct thing to do. I suspect that your matrix has negative eigenvalues close to 0. When these are within the numerical tolerance of the algorithm implementation, you don't get an error. You should be very careful with the results though.
The Cholesky decomposition makes use of the fact that the matrix is positive (semi-)definite so applying it when the basic assumptions are not met is risky business at best. I suggest you factorize your matrix with SVD.
You are indeed right. The matrix has negative eigenvalues close to 0 (i.e., -6.013708e-17).
With "factorize your matrix with SVD" do you mean by using SVD instead of Cholesky? To do so I need to modify the
lmekin
code, and this is not straightforward (not sure if it is feasible either). However, I have noticed that there exist functions that compute the nearest positive matrix, such asMatrix::nearPD
,lqmm::``make.positive.definite
andcorpcor::``make.positive.definite
. What is your opinion about using one of these positive definite matrices instead of the original one?Yes, I mean replace Cholesky with SVD or any another factorization (e.g. QR factorization) that doesn't make assumption on the matrix being positive definite. Alternatively, you could try to understand why the matrix is not positive definite and maybe work with a different representation of the data. For example if it's a covariance matrix, there may be highly correlated variables and you could work in PCA space. You could also add -λ to the diagonal of your matrix with λ being the smallest eigenvalue to make it positive semidefinite. Or you could compute the eigendecomposition, set the negative eigenvalues to 0 and reconstruct the matrix from its decomposition. These would minimally change the data because your matrix is not far from being positive definite. If you had negative eigenvalues far from 0, then finding the closest positive definite matrix would be the best option.
Thanks. Your reply helped me a lot!