8.0 years ago

Khader Shameer
18k

I am working on a prediction problem that leverages sparse clinical datasets.

The missing-data rate is around 80%.

- Are there any examples of matrix completion applied to clinical or other datasets with such a high missing rate?
- I am currently exploring the glmnet, pcaMethods and softImpute packages. I am also looking for other R packages or SAS routines that can handle such a sparse clinical data matrix and perform matrix completion.
- I would like to assess the reliability of my filled-in values. Is there any metric or score for assessing the quality of a matrix completion?
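
To make the last point concrete, one common way to score filled-in values is to hide a fraction of the entries that are actually observed, run the imputation, and compare the imputed values against the hidden truth (for example with RMSE). Below is a minimal sketch of that idea in plain Python. The function names `column_mean_impute` and `holdout_rmse` and the toy matrix are hypothetical, and the column-mean imputer is only a stand-in, not what pcaMethods or softImpute do; any completion method could be passed in its place.

```python
import math
import random


def column_mean_impute(matrix):
    """Placeholder imputer: fill each None with the mean of the observed
    values in its column (0.0 if the column has no observed values)."""
    n_cols = len(matrix[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in matrix if row[j] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(row)]
            for row in matrix]


def holdout_rmse(matrix, impute, holdout_fraction=0.1, seed=0):
    """Mask a random fraction of the observed entries, impute the masked
    matrix, and return the RMSE on the masked entries only."""
    rng = random.Random(seed)
    observed = [(i, j)
                for i, row in enumerate(matrix)
                for j, v in enumerate(row)
                if v is not None]
    n_hold = max(1, int(len(observed) * holdout_fraction))
    held_out = rng.sample(observed, n_hold)

    # Copy the matrix and hide the held-out entries from the imputer.
    masked = [list(row) for row in matrix]
    for i, j in held_out:
        masked[i][j] = None

    completed = impute(masked)
    squared_errors = [(completed[i][j] - matrix[i][j]) ** 2
                      for i, j in held_out]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy data (made up for illustration); None marks a missing value.
toy = [
    [1.0, None, 3.0],
    [1.2, 2.1, None],
    [None, 1.9, 3.3],
    [0.9, 2.0, 3.1],
]
score = holdout_rmse(toy, column_mean_impute, holdout_fraction=0.25)
print("hold-out RMSE:", score)
```

Repeating this over several random seeds gives a spread rather than a single number, and at an 80% missing rate the held-out fraction probably has to stay small so the imputer still sees enough data. Dividing the RMSE by the standard deviation of the held-out true values gives a scale-free variant that is easier to compare across variables.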

P.S. Cross-posted from here. I solved this problem using a method from pcaMethods; I am posting it here to get thoughts from the biomedical / healthcare datasciencey folks.