I have a very sparse SNP matrix (~90% of genotypes are missing per sample, and ~90% of samples are missing per SNP) on which I would like to perform some sort of probabilistic PCA. I have been using the VariantAnnotation package to get my SnpMatrix object, and originally tried to mimic the method shown here (https://www.bioconductor.org/packages/release/bioc/vignettes/snpStats/inst/doc/pca-vignette.pdf) with the snpStats package. However, I don't believe this package was intended to work with extremely sparse SNP matrices, and it struggles to correct for the missing values in the matrix.
I have also tried the ppca function from the pcaMethods package, but have not had much success in finding any clusters of cells. Does anyone have experience running PCA on very sparse matrices?
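To make the sparsity concrete, here is a minimal base-R sketch of how the per-sample and per-SNP missingness can be measured and used to trim the matrix before any PCA. It is only an illustration under stated assumptions: the genotype matrix is simulated (rows as samples, columns as SNPs, coded 0/1/2, NA for missing), the helper name filter_by_missingness is hypothetical, and the cutoffs 0.9 and 0.95 are arbitrary placeholders rather than recommended values.

```r
# Toy genotype matrix: rows = samples (cells), columns = SNPs, coded 0/1/2.
# All values are simulated purely for illustration.
set.seed(1)
n_samples <- 50
n_snps <- 200
geno <- matrix(sample(0:2, n_samples * n_snps, replace = TRUE),
               nrow = n_samples, ncol = n_snps)
geno[runif(length(geno)) < 0.9] <- NA  # knock out roughly 90% of calls

# Fraction of missing calls per sample (row) and per SNP (column).
sample_missing <- rowMeans(is.na(geno))
snp_missing <- colMeans(is.na(geno))

# Hypothetical helper: first drop SNPs above the SNP cutoff, then drop
# samples above the sample cutoff (recomputed on the remaining SNPs).
filter_by_missingness <- function(g, max_snp_missing, max_sample_missing) {
  g <- g[, colMeans(is.na(g)) <= max_snp_missing, drop = FALSE]
  g[rowMeans(is.na(g)) <= max_sample_missing, , drop = FALSE]
}

# Arbitrary example cutoffs; sensible values depend on the data.
geno_filtered <- filter_by_missingness(geno,
                                       max_snp_missing = 0.9,
                                       max_sample_missing = 0.95)
dim(geno_filtered)
```

The filtered numeric matrix still contains NAs, so it would still need a method that tolerates missing values, such as the ppca approach mentioned above. Whether trimming like this helps the clustering is an open question here, not something this sketch demonstrates.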