7.2 years ago
xingshunkai
First, I have a SNP set associated with a particular disease. The set was filtered from GWAS results reported in a number of papers (3-30 SNPs selected). I then obtained SNP records for 1000 samples (there is no phenotype information; only the genotype at each SNP is specified). I want to cluster these data, either to classify the samples or to classify SNP compositions based on probability.
The clusters might reflect disease severity (for example healthy, moderate, severe) or different defects in a complicated pathway.
So, how could I do that? What kind of clustering method or probability model should I use? How should I design the dissimilarity matrix?
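To make the dissimilarity question concrete, here is a minimal sketch of one option I could imagine. It assumes each genotype is coded as a minor-allele count (0, 1 or 2), which is my assumption about the encoding and not something fixed by the data. It uses a simple allele-sharing distance: the mean absolute difference in allele count, scaled to [0, 1]. The function names and the toy genotypes are hypothetical, and this is only an illustration, not a claim that this is the right metric.

```python
def allele_sharing_distance(a, b):
    """Mean absolute difference in minor-allele count, scaled to [0, 1].

    Assumes both samples are lists of genotypes coded 0/1/2 over the same SNPs.
    """
    if len(a) != len(b):
        raise ValueError("samples must cover the same SNPs")
    return sum(abs(x - y) for x, y in zip(a, b)) / (2 * len(a))


def dissimilarity_matrix(samples):
    """Build a symmetric sample-by-sample dissimilarity matrix."""
    n = len(samples)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = allele_sharing_distance(samples[i], samples[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix


# Hypothetical toy genotypes: 4 samples x 3 SNPs (not real data)
samples = [
    [0, 0, 0],
    [0, 1, 0],
    [2, 2, 2],
    [2, 1, 2],
]
D = dissimilarity_matrix(samples)
```

A matrix like `D` could then be passed to a hierarchical clustering routine and the tree cut into a small number of groups. Whether those groups correspond to severity levels is exactly what I am unsure about.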
Data
SNP Annotation
SNP Sample Record