Question: Multidimensional Scaling Analysis
4.7 years ago, User000 wrote:

I am analysing genotyping-by-sequencing (GBS) data, and I have a .vcf file as a result. I have 17 individuals with different SNP loci, and I have filtered out SNP loci with more than 90% missing data. I was wondering whether it makes sense to run an MDS analysis in PLINK to see the distances between these 17 individuals. Using the same program, I have also obtained an IBS matrix and produced a dendrogram from it in R. Does this make sense? Any suggestions, comments or guidelines are appreciated.

P.S. The 17 individuals belong to the same species, but they were grown in different places.
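For readers who want to see what an IBS-based distance amounts to, here is a minimal Python sketch. It is not PLINK's implementation. It assumes genotypes are coded as alternate-allele counts (0, 1 or 2) with `None` for a missing call, and the function names `ibs_distance` and `distance_matrix` are made up for illustration. It also returns the number of loci actually compared, which is worth checking when a lot of data is missing.

```python
from typing import List, Optional, Tuple

# Assumption: 0, 1 or 2 copies of the alternate allele; None marks a missing call.
Genotype = Optional[int]


def ibs_distance(a: List[Genotype], b: List[Genotype]) -> Tuple[float, int]:
    """Return (1 - mean IBS proportion, number of loci compared) for two individuals."""
    shared = 0.0
    compared = 0
    for ga, gb in zip(a, b):
        if ga is None or gb is None:
            continue  # skip loci where either individual has no call
        shared += 1.0 - abs(ga - gb) / 2.0  # fraction of alleles shared: 1.0, 0.5 or 0.0
        compared += 1
    if compared == 0:
        return float("nan"), 0  # no jointly genotyped loci, distance undefined
    return 1.0 - shared / compared, compared


def distance_matrix(genotypes: List[List[Genotype]]) -> List[List[float]]:
    """Symmetric matrix of pairwise IBS distances, one row per individual."""
    n = len(genotypes)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d, _ = ibs_distance(genotypes[i], genotypes[j])
            dist[i][j] = dist[j][i] = d
    return dist
```

A pair of individuals that shares only a handful of genotyped loci gets a distance based on very little evidence, so the second return value of `ibs_distance` is a quick sanity check before interpreting any plot.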

gbs snp mds plink

A bit off-topic, but I bet that since the SNPs are quite sparse across 17 individuals, a more sophisticated clustering method (one that compensates for the missing information by using known gene-gene relationships) is needed. I would recommend checking out this paper (http://www.nature.com/nmeth/journal/v10/n11/full/nmeth.2651.html), which uses gene networks to perform robust clustering of SNP profiles.

written 4.7 years ago by mikhail.shugay
4.7 years ago, tommivat (Finland) wrote:

This is a very common visualization problem, where the most important part is to define a proper similarity function to obtain a similarity matrix for your individuals. With a quick Google search, I found this paper, which may give you some starting points. After you have generated a similarity matrix (you might already have a good one), you can use any visualization tool that takes this matrix as input. My current favourite is t-SNE, which has turned out to work very well for many kinds of complex data. If you are satisfied with the MDS result, though, there is no reason to try other (possibly more advanced) techniques.
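To make the "matrix in, 2-D picture out" step concrete, here is a small sketch of classical (metric) MDS in Python with NumPy. It assumes you already have a symmetric distance matrix (for example 1 - IBS similarity). The function name `classical_mds` is just illustrative, and this is not necessarily the exact procedure any particular tool runs internally.

```python
import numpy as np


def classical_mds(dist, n_components: int = 2) -> np.ndarray:
    """Embed n items in n_components dimensions from an (n, n) symmetric distance matrix."""
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared distances to get an inner-product (Gram) matrix.
    gram = -0.5 * centering @ (d ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)  # eigenvalues come back in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]  # indices of the largest eigenvalues
    # Negative eigenvalues can appear for non-Euclidean distances; clip them to zero.
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)


if __name__ == "__main__":
    # Toy example: three points forming a 3-4-5 right triangle.
    toy = [[0.0, 3.0, 4.0],
           [3.0, 0.0, 5.0],
           [4.0, 5.0, 0.0]]
    print(classical_mds(toy))
```

Each row of the result is one individual's coordinates, so the first two columns can be passed straight to a scatter plot. With only 17 points it is also easy to label every point and check whether the layout agrees with the dendrogram.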


Thanks, I am having a look at the paper. I created a matrix using PLINK (there is a command line that does the whole job) and visualized it using R. I am not sure whether I can trust this analysis, though...
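One way to build some trust in a dendrogram is to reproduce the clustering by hand on the same matrix and see whether the merges agree. Below is a minimal pure-Python sketch of average-linkage (UPGMA) agglomerative clustering. The thread does not say which linkage was used in R, so average linkage is an assumption here, and `average_linkage` is a made-up helper name.

```python
from typing import List, Tuple


def average_linkage(dist: List[List[float]], labels: List[str]) -> List[Tuple[str, str, float]]:
    """Agglomerative clustering (UPGMA); returns merges as (cluster_a, cluster_b, height)."""
    clusters = {i: [i] for i in range(len(labels))}     # cluster id -> member indices
    names = {i: labels[i] for i in range(len(labels))}  # cluster id -> printable name
    merges = []
    next_id = len(labels)
    while len(clusters) > 1:
        best = None
        ids = sorted(clusters)
        for x in range(len(ids)):
            for y in range(x + 1, len(ids)):
                a, b = ids[x], ids[y]
                # Average of all pairwise distances between members of the two clusters.
                total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                height = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or height < best[0]:
                    best = (height, a, b)
        height, a, b = best
        merges.append((names[a], names[b], height))
        # Replace the two merged clusters with a single new one.
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        names[next_id] = "(" + names.pop(a) + "," + names.pop(b) + ")"
        next_id += 1
    return merges


if __name__ == "__main__":
    toy = [[0.0, 0.1, 0.4],
           [0.1, 0.0, 0.6],
           [0.4, 0.6, 0.0]]
    for left, right, height in average_linkage(toy, ["A", "B", "C"]):
        print(left, right, round(height, 3))
```

If the merge order from a sketch like this, the dendrogram from R and the MDS plot all tell the same story, that is reasonable evidence the picture is not an artefact of one particular tool.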

written 4.7 years ago by User000
4.7 years ago, Devon Ryan (Freiburg, Germany) wrote:

It'll never hurt to do an MDS; in fact, it can only be neutral or helpful.


In the literature, MDS is used to assign many individuals to several groups, whereas I have only 17 individuals with many SNP loci. So basically, can I observe their distances and relationships on the basis of the SNPs?

written 4.7 years ago by User000

It'll depend on the population structure. You might give it a go in any case. Also do have a look at the paper mikhail.shugay linked to.

written 4.7 years ago by Devon Ryan