scatter plot for DAPC
4.6 years ago
evelyn ▴ 230

I am running DAPC on a SNP dataset using:

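library(adegenet)
x <- input_file                              # SNP table already loaded into R
x1 <- as.data.frame(t(x))                    # transpose the table before conversion
gen <- as.genlight(x1)                       # convert to a genlight object
grp <- find.clusters(gen, max.n.clust = 10)  # interactive: asks for number of PCs and clusters
dapc1 <- dapc(gen, grp$grp)                  # interactive: asks for number of PCs and discriminant functions
scatter(dapc1)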

But the scatter plot shows clusters without the individual samples represented as dots. I want to make a scatter plot which represents individual samples as dots in the clusters. I am not sure what is wrong or missing in my code. Thank you for the help!

snp • 4.4k views

What does the output look like? I don't understand what clusters shown without data points mean.
Some possibilities that come to mind: the scatter() function could have been redefined in your environment or your data points all fall into the cluster centres. Try using parameter pch = 19 and other customization parameters to make the points visible.


In the output, I only get the boxes at the cluster centres. I have found an example here that looks similar to what I am getting. I think customization would help to modify the data points, but in my case no data points are shown at all.


If your plot is like the example you linked to, the issue seems to be that the data points all pile up onto the cluster centres and are masked by the cluster label (i.e. the square with the cluster number). Check the next plot in the example to see what I mean.
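
If the labels are indeed hiding the points, one thing to try is switching the labels off in scatter() and relying on a legend instead. The sketch below is only an illustration: it uses the built-in iris measurements as a stand-in for SNP data (an assumption, so that it runs without the original input file), and toy_dapc is a hypothetical object name. The arguments clabel, cstar, cellipse, pch, cex, legend, txt.leg and posi.leg are, to my knowledge, arguments of the scatter method for dapc objects in adegenet; check ?scatter.dapc for your installed version.

```r
library(adegenet)

# Stand-in data (assumption): iris measurements instead of SNPs,
# with the species used as the known groups.
toy_dapc <- dapc(iris[, 1:4], iris$Species, n.pca = 3, n.da = 2)

# clabel = 0 removes the boxed cluster labels that can hide the points;
# cstar = 0 removes the lines joining each point to its cluster centre.
scatter(toy_dapc,
        clabel = 0, cstar = 0, cellipse = 1.5,
        pch = 19, cex = 1.2,
        legend = TRUE,
        txt.leg = levels(toy_dapc$grp),
        posi.leg = "topright")
```

With the objects from the question, the equivalent call would presumably be scatter(dapc1, clabel = 0, cstar = 0, pch = 19, legend = TRUE).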


This is exactly what happened to me. The labels were covering up the individuals. I thought I was losing my mind. Thank you!


Hello Jenna, were you able to solve the problem in the end? At the link above (posted by Jean-Karim Heriche) there is a graph with the same problem, but I could not find the solution ANYWHERE!


If your data points pile up on top of each other, this means they have the same coordinates. The solution is to make the coordinates different, either upstream by changing the way you process your data, or at the plot level by adding a bit of noise to the coordinates. This is what the R jitter() function is for.
If points are covered by labels, then remove the labels. It's generally a bad idea to put labels in a plot for exactly this reason. A legend is preferable or, if that's not suitable, put the labels outside the plot and use light lines to link them to points or areas in the plot.
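
As a rough illustration of the jitter() idea, here is a minimal base R sketch. The coordinates and group assignments are made-up toy data (an assumption, not output from DAPC), and the jitter amount of 0.2 is arbitrary; it would need to be small relative to the spread of the real data.

```r
set.seed(1)

# Toy data (assumption): 30 points stacked on only three distinct positions
coords <- data.frame(x = rep(c(-2, 0, 2), each = 10),
                     y = rep(c(1, -1, 1), each = 10))
groups <- factor(rep(1:3, each = 10))

# Add uniform noise in [-0.2, 0.2] to each coordinate so points no longer overlap
jittered <- data.frame(x = jitter(coords$x, amount = 0.2),
                       y = jitter(coords$y, amount = 0.2))

# Colour by group and describe the groups in a legend rather than with labels
plot(jittered, col = as.integer(groups), pch = 19,
     xlab = "Axis 1", ylab = "Axis 2")
legend("bottomright", legend = paste("Cluster", levels(groups)),
       col = seq_along(levels(groups)), pch = 19)
```

Note that jittering changes the plotted positions, so it is best kept for display only and not applied to the coordinates used in any downstream analysis.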


Thank you very much Jean-Karim. The problem is that the points do not have the same coordinates, nor do the labels cover them. I know this because if I plot the data using a simple plot function

plot(dapc1$tab[,1:2], col=grp$grp, pch=c(grp$grp))

then I can clearly see all the points!

How do you suggest I proceed? Thank you again!!

[Figure: Plot function]

[Figure: Scatter plot]


So this is probably a different problem and belongs in its own post. It is also probably not bioinformatics-specific but an R programming question, maybe more suited to StackOverflow. I would look into how you wrote your plotting code, for example whether you are plotting only the labels. Again, I think plotting both the labels and the points is a bad idea. Convert the labels into a legend and color/shape the points accordingly.
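
One way to turn the labels into a legend, sketched in base R. The helper plot_groups() is a hypothetical name introduced here for illustration, and the toy coordinates are made up; nothing below comes from the original poster's data. For a dapc object, the coordinates of individuals on the discriminant functions should be in the ind.coord element and the group assignments in grp (whereas tab holds the retained principal components), but please check this against str(dapc1) for your own object.

```r
# Hypothetical helper: scatter plot with one colour/shape per group
# and a legend placed in the right margin instead of labels on the plot.
plot_groups <- function(coords, groups, xlab = "Axis 1", ylab = "Axis 2") {
  groups <- as.factor(groups)
  n_grp  <- nlevels(groups)
  key <- data.frame(group = levels(groups),
                    col   = seq_len(n_grp),
                    pch   = rep_len(15:19, n_grp))
  idx <- as.integer(groups)

  # Widen the right margin and allow drawing outside the plot region
  op <- par(mar = c(5, 4, 2, 9), xpd = TRUE)
  on.exit(par(op))

  plot(coords[, 1], coords[, 2],
       col = key$col[idx], pch = key$pch[idx],
       xlab = xlab, ylab = ylab)
  legend("topright", inset = c(-0.35, 0), title = "Cluster",
         legend = key$group, col = key$col, pch = key$pch, bty = "n")
  invisible(key)
}

# Toy data (assumption): three groups of 20 points each
set.seed(42)
toy_groups <- rep(1:3, each = 20)
toy_coords <- cbind(rnorm(60, mean = 2 * toy_groups),
                    rnorm(60, mean = toy_groups %% 2))
key <- plot_groups(toy_coords, toy_groups)
```

With a DAPC result that retained at least two discriminant functions, the call would presumably look like plot_groups(dapc1$ind.coord[, 1:2], dapc1$grp, xlab = "LD1", ylab = "LD2").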

3.9 years ago
adrien.cv • 0

Hi, I tried to modify the "pch" parameter in order to see the samples and the ellipse around the cluster number, but nothing changed.

scatter(dapc6, posi.da="bottomright", bg="white",pch=17:22, cstar=0, col=myCol, scree.pca=TRUE, posi.pca="bottomleft", txt.leg=paste("Cluster", 1:6), legend=TRUE)

Does anyone have an idea? Thank you for the help!


Don't post a question as an answer to a previous question. Use a comment and clarify how this relates to the original question or post a new question. Either way, what you're trying to achieve is not clear.


Hello adrien.cv. I am having the same issue you describe with scatter(dapc). The plot shows 4 groups (as in 4 numbered boxes) but no individual samples. I know it was a long time ago, but were you able to identify and fix this issue with your data? Thank you!


I have the same problem!
