Question: How to make a PCA plot in R from PLINK .eigenval and .eigenvec files
13 days ago by
amitgourav.ghosh1230 wrote:

I have plink.eigenval and plink.eigenvec files from running the --pca operation in PLINK. Could you please suggest how to proceed in R to make a PCA plot? Thank you!

For reference: plink.eigenval-

snp R • 162 views
ADD COMMENTlink modified 13 days ago by Devon Ryan77k • written 13 days ago by amitgourav.ghosh1230

I plot PC1 and PC2 from the .eigenvec file (the 3rd and 4th columns) as a scatter plot in R. You can save these columns to a CSV (comma-separated) file and plot them.
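A minimal sketch of that approach, reading the file directly rather than going through a CSV. It assumes the .eigenvec file has no header line and is laid out as two ID columns followed by one column per PC; the toy file and its values are made up so that the snippet runs on its own.

```r
# Toy stand-in for plink.eigenvec: two ID columns, then one column per PC (made-up values)
toy <- c("FAM1 S1 0.0147 0.0364 -0.0160",
         "FAM2 S2 -0.0211 0.0102 0.0054",
         "FAM3 S3 0.0032 -0.0418 0.0127")
path <- tempfile(fileext = ".eigenvec")
writeLines(toy, path)

# Assumption: the file has no header line, so the columns are named here
eigenvec <- read.table(path, header = FALSE, stringsAsFactors = FALSE)
colnames(eigenvec) <- c("FID", "IID", paste0("PC", seq_len(ncol(eigenvec) - 2)))

# Columns 3 and 4 hold PC1 and PC2
plot(eigenvec$PC1, eigenvec$PC2, xlab = "PC1", ylab = "PC2", pch = 16)
```

With real data, replace `path` with the location of your own plink.eigenvec file. If your file does start with a header line, use `header = TRUE` and skip the renaming step.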

ADD REPLYlink written 13 days ago by BAGeno60

Yes, use the eigenvec file
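The .eigenval file is not needed for the scatter plot itself, but it can be used to annotate the axes. The sketch below assumes the file holds one eigenvalue per line; the values are made up. Note that the percentages are relative to the PCs that were written out, which is not necessarily the total variance in the data.

```r
# Made-up eigenvalues standing in for plink.eigenval (one value per line)
path <- tempfile(fileext = ".eigenval")
writeLines(c("4.0", "3.0", "2.0", "1.0"), path)

eigenval <- scan(path, quiet = TRUE)

# Share of each PC relative to the PCs present in the file, not to total variance
pve <- 100 * eigenval / sum(eigenval)

# Axis labels that can be passed to plot() via xlab and ylab
xlab <- sprintf("PC1 (%.1f%%)", pve[1])
ylab <- sprintf("PC2 (%.1f%%)", pve[2])
```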

ADD REPLYlink written 13 days ago by Kevin Blighe14k

Thank you Sir! I did use that file; now I am trying to add colours to the graph according to groups (ethnicities).

ADD REPLYlink written 7 days ago by amitgourav.ghosh1230

Thank you very much for your kind advice! It was successful. Now I am trying to add colours to the plot according to the groups (ethnicities) and finding it a bit daunting.

Thank you very much again!

ADD REPLYlink written 7 days ago by amitgourav.ghosh1230

The most basic way to do that is to create a colour vector whose order matches the order of your samples in the input data, and then to use this vector via the col parameter to the plot() function.

samples <- c("GroupA","GroupB","GroupA","GroupA")
colour <- c("royalblue","firebrick1","royalblue","royalblue")
plot(..., col=colour)

There are other, more automated ways to do this, though. I have found that it is good practice to organise your metadata (including colouring) before starting a particular study. These can be regarded as initialisation and global variables.
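One possible way to automate this, as a rough sketch: keep the metadata in a table, define a named colour lookup once, and match samples by ID so that the order of the colour vector always follows the plotted data. All sample IDs, coordinates, and object names below are hypothetical.

```r
# Hypothetical metadata: one row per sample, defined once at the start of the analysis
metadata <- data.frame(
  sample = c("S1", "S2", "S3", "S4"),
  group  = c("GroupA", "GroupB", "GroupA", "GroupA"),
  stringsAsFactors = FALSE
)
group.colours <- c(GroupA = "royalblue", GroupB = "firebrick1")

# Toy PCA coordinates, deliberately in a different sample order
pcs <- data.frame(sample = c("S3", "S1", "S4", "S2"),
                  PC1 = c(0.010, 0.020, 0.015, -0.030),
                  PC2 = c(0.000, 0.010, -0.010, 0.020))

# Align the metadata to the plotting order by sample ID, then look up each colour by name
group  <- metadata$group[match(pcs$sample, metadata$sample)]
colour <- unname(group.colours[group])

plot(pcs$PC1, pcs$PC2, col = colour, pch = 16, xlab = "PC1", ylab = "PC2")
```

Matching by ID means the colours stay correct even if the samples are later sorted or filtered.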

ADD REPLYlink written 7 days ago by Kevin Blighe14k

Wow! Your advice might give me a good head start on how to proceed, or just on how to toy around with the data and learn things in a playful manner. Please accept my gratitude for your kind guidance.

My file looks like the line below (it actually has 1977 individuals after filtering). I can plot the eigenvectors, but I want to colour the dots according to their ethnic groups and show the ethnicity (the second column) in the plot. How can I accomplish that?

GA000217 Abkhasian 0.0147066 0.0363746 -0.0159528 0.0088663

Thank you!

ADD REPLYlink modified 7 days ago • written 7 days ago by amitgourav.ghosh1230

Okay, yes, 1977 is a lot. I presume that your first column is a unique sample ID, whereas the second column is some sort of super population. Abkhasia is around Syria, right?

If you want to colour by the super group, then there should only be a few unique values overall. If the unique values were Abkhasians, Kurds, Turks, and Arabs, then you would do the following to create the colour vector automatically (assuming that your sample IDs and groups are in columns 1 and 2, respectively, of an object called eigenvec):

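# Levels must match the labels in column 2 exactly; unmatched labels become NA
population <- factor(eigenvec[,2], levels=c("Abkhasians", "Kurds", "Turks", "Arabs"))
# One colour per level, selected via the factor's integer codes
col.population <- colorRampPalette(c("royalblue", "red3", "limegreen", "gold"))(nlevels(population))[population]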

This should colour each group as per the order of the listed populations and colours.

Try it out.
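To try this end to end, here is a small self-contained sketch that also adds a legend, so that the population names appear in the plot. The table is toy data with made-up values, and the colours are picked by direct indexing instead of colorRampPalette(), which is a simplification that works when there is exactly one colour per level.

```r
# Toy eigenvec-like table: ID, population label, PC1, PC2 (made-up values)
eigenvec <- data.frame(
  id  = c("S1", "S2", "S3", "S4", "S5"),
  pop = c("Kurds", "Abkhasians", "Turks", "Arabs", "Kurds"),
  PC1 = c(0.010, 0.015, -0.020, -0.030, 0.012),
  PC2 = c(0.020, 0.036, 0.005, -0.010, 0.018),
  stringsAsFactors = FALSE
)

pop.levels  <- c("Abkhasians", "Kurds", "Turks", "Arabs")
pop.colours <- c("royalblue", "red3", "limegreen", "gold")

# Map each sample to the colour at the position of its population level
population <- factor(eigenvec[, 2], levels = pop.levels)
col.population <- pop.colours[as.integer(population)]

plot(eigenvec[, 3], eigenvec[, 4], col = col.population, pch = 16,
     xlab = "PC1", ylab = "PC2")
# The legend lists levels and colours in the same order used for the lookup
legend("topright", legend = levels(population), col = pop.colours, pch = 16)
```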

Note that you can also produce different shapes by supplying a vector of PCH values to the pch parameter of plot(). If the super-populations have 30, 40, 50, and 60 samples respectively, and are grouped together in that order, then we could do this with:

shape <- c(rep(16, 30), rep(15, 40), rep(17, 50), rep(18, 60))
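
If the samples are not grouped together in order, the shape vector can instead be derived from the population labels. This is only a sketch; the labels and the choice of PCH values are illustrative assumptions.

```r
# Toy population labels, deliberately not sorted by group
pop <- c("Turks", "Abkhasians", "Arabs", "Kurds", "Turks")
population <- factor(pop, levels = c("Abkhasians", "Kurds", "Turks", "Arabs"))

# One plotting symbol per level, picked by the factor's integer codes
pch.levels <- c(16, 15, 17, 18)
shape <- pch.levels[as.integer(population)]
```

The resulting vector can be passed to plot() as pch=shape, in the same way as the colour vector is passed to col.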
ADD REPLYlink written 7 days ago by Kevin Blighe14k

Great! I feel this is exactly what I wanted. Thank you Sir!

Regarding Abkhazia, yes, you are almost correct. The region is a bit further north, in Georgia, on the eastern coast of the Black Sea. Last but not least, it is a disputed territory of Georgia that enjoyed some autonomy while it was part of the USSR. It is a pivotal player nowadays in the Russia-Georgia relationship.

ADD REPLYlink written 7 days ago by amitgourav.ghosh1230

Sounds interesting - I will visit some day.

ADD REPLYlink written 7 days ago by Kevin Blighe14k

I just finished counting the unique population groups in the samples. There are 254 different ethnicities among the 1977 samples.

ADD REPLYlink written 6 days ago by amitgourav.ghosh1230

Visually, that will be difficult for our crappy human eyes to distinguish, no? The colour spectrum is mathematically infinite but our ability to distinguish 2 very similar colours is not.

Can you arrange them into even larger groups?
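One way to do such a regrouping, sketched with a lookup vector. The mapping below is purely hypothetical and only illustrates the mechanics; the actual assignment of the 254 ethnicities to broader groups has to come from your own metadata.

```r
# Hypothetical lookup from fine-grained label to a broader group; fill in for your own data
region.of <- c(Abkhasian = "Caucasus", Georgian = "Caucasus",
               Kurd = "Middle East", Arab = "Middle East")

# Toy labels, including one that is missing from the lookup
ethnicity <- c("Abkhasian", "Kurd", "Georgian", "Arab", "Basque")

region <- unname(region.of[ethnicity])
# Labels missing from the lookup come back as NA; park them in a catch-all group
region[is.na(region)] <- "Other"

# Count samples per broader group
table(region)
```

The region vector can then be used in place of the ethnicity column when building the colour vector.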

ADD REPLYlink written 6 days ago by Kevin Blighe14k

Yes, I plan to do so. I tried plotting it in R using ggplot, but the colour legend took up the whole area instead of the plot.

I managed to plot it in Excel with the colours, but I presume it is mostly for visualisation only. The name of the add-in is XLMiner Data Visualisation. If I point the cursor on any dot, it shows its coordinates and ethnicity.

Thank you!

ADD REPLYlink written 6 days ago by amitgourav.ghosh1230