How to add a color scheme for many groups in a PCA plot using R?
2.3 years ago

Hi,

I am making a PCA plot in R using PCAtools for different groups of individuals. There are 65 groups in the sample metadata. At the moment, for a small number of groups, I simply specify a color for each of them in colkey (for instance, "Group 1" = "#004181", "Group 2" = "#0077BA", ................, etc). I am not sure how to add colors for a larger number of groups, and specifying them manually would be tedious. Is there a better way to do this?

Plot_Type = "PCA2D"
pdf(paste0(Plot_Type,"_PC1_vs_PC2.pdf"),height = 7,width = 10)
biplot(p,
       x = 'PC1', y = 'PC2',
       lab = NULL,
       colby = 'Group', colkey = c('Group 1' = '#004181', 'Group 2' = '#0077BA', 'Group 3' = '#4E564D', 'Group 4' = '#6BA3D7', 'Group 5' = '#8B63FF'),
       legendPosition = 'right', legendLabSize = 13, legendIconSize = 3.0,
       subtitle = 'PC1 vs. PC2')
dev.off()

I will appreciate any help. Thank you!

Best Regards,

Toufiq

R biplot PCA PCAtools ggplot2 • 4.1k views
2.3 years ago
zx8754 11k

Using rainbow to create n colours, then setNames to create a named vector:

# n = 5 is only a test value; change it to the real number of groups (e.g. 65)
n <- 5

myCols <- setNames(rainbow(n), paste("Group", 1:n))

myCols
#   Group 1   Group 2   Group 3   Group 4   Group 5 
# "#FF0000" "#CCFF00" "#00FF66" "#0066FF" "#CC00FF" 

# then use within your code:
colby = "Group", colkey = myCols,

Note:

  • Nobody can reliably tell 65 colours apart in a single plot.
  • Other packages and functions can create palettes with n colours, if you prefer something other than rainbow.
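
As a rough illustration of the second point, here is a minimal sketch that uses only base R (grDevices). The palette name "Dark 3" and the anchor colours (borrowed from the question) are arbitrary choices for the example, not a recommendation:

```r
# Sketch: alternatives to rainbow() from base R, no extra packages needed.
n <- 65

# Qualitative HCL palette; "Dark 3" is one of the names listed by hcl.pals("qualitative").
cols_hcl <- hcl.colors(n, palette = "Dark 3")

# Interpolate n colours between a few hand-picked anchor colours.
anchors <- c("#004181", "#0077BA", "#4E564D", "#6BA3D7", "#8B63FF")
cols_ramp <- colorRampPalette(anchors)(n)

# Either vector can be named in the same way as the rainbow() example.
myCols <- setNames(cols_hcl, paste("Group", seq_len(n)))
head(myCols, 3)
```

Either result can be passed as colkey exactly like the rainbow() version. With 65 groups the neighbouring colours will still look very similar, whichever palette is used.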

@zx8754

Thank you very much for the prompt response and the additional reference link ("How to generate a number of most distinctive colors in R"). True, 65 colours are difficult to distinguish, but this will still help when there are fewer groups going forward. I like the randomcoloR library for further analysis.

Not all groups are labelled Group 1, Group 2, and so on; there are others such as 3B, 3C, 3ZX, 7DG, 90CC, HC, Cont1,.... In other words, the 65 groups are not uniformly labelled.

With these labels, the `rainbow` approach as written does not work:

myCols <- setNames(rainbow(n), paste("Group", 1:n))

Replace paste("Group", 1:n) with a character vector containing your group names:

> setNames(rainbow(5), paste("Group", 1:5))
  Group 1   Group 2   Group 3   Group 4   Group 5 
"#FF0000" "#CCFF00" "#00FF66" "#0066FF" "#CC00FF" 

> setNames(rainbow(5), colnames(iris))
Sepal.Length  Sepal.Width Petal.Length  Petal.Width      Species 
   "#FF0000"    "#CCFF00"    "#00FF66"    "#0066FF"    "#CC00FF" 
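
For irregular labels like the ones described above, the names can be taken straight from the sample metadata. The sketch below uses a made-up `metadata` data frame as a stand-in for the real table; with a PCAtools object named `p`, the labels would presumably come from `p$metadata$Group` instead, which is an assumption about how the object was built:

```r
# Hypothetical sample metadata; replace with the real table used for the PCA.
metadata <- data.frame(
  Group = c("Group 1", "3B", "HC", "3B", "Cont1", "90CC", "HC"),
  stringsAsFactors = FALSE
)

# One colour per distinct group, named by the group label itself.
groups <- unique(as.character(metadata$Group))
myCols <- setNames(rainbow(length(groups)), groups)

# Check that every label in the data has a colour before plotting.
stopifnot(all(metadata$Group %in% names(myCols)))
myCols
```

The resulting vector goes into the plot as before, with colby = "Group" and colkey = myCols.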

cpad0112,

Thank you very much. This works.
