Hi! I have a question about box plots. I ran a Spearman correlation analysis and see a positive association between a gene and phenotypic expression. For example, I have gene AAAA, where genotypes 1 and 3 are homozygous and 2 is heterozygous. I also have temperature data, which I graded from 1 to 8 (1: 36.0-36.5; 2: 36.6-37.0; etc.). Can I show how genotype correlates with phenotype in a box plot? (For example, a patient with genotype 2 (C/T) could have a temperature grade from 4 to 6.)
Some example data.
df <- data.frame(
  patient = sprintf("C%s", seq_len(10)),
  genotype = sample(seq_len(3), 10, replace = TRUE),
  temperature = sample(seq_len(7), 10, replace = TRUE)
)

> head(df, 5)
  patient genotype temperature
1      C1        1           4
2      C2        3           1
3      C3        1           3
4      C4        2           2
5      C5        1           6
A boxplot using ggplot2.
library("tidyverse")

df %>%
  mutate(genotype = as_factor(genotype)) %>%
  ggplot(aes(x = genotype, y = temperature)) +
  geom_boxplot()
Or a stacked barplot as Hamid Ghaedi correctly pointed out.
df %>%
  mutate(across(!patient, as_factor)) %>%
  ggplot(aes(x = genotype, fill = temperature)) +
  geom_bar(position = "fill") +
  scale_fill_viridis_d()
Important notice: A box plot is usually used to show how a quantitative variable is distributed. It does an especially good job when the variance differs markedly between groups and you expect some outliers (as in gene expression data). Since you have discretized the temperature data, a box plot of the grades no longer makes sense; however, you can plot your original temperature data as a box plot.
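As a rough sketch of plotting the original (undiscretized) temperatures per genotype: the data below are simulated, and the column name `temperature_c` and the 36.0 to 40.0 range are assumptions for illustration only, not the asker's real data.

```r
library(ggplot2)

set.seed(1)  # make the simulated data reproducible

# Hypothetical raw data: one continuous temperature reading per patient
raw <- data.frame(
  genotype = factor(sample(1:3, 30, replace = TRUE)),
  temperature_c = round(runif(30, min = 36.0, max = 40.0), 1)
)

# Box plot of the continuous values, with the individual points overlaid
p <- ggplot(raw, aes(x = genotype, y = temperature_c)) +
  geom_boxplot(outlier.shape = NA) +                  # hide outlier marks; jitter shows every point
  geom_jitter(width = 0.15, height = 0, alpha = 0.6) +
  labs(x = "Genotype", y = "Temperature (degrees C)")
p
```

Overlaying the points is a design choice that helps when groups are small, because a box alone can hide how few observations it summarizes.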
Now that you have frequency data of temperature per genotype, you can visualize the data using a bar plot (or, preferably, a stacked bar plot).
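To make the "frequency data" explicit, here is a minimal base R sketch that tabulates temperature grade by genotype and converts the counts to within-genotype proportions, which is what a filled stacked bar plot displays. The data frame is simulated in the same shape as the example data above, and the seed is arbitrary.

```r
set.seed(42)  # arbitrary seed, only for reproducibility

# Simulated data in the same shape as the example data
df <- data.frame(
  genotype = sample(seq_len(3), 10, replace = TRUE),
  temperature = sample(seq_len(7), 10, replace = TRUE)
)

# Counts of each temperature grade within each genotype
counts <- table(genotype = df$genotype, temperature = df$temperature)

# Row-wise proportions: each genotype's row sums to 1
props <- prop.table(counts, margin = 1)

counts
round(props, 2)
```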