Question: Box plot and genetic data
13 days ago, L_LANKA0 wrote:

Hi! I have a question about box plots and my data. I ran a Spearman correlation analysis and see a positive association between the gene and phenotypic expression. For example, for gene AAAA the genotypes are coded 1 and 3 for the homozygotes and 2 for the heterozygote. I also have temperature data, which I graded from 1 to 8 (1: 36.0-36.5; 2: 36.6-37.0, etc.). Can I show how genotype correlates with phenotype in a box plot? (For example, if a patient has genotype 2 (C/T), he/she can have a temperature grade from 4 to 6.)


Can you edit your post and include some example data? It's difficult to help without more specific information.

Ok, no problem! :) I made a table and a temperature coding key.

12 days ago, rpolicastro wrote:

Some example data.

```
df <- data.frame(
  patient = sprintf("C%s", seq_len(10)),
  genotype = sample(seq_len(3), 10, replace = TRUE),
  temperature = sample(seq_len(7), 10, replace = TRUE)
)

# First five rows; values differ between runs because sample() is random.
head(df, 5)
#>   patient genotype temperature
#> 1      C1        1           4
#> 2      C2        3           1
#> 3      C3        1           3
#> 4      C4        2           2
#> 5      C5        1           6
```

A boxplot using ggplot2.

```
library("tidyverse")

# Treat genotype as a categorical variable so each genotype gets its own box.
df %>%
  mutate(genotype = as_factor(genotype)) %>%
  ggplot(aes(x = genotype, y = temperature)) +
  geom_boxplot()
```

Or a stacked barplot as Hamid Ghaedi correctly pointed out.

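```
# Convert every column except patient to a factor, then plot the
# proportion of each temperature grade within each genotype.
df %>%
  mutate(across(!patient, as_factor)) %>%
  ggplot(aes(x = genotype, fill = temperature)) +
  geom_bar(position = "fill") +
  scale_fill_viridis_d()
```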

12 days ago, Hamid Ghaedi wrote:

Important notice: A box plot is usually used to show how a quantitative variable is distributed. It does an especially good job when the variance differs substantially between groups and you expect some outliers (as in gene expression data). Since you have discretized the temperature data, a box plot of the graded values no longer makes sense; however, you can plot your original temperature data as a box plot.
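To illustrate plotting the original values, here is a minimal sketch in base R. The data are simulated and purely hypothetical: the `raw_df` data frame, the `temperature_c` column, the 36.0-39.5 range and the 30 patients are all assumptions for the example, not the poster's measurements.

```r
# Hypothetical raw data: temperatures in degrees Celsius, NOT the 1-8 grades.
set.seed(42)  # make the simulated values reproducible
raw_df <- data.frame(
  patient = sprintf("C%s", seq_len(30)),
  genotype = factor(rep(1:3, each = 10)),
  temperature_c = round(runif(30, min = 36.0, max = 39.5), 1)
)

# One box per genotype, with the raw temperature on the y axis.
bp <- boxplot(
  temperature_c ~ genotype,
  data = raw_df,
  xlab = "Genotype",
  ylab = "Temperature (deg C)"
)
```

With real measurements, you would replace the simulated `temperature_c` column with the recorded temperatures and keep the rest unchanged.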

Now that you have frequency data of temperature per genotype, you can visualize it with a bar plot (or, preferably, a stacked bar plot).
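If it helps to see the frequency data itself, the counts and within-genotype proportions can be tabulated in base R. This is only a sketch on simulated data with the same integer coding as the example data in this thread; the `counts` and `props` names are made up for illustration.

```r
# Simulated data: integer-coded genotype (1-3) and temperature grade (1-7).
set.seed(1)  # make the random draw reproducible
df <- data.frame(
  genotype = sample(seq_len(3), 10, replace = TRUE),
  temperature = sample(seq_len(7), 10, replace = TRUE)
)

# Number of patients per genotype (rows) and temperature grade (columns).
counts <- table(genotype = df$genotype, temperature = df$temperature)

# Row-wise proportions: the share of each temperature grade within a genotype.
props <- prop.table(counts, margin = 1)
print(round(props, 2))
```

Each row of `props` sums to 1, which is what a stacked bar plot drawn with `position="fill"` displays as one full-height bar per genotype.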