Question: correlation between expression and tumor size
newbie wrote:

I have the following information in a data frame called data. The first two columns are the expression values of two genes and the third column is tumour size.

data:

structure(list(KRAS = c(1.84799690655495, 0.485426827170242, 
1.58496250072116, 0.925999418556223, 2.82781902461732, 1.53605290024021, 
1.37851162325373, 1.0703893278914, 0.765534746362977, 1.63226821549951, 
3.13750352374994, 0.84799690655495, 1, 0.137503523749935, 0.37851162325373, 
2.98185265328974, 1.13750352374994, 4.54225804976692, 1.53605290024021, 
4.53605290024021, 4.12928301694497, 0.84799690655495, 1.13750352374994, 
2.60880924267552, 1.13750352374994, 0.678071905112638, 0.765534746362977, 
0.84799690655495, 4.93545974780529, 0.584962500721156, 1.8073549220576, 
2.16992500144231, 1.13750352374994, 1.53605290024021, 1.32192809488736, 
1.72246602447109, 3.40599235967584, 1.72246602447109, 2.20163386116965, 
2.58496250072116, 0.584962500721156, 0.925999418556223, 1.0703893278914, 
1.37851162325373, 2.58496250072116, 0.765534746362977, 1.43295940727611, 
1.48542682717024, 2, 3.83794324189103), HRAS = c(2.88752527074159, 
2.88752527074159, 2.10433665981474, 0.925999418556223, 4.54843662469604, 
3.33628338786443, 3.30742852519225, 3.32192809488736, 1.58496250072116, 
4.41278152533848, 4.20945336562895, 3.92599941855622, 2.51096191927738, 
1.84799690655495, 3, 1.8073549220576, 3.01792190799726, 5.24412594328373, 
1.88752527074159, 5.6409679104499, 5.02680005934372, 3.877744249949, 
1.88752527074159, 2.13750352374994, 1.67807190511264, 0.925999418556223, 
2.48542682717024, 3.26303440583379, 5.95419631038687, 4.12928301694497, 
3.47248777146274, 3.91647664443772, 2.26303440583379, 3.96347412397489, 
0.678071905112638, 2.56071495447448, 4.65535182861255, 3.20163386116965, 
3.12101540096137, 3.62058641045188, 2.56071495447448, 1.32192809488736, 
1.84799690655495, 4.62643913669732, 3.91647664443772, 2.0703893278914, 
1.37851162325373, 1.48542682717024, 3.85798099512757, 4.12101540096137
), tumsize = c("6.5", "3", "3.5", "2.8", "1.3", "3.4", "2.4", 
"3.5", "5.7", "3.7", "4.5", "1.4", "3.6", "3.5", "5.5", "3", 
"3.4", "1.5", "5", "3", "1.7", "1.5", "1", "2.5", "3.3", "2.6", 
"1", "2.6", "0.5", "1.5", "2.5", "1.5", "2.3", "1.5", "3.6", 
"4.5", "3", "1.5", "4", "1.5", "2", "4", "5", "4.5", "2", "2.4", 
"2.5", "2.9", "5.2", "1.7")), row.names = c(1L, 2L, 3L, 4L, 5L, 
7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 
20L, 21L, 22L, 23L, 25L, 26L, 28L, 33L, 34L, 35L, 36L, 37L, 38L, 
39L, 40L, 41L, 42L, 43L, 44L, 45L, 46L, 47L, 48L, 49L, 50L, 51L, 
52L, 53L, 54L, 55L, 56L, 57L), class = "data.frame")

The correlation between the expression of the two genes was computed using Spearman correlation.

library(ggpubr)  # provides ggscatter() and stat_cor()

ggscatter(data, x = "KRAS", y = "HRAS",
          size = 0.3, combine = TRUE, ylab = "HRAS",
          palette = "jco", add = "reg.line", conf.int = TRUE) +
  stat_cor(method = "spearman")

[Figure: scatter plot of KRAS vs. HRAS expression with regression line and Spearman correlation]

I'm also interested in the correlation between KRAS expression and tumor size, but I can't get it to work. Is something wrong with this?

ggscatter(data, x = "KRAS", y = "tumsize", 
          size = 0.3,combine = TRUE, ylab = "Tumor size",
          palette = "jco",add = "reg.line", conf.int = TRUE) + 
  stat_cor(method = "spearman")

`geom_smooth()` using formula 'y ~ x'
There were 18 warnings (use warnings() to see them)

The resulting plot is shown below. Can anyone tell me how to check the correlation between expression and tumor size?

[Figure: scatter plot of KRAS expression vs. tumor size]

Kevin Blighe wrote:

Your question does not state exactly what you intend to do or where the problem lies.

First, you need to convert tumsize to numeric:

data$tumsize <- as.numeric(data$tumsize)
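A quick way to confirm the problem, and that the conversion worked, is to inspect the column class before and after. The sketch below is only an illustration: `toy` is a hypothetical data frame built from the first five rows of the question's data (KRAS rounded to three decimals), not the full data set.

```r
# Toy data: first five rows of the question's data (KRAS rounded)
toy <- data.frame(
  KRAS    = c(1.848, 0.485, 1.585, 0.926, 2.828),
  tumsize = c("6.5", "3", "3.5", "2.8", "1.3"),
  stringsAsFactors = FALSE
)

class(toy$tumsize)   # "character": plotted as a discrete variable

# as.numeric() returns NA (with a warning) for any value it cannot
# parse, so it is worth checking for NAs after converting
toy$tumsize <- as.numeric(toy$tumsize)

class(toy$tumsize)   # "numeric"
anyNA(toy$tumsize)   # FALSE
```

If the column were a factor rather than character, `as.numeric()` would return the internal level codes; converting via `as.numeric(as.character(...))` avoids that.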

For correlation, you just need:

cor(x = data[['KRAS']], y = data[['tumsize']], method = 'spearman')
cor.test(x = data[['KRAS']], y = data[['tumsize']], method = 'spearman')
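Both columns contain tied values, in which case `cor.test()` cannot compute an exact Spearman p-value and emits a warning. A minimal sketch, assuming the asymptotic p-value is acceptable, is to pass `exact = FALSE` explicitly. The vectors `x` and `y` below are illustrative only (the first eight rows of the question's data, rounded).

```r
# Illustrative subset: first eight rows of KRAS and tumsize (rounded)
x <- c(1.848, 0.485, 1.585, 0.926, 2.828, 1.536, 1.379, 1.070)
y <- c(6.5, 3.0, 3.5, 2.8, 1.3, 3.4, 2.4, 3.5)  # 3.5 occurs twice (a tie)

# exact = FALSE requests the asymptotic p-value, which avoids the
# "Cannot compute exact p-value with ties" warning
res <- cor.test(x, y, method = "spearman", exact = FALSE)

res$estimate  # Spearman's rho
res$p.value

# Spearman's rho is Pearson's r computed on the (average) ranks
rho_by_rank <- cor(rank(x), rank(y))
all.equal(unname(res$estimate), rho_by_rank)
```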

You can also build a linear regression model:

model <- lm(tumsize ~ KRAS, data = data)
summary(model)

Residuals:
    Min      1Q  Median      3Q     Max 
-2.2244 -0.8217 -0.2413  0.6913  3.5984 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   3.4527     0.3467   9.959  2.9e-13 ***
KRAS         -0.2982     0.1654  -1.803   0.0776 .  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.345 on 48 degrees of freedom
Multiple R-squared:  0.06344,   Adjusted R-squared:  0.04393 
F-statistic: 3.251 on 1 and 48 DF,  p-value: 0.07764

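Exponentiated coefficients with 95% confidence intervals (these are odds ratios only when the model is a logistic regression; for the linear model above, interpret the untransformed estimates):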

exp(cbind('OR' = coef(model), confint.default(model, level = 0.95)))

                    OR      2.5 %    97.5 %
(Intercept) 31.5853430 16.0098441 62.313779
KRAS         0.7421502  0.5366864  1.026273
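Because `model` is a linear regression, its coefficients are already on the scale of tumour size, so confidence intervals are usually read without exponentiating. A minimal sketch of obtaining them with `confint()`, using a hypothetical `toy` subset (the first eight rows of the question's data, rounded) rather than the full data set:

```r
# Illustrative subset: first eight rows of the question's data (rounded)
toy <- data.frame(
  KRAS    = c(1.848, 0.485, 1.585, 0.926, 2.828, 1.536, 1.379, 1.070),
  tumsize = c(6.5, 3.0, 3.5, 2.8, 1.3, 3.4, 2.4, 3.5)
)

fit <- lm(tumsize ~ KRAS, data = toy)

# t-based intervals (the usual choice for lm objects)
ci_t <- confint(fit, level = 0.95)

# Wald intervals based on the normal approximation, as used above;
# these are narrower than the t-based ones in small samples
ci_wald <- confint.default(fit, level = 0.95)

ci_t
ci_wald
```

The slope interval is the expected change in tumour size per unit increase in KRAS expression.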

Kevin


newbie replied:

Oh yes. I wonder how I missed checking this with str(data). Thanks a lot.
dunja.vucenovic wrote:

Hi, it looks like tumour size is stored as character rather than numeric.

If you run str(data), you will see the types of all variables in your data frame.

You can convert tumsize in the data frame directly with data$tumsize <- as.numeric(data$tumsize)
