
2.8 years ago

Vasu

I have RNA-Seq data for 300 samples, of which 250 are tumor and 50 are normal. I have a matrix with genes as rows and samples as columns.

There are almost 56k genes as rows. Among these genes there are also lncRNAs.

I would like to check the correlation between a specific lncRNA and all other protein-coding genes. I want the value of `R` (the correlation coefficient).

How to do this for one lncRNA vs all protein coding genes in the genome?

So, with this `cor`, how do I proceed further? I'm interested in doing Spearman correlation.

Set the `method` argument. See here: https://www.rdocumentation.org/packages/stats/versions/3.5.1/topics/cor

Sorry, I'm a bit confused. Let's say I have a matrix `A` like below, with Ensembl IDs as rows and samples as columns. Starting from raw counts, I used the `cpm` function to convert them to logCPM values like below.

Now, in this matrix I want to check the correlation of `ENSG00000000005.5` with all other Ensembl IDs. This is just example data. I have a single lncRNA and around 19k protein-coding genes with logCPM values. How do I apply the above function to this? And how do I plot that with the R value (correlation coefficient)?

my_cor <- apply(my_cpm, 1, function(x){cor(x, my_cpm["ENSG00000000005.5",], method = "spearman")})
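A vectorised variant of the same idea, as a minimal sketch: `cor` accepts a matrix as its first argument and correlates each of its columns with the second argument, so transposing the expression matrix avoids the `apply` loop. The toy matrix, the `gene1`..`gene5` IDs and the `protein_coding_ids` vector are made up for illustration; in practice the protein-coding IDs would come from your own annotation.

```r
set.seed(1)

# Toy logCPM matrix: 6 genes (rows) x 10 samples (columns); values are simulated
my_cpm <- matrix(rnorm(60), nrow = 6,
                 dimnames = list(c("ENSG00000000005.5", paste0("gene", 1:5)),
                                 paste0("sample", 1:10)))

lnc_id <- "ENSG00000000005.5"
protein_coding_ids <- paste0("gene", 1:5)  # hypothetical; take these from your annotation

# Expression profile of the lncRNA across samples
target <- my_cpm[lnc_id, ]

# Transpose so that genes become columns; cor() then correlates every column with target
my_cor <- cor(t(my_cpm[protein_coding_ids, , drop = FALSE]), target,
              method = "spearman")[, 1]

# Named vector of Spearman coefficients, strongest positive correlation first
my_cor <- sort(my_cor, decreasing = TRUE)
```

Subsetting to the protein-coding rows before calling `cor` also keeps the lncRNA itself (which would trivially give 1) out of the result.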

I don't think plotting the correlation coefs would be particularly revealing; but you can do it if you want
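If you do want a picture, a histogram of the coefficients is probably the simplest option, and pulling out the most strongly correlated genes may be more informative than the plot itself. A rough sketch, where `my_cor` is simulated here as a stand-in for the named vector of coefficients computed from your data:

```r
set.seed(1)

# Simulated stand-in for the named vector of Spearman coefficients (one per gene)
my_cor <- setNames(runif(200, min = -1, max = 1), paste0("gene", 1:200))

# Distribution of the coefficients across all genes
hist(my_cor, breaks = 40,
     xlab = "Spearman correlation with the lncRNA", main = "")

# Genes with the strongest positive and the strongest negative correlation
top_pos <- head(sort(my_cor, decreasing = TRUE), 10)
top_neg <- head(sort(my_cor), 10)
```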

Thanks a lot. I got the correlation coefficient values (R). These could tell whether the lncRNA has a strong, moderate or weak correlation with the other protein-coding genes. But I have a small question: what is R squared in correlation? What does R squared tell you?

Short description: if you have a single pair of variables (X and Y), then the `R^2` of a simple linear regression of Y on X and `r^2` (the square of the output of `cor`, using the default Pearson method) are the same. However, the power of R^2 comes into the picture in multiple linear regression problems, where multiple variables are used simultaneously to predict the response.

Reference:
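A quick way to convince yourself of the two-variable case, sketched on simulated data (`x` and `y` are made up here, and the equality holds for the Pearson coefficient, not for Spearman computed on the raw values):

```r
set.seed(1)

# Simulated pair of variables with a linear relationship plus noise
x <- rnorm(50)
y <- 2 * x + rnorm(50)

# Pearson correlation coefficient (the default method of cor)
r <- cor(x, y)

# R^2 from a simple linear regression of y on x
fit <- lm(y ~ x)
r_squared <- summary(fit)$r.squared

# TRUE: with a single predictor, R^2 is the squared Pearson correlation
all.equal(r^2, r_squared)
```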

Excerpt From: Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani. "An Introduction to Statistical Learning." iBooks.

You should maybe start by reading about statistics before going further into your analysis.