6.5 years ago
Devyani ▴ 50
I would like to know how to find the correlation between miRNA and gene expression data from TCGA. The miRNA and gene data in TCGA are in two different units: RPKM for genes and reads per million for miRNA.
Should both datasets be checked for normality before fitting lm?
As far as I know, TCGA data doesn't follow a normal distribution.
I don't think your data needs to be normally distributed for a linear regression.
Well, I was thinking that the TCGA data is transformed (normalized). My question is the other way round: if the data is normalized (assuming the TCGA data in use is normalized), can you still use lm?
TCGA expression data is generally normalized, though you can also get the count data. Linear regression doesn't require your data to be normally distributed (the normality assumption applies to the residuals, not to the variables themselves), so I think you can use linear regression. If you're uncomfortable using regression, you can use a non-parametric measure like Spearman's correlation, which doesn't depend on the shape of the distribution (given you have enough power).
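To illustrate the Spearman suggestion, here is a minimal sketch in plain Python. The expression values are made up for illustration and are not real TCGA data, and the helper names (`ranks`, `pearson`, `spearman`) are hypothetical. It computes Spearman's rho as the Pearson correlation of the ranks, and shows that a monotonic transform such as log2(x + 1) leaves rho unchanged, which is why the different units (RPKM vs. reads per million) don't matter for a rank-based correlation.

```python
import math

def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical values for one miRNA (reads per million) and one gene (RPKM)
# across six samples; NOT real TCGA data.
mirna_rpm = [120.0, 15.5, 980.2, 43.1, 310.7, 5.2]
gene_rpkm = [2.1, 6.3, 0.4, 9.8, 1.2, 12.5]

rho_raw = spearman(mirna_rpm, gene_rpkm)
rho_log = spearman([math.log2(v + 1) for v in mirna_rpm],
                   [math.log2(v + 1) for v in gene_rpkm])

print(rho_raw, rho_log)  # the two values are identical
```

This sketch leaves out the p-value. In practice you would probably use a library routine such as `scipy.stats.spearmanr`, which returns both the coefficient and a p-value, and you would need matched samples (the same patients in both the miRNA and gene tables).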