Hi all,
I'm looking to use some publicly available data from samples that have had proteomics and bulk RNAseq done on them. However, I'm having issues resolving the actual correlation and assigning the metadata.
For example, this is my workflow for the RNA file:
#Load data as csv. First column is gene ID with gene names going down, every other column
#has the sample name with normalized expression values. Remove duplicates in gene name (ID)
geneExpf="file.csv"
rna=read.table(geneExpf, as.is=TRUE, header=TRUE, sep=',', check.names=FALSE)
rna = distinct(rna, ID, .keep_all = TRUE)
#Re-label data as matrix and make expression values numeric
rna_matrix=as.matrix(as.numeric(rna[,2:ncol(rna)]))
#Make new matrix of rna genes present in protein set and remove duplicates
rna_common = as.matrix(rna[rna$ID %in% protein$ID,])
#Make correlation table for downstream use (corrplot, etc) and ignore NA values
cormatrix <- cor(rna, protein, use = "pairwise.complete.obs")
However, cor returns an error that reads:
'x' must be numeric
This happens even though I tried to make the expression values numeric. When I run typeof(rna[2,5]), which is an arbitrary gene expression value for a sample, it returns character and sometimes double depending on the file, though I can't figure out why.
Does anyone have any suggestions for how to fix this issue? I've spent a long time looking on StackExchange and Biostars at posts from people having the same issue, but few of them were reading from csv files, which I think might be where my mistake is.
Thanks for everyone's help. Greatly appreciated.
Can you show the result of str(rna) and str(rna_matrix)? You probably have some non-numeric entries in one of the columns, e.g. "NA" or something else that will be read in as a string rather than a number.
str(rna) reads:
whereas str(rna_matrix) is actually no longer working for me; it's saying:
Just for completeness' sake, try to understand what's happening:
EDIT: That being said, since str(rna) already indicated that all the relevant columns were already numeric, you wouldn't even need to enforce it via as.numeric().
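To make the points above concrete, here is a minimal sketch of how one might locate a non-numeric column and then build numeric matrices that cor() accepts. Everything in it is an assumption for illustration: the toy CSV text, the "n/a" placeholder, and the helper name to_numeric_matrix are hypothetical and not taken from the original files. It also assumes both tables share an ID column and sample names, and it uses base R's duplicated() in place of dplyr::distinct().

```r
# Toy stand-ins for the two CSV files (hypothetical data, not from the thread).
rna_csv <- "ID,S1,S2,S3,S4
GeneA,1.0,2.0,3.0,4.0
GeneB,2.0,n/a,6.0,8.0
GeneC,4.0,3.0,2.0,1.0
GeneD,5.0,1.0,7.0,2.0
GeneE,3.0,3.0,1.0,5.0
GeneA,9.0,9.0,9.0,9.0"
protein_csv <- "ID,S1,S2,S3,S4
GeneA,2.0,4.0,6.0,8.0
GeneB,1.0,2.0,3.0,4.0
GeneC,3.0,1.0,4.0,2.0
GeneD,6.0,2.0,5.0,3.0
GeneF,7.0,8.0,9.0,1.0"

rna <- read.csv(text = rna_csv, check.names = FALSE, stringsAsFactors = FALSE)
protein <- read.csv(text = protein_csv, check.names = FALSE, stringsAsFactors = FALSE)

# A single stray string ("n/a" here) turns the whole column into character.
# List the sample columns that were not read as numeric.
non_numeric <- names(rna)[-1][!sapply(rna[-1], is.numeric)]
print(non_numeric)

# Hypothetical helper: drop duplicated IDs (keeping the first occurrence),
# convert each sample column separately, and use the IDs as row names.
# Entries that cannot be parsed as numbers become NA.
to_numeric_matrix <- function(df) {
  df <- df[!duplicated(df$ID), ]
  cols <- lapply(df[-1], function(col) suppressWarnings(as.numeric(col)))
  m <- do.call(cbind, cols)
  rownames(m) <- df$ID
  m
}

rna_m <- to_numeric_matrix(rna)
protein_m <- to_numeric_matrix(protein)

# Restrict both matrices to shared genes and samples, in the same order.
genes <- intersect(rownames(rna_m), rownames(protein_m))
samples <- intersect(colnames(rna_m), colnames(protein_m))

# cor() on two matrices correlates columns of x with columns of y,
# so this gives an RNA-sample by protein-sample correlation matrix.
cormatrix <- cor(rna_m[genes, samples], protein_m[genes, samples],
                 use = "pairwise.complete.obs")
print(round(cormatrix, 2))
```

Note that cor() is given the numeric matrices here rather than the data frames that still contain the ID column. If the stray strings in a real file are a known placeholder, passing it to read.csv via na.strings would keep the column numeric at read time.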