A colleague told me I have to collapse the probesets to genes. I tried to do it with the collapseRows() function, but I'm getting an error.
Here's my function call:
result <- collapseRows(datET=combat_edata, rowGroup=unique(combat_edata_genes), rowID=rownames(combat_edata), method="function", methodFunction=colMeans)
The combat_edata dataframe is the output of the ComBat() function that I used for batch correction, while the combat_edata_genes list holds the genes associated with combat_edata, retrieved from the hsapiens_gene_ensembl mart through the getBM() function.
As you can see, I used combat_edata as my main input data, set rowGroup to the list of unique genes, and set rowID to the row names of the dataframe (the features of the GEO dataset).
My call to collapseRows() returns an error:
Error: rowGroup and rowID not the same length
Which is true. Here are the lengths of the variables:
length(rownames(combat_edata)): 33,297
length(unique(combat_edata_genes)): 27,217
length(combat_edata_genes): 47,127
The number of rows of my dataframe (the features of the GEO dataset) and the number of genes are different, and I don't know how to handle this.
How can I collapse the probesets to genes correctly?
EDIT: I don't necessarily have to use collapseRows(); if you know another method and can explain to me how to use it easily, you're welcome to propose it. Thanks.
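
To make the length mismatch concrete, here is a minimal sketch on toy data of what I understand the per-row alignment to look like. Everything in it is an assumption for illustration: expr stands in for combat_edata, annot stands in for a getBM() result with one row per probe/gene pair (which is presumably why combat_edata_genes is longer than the number of rows), and the column names probe and gene are hypothetical. It uses base R only, keeps the first gene listed for each probe, drops unannotated probes, and averages the probes of each gene.

```r
# Toy stand-in for combat_edata: 5 probesets x 3 samples (made-up values).
expr <- matrix(c( 1,  2,  3,
                  3,  4,  5,
                 10, 10, 10,
                  7,  8,  9,
                  0,  0,  0),
               nrow = 5, byrow = TRUE,
               dimnames = list(paste0("probe", 1:5), paste0("s", 1:3)))

# Toy stand-in for the getBM() result (hypothetical column names).
# probe1 maps to two genes; probe5 has no gene at all.
annot <- data.frame(probe = c("probe1", "probe1", "probe2", "probe3", "probe4"),
                    gene  = c("GENE_A", "GENE_X", "GENE_A", "GENE_B", "GENE_C"),
                    stringsAsFactors = FALSE)

# Simplifying assumption: keep only the first gene listed for each probe.
annot1 <- annot[!duplicated(annot$probe), ]

# One gene label per row of expr, in row order; NA where a probe is unannotated.
row_gene <- annot1$gene[match(rownames(expr), annot1$probe)]

# Drop unannotated probes so the expression rows and labels stay aligned.
keep      <- !is.na(row_gene)
expr_kept <- expr[keep, , drop = FALSE]
gene_kept <- row_gene[keep]

# Per-gene mean: sum the probes of each gene, then divide by the probe count.
sums       <- rowsum(expr_kept, group = gene_kept)
counts     <- as.vector(table(gene_kept)[rownames(sums)])
gene_means <- sums / counts
```

With vectors aligned like this, length(gene_kept) equals nrow(expr_kept), so I would expect collapseRows(datET=expr_kept, rowGroup=gene_kept, rowID=rownames(expr_kept), method="function", methodFunction=colMeans) to get past the length check, although I have not verified that on the real data. Keeping only the first gene per probe is an arbitrary choice; probes that map to several genes could also be dropped entirely.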