Aggregate data frame using function with a logical criterion
2.6 years ago
friasoler ▴ 30

I would like to detect which elements in the first column (Gene) have ambiguous values in the second column (log2 fold change, l2fch), that is, values with different signs. Note that "Golgi integral membrane protein 4-like" has log2 fold changes of different sign, but "protein HEG homolog 1" does not. I would like a final data frame with the mean log2 fold change for the ambiguous genes and NA for the rest. With the code below, I always get the mean regardless of whether a gene is ambiguous.

df:
Gene                                     l2fch
Golgi integral membrane protein 4-like   0.308
Golgi integral membrane protein 4-like   -0.35
protein HEG homolog 1                    -2.92
protein HEG homolog 1                    -5.92
centlein                                 -1.4760831106834
HAUS augmin-like complex subunit 6       0.319711425528765

Code:

df2=aggregate(.~Gene,df, function(h){if (max(h)*min(h)>0) mean(h) else NA})
R aggregate • 397 views

I added code markup to your post for increased readability. You can do this by selecting the text and clicking the 101010 button, which is in the toolbar when you compose or edit a post.


You're computing max(h). Have you tried looking at what h actually is? I have a hunch it might be a set of rows/vectors and not a single vector of l2fch values. You might need to use max(h$l2fch) or max(h[2]) or something along those lines.
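
A minimal sketch of how one might check this and get the requested result, assuming l2fch is stored as numeric (worth confirming with str(df)). As far as I know, aggregate() applies the function to each column separately, so h should already be a plain vector here and h$l2fch would fail. The condition in the question, max(h)*min(h) > 0, is true when all values share a sign, which is the opposite of the stated goal, so the sketch tests for mixed signs instead. The helper name mean_if_mixed_sign is made up for illustration.

```r
# Example data rebuilt from the question (values copied from the post)
df <- data.frame(
  Gene = c("Golgi integral membrane protein 4-like",
           "Golgi integral membrane protein 4-like",
           "protein HEG homolog 1",
           "protein HEG homolog 1",
           "centlein",
           "HAUS augmin-like complex subunit 6"),
  l2fch = c(0.308, -0.35, -2.92, -5.92, -1.4760831106834, 0.319711425528765),
  stringsAsFactors = FALSE
)

# Inspect what the function receives: this reports the class of h for each gene
aggregate(l2fch ~ Gene, data = df, FUN = function(h) class(h))

# Hypothetical helper: return the mean only when the group has both
# negative and positive values, otherwise NA
mean_if_mixed_sign <- function(h) {
  if (min(h) < 0 && max(h) > 0) mean(h) else NA_real_
}

df2 <- aggregate(l2fch ~ Gene, data = df, FUN = mean_if_mixed_sign)
print(df2)
```

With the example data, only "Golgi integral membrane protein 4-like" gets a mean (-0.021); the other three genes get NA, including the genes that appear only once.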
