glmnet package lasso error
6.3 years ago

Hello I have been trying to run a lasso analysis using glmnet with the help of this tutorial:

I have turned both my X and Y tables into matrices and I have no idea why it is still not working.

This is what my code looks like:

library(glmnet)
muscleY1 <- as.matrix(muscleY)
is.matrix(muscleY1)
as.matrix(muscleX)
muscleX1 <- as.matrix(muscleX)
is.matrix(muscleX1)
##

CV = cv.glmnet(x=muscleX1, y=muscleY1, family= "gaussian", type.measure = "class", alpha = 1, nlambda = 100)
##

plot(CV)
##

fit = (glmnet(x=muscleX.xlsx, y=muscleY.xlsx, family= "poisson", alpha=1, lambda=CV$lambda.1se)

##

fit$beta[,1]

I'm getting the following message when I run the 8th line:

Error in fishnet(x, is.sparse, ix, jx, y, weights, offset, alpha, nobs, :
  NA/NaN/Inf in foreign function call (arg 4)
In addition: Warning messages:
1: In storage.mode(y) = "double" : NAs introduced by coercion
2: In fishnet(x, is.sparse, ix, jx, y, weights, offset, alpha, nobs, :
  NAs introduced by coercion

I have tried removing the names of my patients from the columns, because I assumed that letters were not allowing it to be interpreted as a matrix, leaving only numbers, and now I'm getting this error instead:

In cv.fishnet(list(list(a0 = c(1.09743483746064, 1.08940083327634, : Only 'deviance', 'mse' or 'mae' available for Poisson models; 'deviance' used

Can anyone tell what's wrong?

glmnet lasso multilinear regression • 17k views

Provide a reproducible example (maybe a very small subset of your data). Otherwise, read, read, read as if there is no tomorrow.


When you call as.matrix() on a data.frame that contains character columns, every value is converted to character type. That is why you get the warnings about NAs introduced by coercion, followed by an error: your matrices end up full of NAs. The second warning appears because type.measure = "class" is not available for a Poisson model; only "deviance", "mse" or "mae" are, so "deviance" is used instead. Also, in both glmnet() and cv.glmnet(), the lambda parameter should be a sequence of values for lambda.
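To see the coercion problem in isolation, here is a minimal base-R sketch. The data frame, its column names and its values are made up for illustration and are not the poster's data; the point is only what as.matrix() does when one column holds text.

```r
# Hypothetical toy data: a patient-name column next to a numeric column.
df <- data.frame(patient = c("A", "B", "C"),
                 zscore  = c(3.13, 5.01, 2.46))

# One character column is enough to make the whole matrix character.
m_bad <- as.matrix(df)
is.character(m_bad)

# Forcing the matrix to numeric (as glmnet does internally) turns the
# names into NA; the coercion warning is silenced here on purpose.
m_num <- m_bad
suppressWarnings(storage.mode(m_num) <- "double")
anyNA(m_num)

# Keep only the numeric columns before converting.
m_ok <- as.matrix(df[sapply(df, is.numeric)])
is.numeric(m_ok)
anyNA(m_ok)
```

Checking is.numeric() and anyNA() on both matrices before calling glmnet is a cheap way to catch this early.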


Thank you for the replies.

This is what my data looks like:

muscleX1

 ;Mzscore+3
1;3.127704
2;5.014803
3;2.459868
4;2.186171
5;1.386021
6;2.476046
7;3.417171
8;2.358869
9;3.411687
10;5.377427

muscleY1

;ENSG00000000003;ENSG00000000419;ENSG00000000457;ENSG00000000460;ENSG00000000938;ENSG00000000971;ENSG00000001036;ENSG00000001084;ENSG00000001167;ENSG00000001460;ENSG00000001461;ENSG00000001497;ENSG00000001561;ENSG00000001617;ENSG00000001629;ENSG00000001630;ENSG00000001631;ENSG00000002016;ENSG00000002079
    1;276;1572;1480;585;276;380;2179;1595;3744;1117;693;2218;1446;73;5084;2869;619;998;100
    2;502;1369;2256;793;502;515;3312;4926;4420;2226;1698;3545;2872;114;8790;1735;131;1606;382
    3;367;2366;1862;392;367;429;5933;2311;4241;3022;1222;6031;3488;1082;4689;4260;353;1697;200
    4;402;1095;1677;612;402;750;3543;3091;3986;1777;731;2115;2932;83;4356;2668;530;1189;130
    5;669;2058;1950;744;669;488;2451;1649;3286;1327;1448;789;3120;112;3931;1289;293;1098;1092
    6;383;1839;1742;661;383;286;8124;3105;1748;2595;1536;6146;2481;98;6196;2889;318;1339;379
    7;569;5947;2768;540;569;754;11290;5267;12857;979;680;5010;4631;150;13457;1699;124;649;275
    8;573;2375;1857;403;573;461;3679;2901;2013;1370;750;2135;3948;102;3744;2860;416;1168;259
    9;1322;1713;1035;1229;1322;460;2292;1762;12873;3550;615;4573;2877;531;7249;4194;357;2173;798
    10;1332;6895;2018;839;1332;214;2293;383;286;8124;383;286;8124;383;286;8124;383;286;8124

because only deviance is acceptable as argument to type.measure

What is deviance? From a brief read, is it a unit I need to transform my data into? Sorry, I'm a bio undergrad, so I know very little about statistics.
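On the deviance question: it is not a unit to transform the data into. It is a measure of how badly a fitted model matches the data, where lower is better; for a Gaussian model it equals the residual sum of squares. The sketch below uses base R's glm() on made-up counts (not the data above) to show the Poisson deviance computed by hand next to the value R reports.

```r
# Hypothetical toy counts, chosen only for illustration.
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2, 3, 6, 7, 8, 12)

# Ordinary (unpenalized) Poisson regression from base R.
fit <- glm(y ~ x, family = poisson)
mu  <- fitted(fit)

# Poisson deviance by hand; every y is > 0 here, so log(y / mu) is defined.
dev_manual <- 2 * sum(y * log(y / mu) - (y - mu))

# Compare with the deviance reported by R.
all.equal(dev_manual, deviance(fit))
```

With type.measure = "deviance", cv.glmnet reports this kind of quantity on the held-out folds, so nothing has to be done to the input data.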

Also in both glmnet() and cv.glmnet(), the lambda parameter should be a sequence of values for lambda.

This I think I understand. You mean

CV = cv.glmnet(x=muscleX1, y=muscleY1, family= "gaussian", type.measure = "class", alpha = 1, nlambda = 100)

correct? I did find it odd that only 100 was provided as lambda; I assumed it would use 100 different lambda values or something. Should it be like

CV = cv.glmnet(x=muscleX1, y=muscleY1, family= "gaussian", type.measure = "class", alpha = 1, nlambda = 0.001, 0.01, 0,1, 1, 10, 100)

then?
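Not quite: nlambda takes a single number (how many lambda values glmnet should generate on its own), while an explicit sequence goes into the separate lambda argument. A minimal sketch of both forms follows, assuming the glmnet package is installed; the simulated x and y and the name lambda_grid are hypothetical stand-ins, and "mse" is used because "class" is not a valid measure for a Gaussian model.

```r
library(glmnet)

set.seed(1)
# Hypothetical simulated data: 50 samples, 5 predictors, one numeric response.
# (glmnet requires x to have at least two columns.)
x <- matrix(rnorm(50 * 5), nrow = 50, ncol = 5)
y <- 2 * x[, 1] + rnorm(50)

# nlambda is a count: glmnet builds its own path of up to 100 values.
cv_auto <- cv.glmnet(x, y, family = "gaussian", type.measure = "mse",
                     alpha = 1, nlambda = 100)

# lambda is the argument that accepts an explicit sequence of values.
lambda_grid <- c(100, 10, 1, 0.1, 0.01, 0.001)
cv_grid <- cv.glmnet(x, y, family = "gaussian", type.measure = "mse",
                     alpha = 1, lambda = lambda_grid)

# The cross-validated choice is one of the supplied values.
cv_grid$lambda.min

# Coefficients at the more heavily regularized "1se" choice.
coef(cv_grid, s = "lambda.1se")
```

Letting glmnet pick the path with nlambda is usually the simpler option; a hand-written grid mainly makes sense when specific values need to be compared.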

Otherwise, read, read, read as if there is no tomorrow.

Thank you, I'm looking into it!
