Question: Optimize the parameter span for loess (or spar for smooth.spline)
6.8 years ago
wenbo • 0

Hi,

Does anyone know much about parameter optimization for loess or smooth.spline?

In loess and smooth.spline, the parameter span (spar for smooth.spline) is very important for the fit. I tried the following method to optimize spar for smooth.spline:

tuneSpline = function(x,y,span.vals=seq(0.1,1,by=0.05),fold=10){
mae <- numeric(length(span.vals))
fun.fit <- function(x,y,span) {smooth.spline(x = x,y = y,spar = span)}
fun.predict <- function(fit,x0) {predict(fit,x0)$y}
ii = 0
for(span in span.vals){
ii <- ii+1
y.cv <- crossval(x,y,fun.fit,fun.predict,span=span,ngroup = fold)$cv.fit
fltr <- !is.nay.cv)
save(fltr,y.cv,y,file="tmp.rda")
mae[ii] <- mean(abs(y[fltr]-y.cv[fltr]))
}
span <- span.vals[which.min(mae)]
return(span)
}
require(graphics)
attach(cars)
tuneSpline(speed,dist,fold = length(dist))
## return 0.1

But the spar selected by this method is always 0.1 (the minimum value in span.vals).

It's weird and I think the result may not be right.

Best regards!

XianWu

loess smooth.spline R • 5.2k views

Your code contains a couple of typos, namely a missing argument 'y' on the third line and a missing opening bracket on the line with the 'fltr' variable assignment. After fixing these, I ran the code and got 0.75 as the optimal value.

PS: I don't see how this question relates to bioinformatics; it's more suitable for StackOverflow.


Pay attention to what you're doing. Why would you want to 'tune' the span of loess?

loess is used to draw a line that doesn't exactly hit all the points. If you measure accuracy as how far the points are from the line, then it follows that the optimal span approaches zero: you've connected all the dots and made a zigzag. People use loess because they want a smooth curve that may miss points with assumed error, so the optimal value is a subjective matter of simplicity, or perhaps of the 2nd derivative (curve sharpness).
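The point about in-sample error can be checked directly. Below is a minimal sketch, assuming the built-in cars data and smooth.spline rather than loess (the names spar.vals and train.rss are illustrative, not from the thread). The residual sum of squares on the training data itself shrinks as spar gets smaller, so minimizing it would tend to pick the least smoothing available.

```r
# Training (in-sample) error of smooth.spline over a grid of spar values.
# Assumption: built-in 'cars' data; the grid mirrors the one used in the question.
spar.vals <- seq(0.1, 1, by = 0.05)

train.rss <- sapply(spar.vals, function(s) {
  fit <- smooth.spline(cars$speed, cars$dist, spar = s)
  # Residuals are computed on the same points used for fitting.
  sum((cars$dist - predict(fit, cars$speed)$y)^2)
})

# Compare the error at the least and the most smoothing in the grid.
data.frame(spar = range(spar.vals),
           rss  = train.rss[c(1, length(train.rss))])
```

Cross-validation instead scores each point with a fit that never saw it, so a wiggly fit is penalized for predicting held-out points badly.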


It's a valid method of non-parametric estimation because of the use of cross-validation. See e.g. http://www.uab.ro/auajournal/upload/49_539_Nicoleta_Breaz-2.pdf for details.


If you run smooth.spline with the same data but without passing a spar parameter, is the fitted smoothing parameter that R returns less than 0.1?

Where did you get crossval from? The params for crossval::crossval are (predfun, X, Y, K, B, verbose, ...) so your ordering doesn't match that

6.8 years ago
russhh 5.6k

With a working implementation, your approach works fine.

tuneSpline = function(x,y,span.vals=seq(0.1,1,by=0.05),fold=10){
  require(bootstrap)
  fun.fit <- function(x,y,span) {smooth.spline(x = x,y = y,spar = span)}
  fun.predict <- function(fit,x0) {predict(fit,x0)$y}
  mae <- sapply(span.vals, function(span){
    y.cv <- bootstrap::crossval(
      x,y,fun.fit,fun.predict,span=span,ngroup = fold
    )$cv.fit
    fltr <- which(!is.na(y.cv))
    mean(abs(y[fltr]-y.cv[fltr]))
  })
  span.vals[which.min(mae)]
}

attach(cars)
tuneSpline(speed,dist,fold = length(dist))

# 0.75
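The question also asks about the span of loess. Here is a rough sketch of the same leave-one-out idea using base R only, assuming the built-in cars data. tuneLoess and its default grid are hypothetical and not part of any package. With loess's default surface, predictions outside the range of the training x come back as NA, so those held-out points are dropped from the average.

```r
# Leave-one-out cross-validation for the loess span (base R only).
# Assumption: 'tuneLoess' and the default grid are illustrative choices.
tuneLoess <- function(x, y, span.vals = seq(0.5, 1, by = 0.05)) {
  mae <- sapply(span.vals, function(s) {
    # Predict each point from a fit to all the other points.
    pred <- sapply(seq_along(x), function(i) {
      train <- data.frame(x = x[-i], y = y[-i])
      fit <- loess(y ~ x, data = train, span = s)
      predict(fit, newdata = data.frame(x = x[i]))
    })
    # Held-out points outside the training range give NA and are skipped.
    mean(abs(y - pred), na.rm = TRUE)
  })
  span.vals[which.min(mae)]
}

best.span <- tuneLoess(cars$speed, cars$dist)
best.span
```

The grid starts at 0.5 rather than 0.1 because cars has many tied speed values, and very small spans can leave too few distinct x values in a local neighbourhood.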


Do you think this method is good?


When I ran the code below:
smooth.spline(speed,dist)$spar

I got a spar value of 0.7801305, which is very close to the value from tuneSpline. It looks like smooth.spline can do the optimization itself and select the best spar value for the fit.
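This matches the documented behaviour of smooth.spline. When neither spar nor df is supplied, it chooses the smoothing parameter itself, by generalized cross-validation (GCV) by default or by ordinary leave-one-out cross-validation with cv = TRUE. Below is a small sketch on the built-in cars data (the variable names are illustrative):

```r
# Let smooth.spline pick the smoothing parameter on its own.
# Default (cv = FALSE): generalized cross-validation (GCV).
fit.gcv <- smooth.spline(cars$speed, cars$dist)

# cv = TRUE: ordinary leave-one-out cross-validation.
# R warns here because 'cars' has tied speed values, so the warning is silenced.
fit.cv <- suppressWarnings(smooth.spline(cars$speed, cars$dist, cv = TRUE))

# Selected spar under each criterion, plus the equivalent degrees of freedom.
c(gcv = fit.gcv$spar, loocv = fit.cv$spar)
c(gcv = fit.gcv$df,   loocv = fit.cv$df)
```

The two criteria need not agree exactly, and neither has to match a coarse grid search such as the 0.05 steps used in tuneSpline.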