【发布时间】:2014-06-06 05:28:21
【问题描述】:
当我在不注释gr.ascent(MMSE, 0.5, verbose=TRUE) 的情况下运行以下代码时,我收到此错误Error in b1 * x : 'b1' is missing,但是当我注释该行时,我在使用这些参数MMSE(2,1,farmland$farm,farmland$area) 测试MMSE 时收到以下错误。你知道我的问题出在哪里吗?
Error in if (abs(t[i]) <= k) { : missing value where TRUE/FALSE needed
这是我的代码:
farmland <- read.csv("FarmLandArea.csv")
str(farmland)
fit=lm(farm~land,data=farmland)
mean.squared.residuals <- sum((lm(farm~land,data=farmland)$residuals)^2)/(length(farmland$farm)-2)
#gradient descent
#things I should possibly use: solve(t(X)%*%X, t(X)%*%y)
gr.ascent<- function(df, x0, alpha=0.2, eps=0.001, max.it = 50, verbose = FALSE){
X1 <- x0
cond <- TRUE
iteration <- 0
if(verbose) cat("X0 =",X1,"\n")
while(cond){
iteration <- iteration + 1
X0 <- X1
X1 <- X0 + alpha * df(X0)
cond <- sum((X1 - X0)^2) > eps & iteration < max.it
if(verbose) cat(paste(sep="","X",iteration," ="), X1, "\n")
}
return(X1)
}
k=19000
#rho <- function(t, k=19000){
# for (i in seq(1,length(t))){
# if (abs(t[i]) <= k)
# return(t[i]^2)
# else
# return(2*k*abs(t[i])-k^2)
# }
#}
#nicer implementation of rho. ifelse works on vector
rho<-function(t,k) ifelse(abs(t)<=k,t^2,(2*k*abs(t))-k^2)
rho.prime <- function(t, k=19000){
out <- rep(NA, length(t))
for (i in seq(1,length(t))){
if (abs(t[i]) <= k)
{ print(2*t[i])
out[i] <- 2*t[i]
}
else
{
print(2*k*sign(t[i]))
out[i] <- 2*k*sign(t[i])
}
}
return(out)
}
MMSE <- function(b0, b1, y=farmland$farm, x=farmland$land){
# Calls rho.prime() here with argument y-b0-b1*x
#Why should we call rho.prime? in the html page you have used rho!?
n = length(y)
total = 0
for (i in seq(1,n)) {
#total = total + rho(t,k)*(y[i]-b0-b1*x[i])
total = total + rho.prime(y-b0-b1*x,k)*(y[i]-b0-b1*x[i])
}
return(total/n)
}
gr.ascent(MMSE(1,2), 0.5, verbose=TRUE)
其中FarmLand csv数据如下:
state,land,farm
Alabama,50744,14062
Alaska,567400,1375
Arizona,113635,40781
Arkansas,52068,21406
California,155959,39688
Colorado,103718,48750
Connecticut,4845,625
Delaware,1954,766
Florida,53927,14453
Georgia,57906,16094
Hawaii,6423,1734
Idaho,82747,17812
Illinois,55584,41719
Indiana,35867,23125
Iowa,55869,48125
Kansas,81815,72188
Kentucky,39728,21875
Louisiana,43562,12578
Maine,30862,2109
Maryland,9774,3203
Massachusetts,7840,812
Michigan,58110,15625
Minnesota,79610,42031
Mississippi,46907,17422
...
这是 dput(farmland) 的结果:
> dput(farmland)
structure(list(state = structure(1:50, .Label = c("Alabama",
"Alaska", "Arizona", "Arkansas", "California", "Colorado", "Connecticut",
"Delaware", "Florida", "Georgia", "Hawaii", "Idaho", "Illinois",
"Indiana", "Iowa", "Kansas", "Kentucky", "Louisiana", "Maine",
"Maryland", "Massachusetts", "Michigan", "Minnesota", "Mississippi",
"Missouri", "Montana", "Nebraska", "Nevada", "New Hampshire",
"New Jersey", "New Mexico", "New York", "North Carolina", "North Dakota",
"Ohio", "Oklahoma", "Oregon", "Pennsylvania", "Rhode Island",
"South Carolina", "South Dakota", "Tennessee", "Texas", "Utah",
"Vermont", "Virginia", "Washington", "West Virginia", "Wisconsin",
"Wyoming"), class = "factor"), land = c(50744L, 567400L, 113635L,
52068L, 155959L, 103718L, 4845L, 1954L, 53927L, 57906L, 6423L,
82747L, 55584L, 35867L, 55869L, 81815L, 39728L, 43562L, 30862L,
9774L, 7840L, 58110L, 79610L, 46907L, 68886L, 145552L, 76872L,
109826L, 8968L, 7417L, 121356L, 47214L, 48711L, 68976L, 40948L,
68667L, 95997L, 44817L, 1045L, 30109L, 75885L, 41217L, 261797L,
82144L, 9250L, 39594L, 66544L, 24230L, 54310L, 97105L), farm = c(14062L,
1375L, 40781L, 21406L, 39688L, 48750L, 625L, 766L, 14453L, 16094L,
1734L, 17812L, 41719L, 23125L, 48125L, 72188L, 21875L, 12578L,
2109L, 3203L, 812L, 15625L, 42031L, 17422L, 45469L, 95000L, 71250L,
9219L, 734L, 1141L, 67500L, 10938L, 13438L, 61875L, 21406L, 55000L,
25625L, 12109L, 109L, 7656L, 68281L, 17031L, 203750L, 17344L,
1906L, 12578L, 23125L, 5703L, 23750L, 47188L)), .Names = c("state",
"land", "farm"), class = "data.frame", row.names = c(NA, -50L
))
【问题讨论】:
-
您的错误似乎在指定的行中:
if (abs(t[i]) <= k)。看来您的比较返回NA而不是TRUE/FALSE之一。您可能会遇到类似于:if(NA < 1) "do this"的类似错误。 -
但是为什么 NA 会在这里发生呢?你能再解释一下吗?
-
你能把'dput(farmland)'的结果粘贴到你的问题中吗?
-
您的代码看起来像是被混淆了...这里有两个建议:最好将一些抛出错误的行注释掉并希望其余代码能够正常工作;设置
options(error=recover)在发现错误时触发调试器,这样您可以更轻松地调查和理解问题。
标签: r function statistics regression gradient-descent