我在此提出两个使用标准 for 语法的解决方案,两者都使用来自 Gábor Csárdi 和 Rich FitzJohn 的出色包 progress
- 1) 我们可以临时或本地覆盖
for 函数以环绕base::for 并支持进度条。
- 2) 我们可以定义未使用的
for<-,并使用语法pb -> for(it in seq) {exp} 环绕base::for,其中pb 是使用progress::progress_bar$new() 构建的进度条。
两种解决方案都作为标准调用:
- 上一次迭代中更改的值可用
- 发生错误时,修改后的变量将具有错误前的值
我打包了我的解决方案,并将在下面进行演示,然后将通过代码
用法
#devtools::install_github("moodymudskipper/pbfor")
library(pbfor)
使用pb_for()
默认情况下pb_for() 将覆盖for 函数仅运行一次。
pb_for()
for (i in 1:10) {
# DO SOMETHING
Sys.sleep(0.5)
}
使用来自progress::progress_bar$new()的参数:
pb_for(format = "Working hard: [:bar] :percent :elapsed",
callback = function(x) message("Were'd done!"))
for (i in 1:10) {
# DO SOMETHING
Sys.sleep(0.5)
}
使用for<-
与标准的for 调用相比,唯一的限制是第一个参数必须存在并且不能是NULL。
i <- NA
progress_bar$new() -> for (i in 1:10) {
# DO SOMETHING
Sys.sleep(0.5)
}
我们可以定义一个自定义进度条,并且可以方便地在初始化脚本或 R 配置文件中定义它。
pb <- progress_bar$new(format = "Working hard: [:bar] :percent :elapsed",
callback = function(x) ("Were'd done!"))
pb -> for (i in 1:10) {
# DO SOMETHING
Sys.sleep(0.5)
}
对于嵌套进度条,我们可以使用以下技巧:
pbi <- progress_bar$new(format = "i: [:bar] :percent\n\n")
pbj <- progress_bar$new(format = "j: [:bar] :percent ")
i <- NA
j <- NA
pbi -> for (i in 1:10) {
pbj -> for (j in 1:10) {
# DO SOMETHING
Sys.sleep(0.1)
}
}
请注意,由于运算符优先级,调用 for<- 并受益于 for 调用语法的唯一方法是使用从左到右的箭头 ´->´。
它们的工作原理
pb_for()
pb_for() 在其父环境中创建一个for 函数对象,然后是新的for:
- 设置进度条
- 修改循环内容
- 在循环内容表达式的末尾添加
`*pb*`$tick()
- 在干净的环境中将其反馈给
base::`for`
- 在退出时将所有修改或创建的变量分配给父环境。
- 如果
once 是TRUE(默认值)则删除自己
覆盖操作符通常很敏感,但它会自行清理,如果在函数中使用不会影响全局环境,所以我认为使用起来足够安全。
for<-
这种方法:
- 不会覆盖
for
- 允许使用进度条模板
- 有一个可以说更直观的 api
但是它有一些缺点:
它的作用:
- 使用辅助函数查找第一个参数的名称
- 克隆进度条输入
- 编辑它以考虑循环的迭代次数(
for<- 的第二个参数的长度
在此之后,它类似于上面部分中为 pb_for() 描述的内容。
代码
pb_for()
pb_for <-
function(
# all args of progress::progress_bar$new() except `total` which needs to be
# infered from the 2nd argument of the `for` call, and `stream` which is
# deprecated
format = "[:bar] :percent",
width = options("width")[[1]] - 2,
complete = "=",
incomplete = "-",
current =">",
callback = invisible, # doc doesn't give default but this seems to work ok
clear = TRUE,
show_after = .2,
force = FALSE,
# The only arg not forwarded to progress::progress_bar$new()
# By default `for` will self detruct after being called
once = TRUE) {
# create the function that will replace `for`
f <- function(it, seq, expr){
# to avoid notes at CMD check
`*pb*` <- IT <- SEQ <- EXPR <- NULL
# forward all arguments to progress::progress_bar$new() and add
# a `total` argument computed from `seq` argument
pb <- progress::progress_bar$new(
format = format, width = width, complete = complete,
incomplete = incomplete, current = current,
callback = callback,
clear = clear, show_after = show_after, force = force,
total = length(seq))
# using on.exit allows us to self destruct `for` if relevant even if
# the call fails.
# It also allows us to send to the local environment the changed/created
# variables in their last state, even if the call fails (like standard for)
on.exit({
vars <- setdiff(ls(env), c("*pb*"))
list2env(mget(vars,envir = env), envir = parent.frame())
if(once) rm(`for`,envir = parent.frame())
})
# we build a regular `for` loop call with an updated loop code including
# progress bar.
# it is executed in a dedicated environment and the progress bar is given
# a name unlikely to conflict
env <- new.env(parent = parent.frame())
env$`*pb*` <- pb
eval(substitute(
env = list(IT = substitute(it), SEQ = substitute(seq), EXPR = substitute(expr)),
base::`for`(IT, SEQ,{
EXPR
`*pb*`$tick()
})), envir = env)
}
# override `for` in the parent frame
assign("for", value = f,envir = parent.frame())
}
for<-(和fetch_name())
`for<-` <-
function(it, seq, expr, value){
# to avoid notes at CMD check
`*pb*` <- IT <- SEQ <- EXPR <- NULL
# the symbol fed to `it` is unknown, R uses `*tmp*` for assignment functions
# so we go get it by inspecting the memory addresses
it_chr <- fetch_name(it)
it_sym <-as.symbol(it_chr)
# complete the progress bar with the `total` parameter
# we need to clone it because progress bars are environments and updated
# by reference
pb <- value$clone()
pb$.__enclos_env__$private$total <- length(seq)
# when the script ends, even with a bug, the values that have been changed
# are written to the parent frame
on.exit({
vars <- setdiff(ls(env), c("*pb*"))
list2env(mget(vars, env),envir = parent.frame())
})
# computations are operated in a separate environment so we don't pollute it
# with it, seq, expr, value, we need the progress bar so we name it `*pb*`
# unlikely to conflict by accident
env <- new.env(parent = parent.frame())
env$`*pb*` <- pb
eval(substitute(
env = list(IT = it_sym, SEQ = substitute(seq), EXPR = substitute(expr)),
base::`for`(IT, SEQ,{
EXPR
`*pb*`$tick()
})), envir = env)
# because of the `fun<-` syntax we need to return the modified first argument
invisible(get(it_chr,envir = env))
}
帮手:
fetch_name <- function(x,env = parent.frame(2)) {
all_addresses <- sapply(ls(env), address2, env)
all_addresses <- all_addresses[names(all_addresses) != "*tmp*"]
all_addresses_short <- gsub("(^|<)[0x]*(.*?)(>|$)","\\2",all_addresses)
x_address <- tracemem(x)
untracemem(x)
x_address_short <- tolower(gsub("(^|<)[0x]*(.*?)(>|$)","\\2",x_address))
ind <- match(x_address_short, all_addresses_short)
x_name <- names(all_addresses)[ind]
x_name
}
address2 <- getFromNamespace("address2", "pryr")