[Question title]: How to log when using foreach (print or futile.logger)
[Posted]: 2016-12-14 04:16:02
[Question description]:

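I want to use the foreach package together with logging. I usually use the futile.logger package. When the work is handed to the workers, the logging information is lost (which is strange, since you have to tell foreach about the logging package).

I have seen this post, but it does not use foreach.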

  library(foreach)
  library(futile.logger)
  library(doParallel)
  flog.threshold(DEBUG)
  cluster <- makeCluster(8)
  registerDoParallel(cluster)
  doStuff <- function(input){
    flog.debug('Doing some stuff with %s', input)
    return(input)
  }
  res <- lapply(FUN=doStuff, X=seq(1,8,1))
  # >> this prints
  res2 <- foreach(input = seq(1,8,1)) %do% doStuff(input)
  # >> this prints
  res3 <- foreach(input = seq(1,8,1), .packages='futile.logger') %dopar% doStuff(input)
  # >> this does not
  identical(res,res2) && identical(res,res3)

I do not really care which parallel backend is used, it can be anything, but how can I get logging to work?

[Comments]:

    Tags: r logging parallel-foreach


    [Solution 1]:

    Following the solution from How can I print when using %dopar%: the idea is to set up your cluster with snow and pass outfile="" so that worker output is redirected to the master.

    library(foreach)
    library(futile.logger)
    library(doParallel)

    library(doSNOW)
    # outfile="" sends worker output to the master instead of discarding it
    cluster <- makeCluster(3, outfile="") # I only have 4 cores, but you could do 8
    registerDoSNOW(cluster)
    flog.threshold(DEBUG)

    doStuff <- function(input){
      flog.info('Doing some stuff with %s', input) # changed from flog.debug to flog.info
      return(input)
    }
    res <- lapply(FUN=doStuff, X=seq(1,8,1))
    # >> this prints
    res2 <- foreach(input = seq(1,8,1)) %do% doStuff(input)
    # >> this prints
    res3 <- foreach(input = seq(1,8,1), .packages='futile.logger') %dopar% doStuff(input)
    # >> this prints too

    Output:

    > res3 <- foreach(input = seq(1,8,1), .packages='futile.logger') %dopar% doStuff(input)  
    Type: EXEC 
    Type: EXEC 
    Type: EXEC 
    Type: EXEC 
    Type: EXEC 
    Type: EXEC 
    INFO [2016-08-08 08:22:39] Doing some stuff with 3
    Type: EXEC 
    INFO [2016-08-08 08:22:39] Doing some stuff with 1
    INFO [2016-08-08 08:22:39] Doing some stuff with 2
    Type: EXEC 
    Type: EXEC 
    INFO [2016-08-08 08:22:39] Doing some stuff with 5
    INFO [2016-08-08 08:22:39] Doing some stuff with 4
    Type: EXEC 
    Type: EXEC 
    INFO [2016-08-08 08:22:39] Doing some stuff with 6
    INFO [2016-08-08 08:22:39] Doing some stuff with 7
    INFO [2016-08-08 08:22:39] Doing some stuff with 8
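
The switch to flog.info above is probably needed because each worker is a separate R session that starts with futile.logger's default INFO threshold, so calling flog.threshold(DEBUG) on the master alone does not reach the workers. The following is a minimal sketch, not part of the original answer, that keeps flog.debug by raising the threshold on every worker with parallel::clusterEvalQ. It assumes the doParallel backend from the question and a cluster size of 2. Worker output may still not appear in some GUI front ends, so running from a terminal is the safer way to check it.

```r
library(foreach)
library(doParallel)      # also attaches the parallel package
library(futile.logger)

# outfile = "" means worker stdout/stderr is not redirected to /dev/null
cluster <- parallel::makeCluster(2, outfile = "")
registerDoParallel(cluster)

# Every worker is a fresh R session, so set the threshold on each of them,
# not only on the master.
invisible(parallel::clusterEvalQ(cluster, {
  library(futile.logger)
  flog.threshold(DEBUG)
  NULL
}))

doStuff <- function(input) {
  flog.debug('Doing some stuff with %s', input)
  input
}

res3 <- foreach(input = seq(1, 8, 1), .packages = 'futile.logger') %dopar% doStuff(input)
parallel::stopCluster(cluster)
```

With this setup the DEBUG lines should show up on the master's terminal, interleaved in whatever order the workers finish.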
    

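    Output to a log file. Here is an alternative that writes to a log file, following How to log using futile logger from within a parallel method in R?. It has the advantage of cleaner output, but it still requires flog.info.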

    library(doSNOW)
    library(foreach)
    library(futile.logger)
    nworkers <- 3
    cluster <- makeCluster(nworkers)
    registerDoSNOW(cluster)
    # Point futile.logger at a log file (intended to run on each worker)
    loginit <- function(logfile) flog.appender(appender.file(logfile))
    foreach(input=rep('~/Desktop/out.log', nworkers),
      .packages='futile.logger') %dopar% loginit(input)
    doStuff <- function(input){
      flog.info('Doing some stuff with %s', input)
      return(input)
    }
    foreach(input = seq(1,8,1), .packages='futile.logger') %dopar% doStuff(input)
    stopCluster(cluster)
    readLines("~/Desktop/out.log")
    

    Output:

    > readLines("~/Desktop/out.log")
    [1] "INFO [2016-08-08 10:07:30] Doing some stuff with 2"
    [2] "INFO [2016-08-08 10:07:30] Doing some stuff with 1"
    [3] "INFO [2016-08-08 10:07:30] Doing some stuff with 3"
    [4] "INFO [2016-08-08 10:07:30] Doing some stuff with 4"
    [5] "INFO [2016-08-08 10:07:30] Doing some stuff with 5"
    [6] "INFO [2016-08-08 10:07:30] Doing some stuff with 6"
    [7] "INFO [2016-08-08 10:07:30] Doing some stuff with 7"
    [8] "INFO [2016-08-08 10:07:30] Doing some stuff with 8"
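
One caveat with the file-based approach: initialising the appender through a foreach loop relies on the scheduler handing exactly one task to each worker, which is not guaranteed. The sketch below is an assumption-laden variant, not the original answer's code. It uses parallel::clusterCall, which runs a function once on every node, and futile.logger's appender.tee, which writes each message to both the console and a file. The log path under tempdir() and the cluster size of 2 are hypothetical choices for illustration.

```r
library(foreach)
library(doParallel)      # also attaches the parallel package
library(futile.logger)

logfile <- file.path(tempdir(), "out.log")   # hypothetical log location

# outfile = "" keeps the console half of appender.tee visible on the master
cluster <- parallel::makeCluster(2, outfile = "")
registerDoParallel(cluster)

# clusterCall evaluates the function once on every worker,
# so each one gets the tee appender.
invisible(parallel::clusterCall(cluster, function(path) {
  library(futile.logger)
  flog.appender(appender.tee(path))
  NULL
}, logfile))

doStuff <- function(input) {
  flog.info('Doing some stuff with %s', input)
  input
}

res <- foreach(input = seq(1, 8, 1), .packages = 'futile.logger') %dopar% doStuff(input)
parallel::stopCluster(cluster)

log_lines <- readLines(logfile)
```

Here log_lines should hold one entry per task, while the same messages are also echoed by the workers to the master's terminal.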
    

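    [Discussion]:

    • The problem with the second approach is that you are not logging to the console... are you?
    • The whole point of using parallel is to abstract away snow/multicore...
    • This does not seem to work. Running on Ubuntu with R 3.4.2. I get no log output on the console at all.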