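【Question title】: Unable to move R analyses output back to Python (rpy2)
【Posted】: 2023-07-10 17:13:01
【Question description】:

I am trying to pass some data from Python to R and then return the results to Python, but I can't seem to get it to work.

I can successfully pass my data to R, run my custom function on it, and even get output. Where I am stuck is getting the statistical output back into Python as a data frame. I have tried using rpy2, and even exporting to a .csv file to re-import, but I could not get either approach to work. When I try to push the result back into pandas, I get an error saying it cannot be coerced. As for saving to .csv, I can't seem to get it to work with my "results" object. From my reading, inspecting what is in the R global environment might help me figure this out, but I can't work out how to do that either.

Any helpful comments are appreciated.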


#import statements
import rpy2
print(rpy2.__version__)
import rpy2.robjects as robjects
from rpy2.robjects.packages import importr
import rpy2.robjects.numpy2ri
rpy2.robjects.numpy2ri.activate()
base = importr('base')
utils = importr('utils')
name = 'test_subject'

#Sample data to analyze
list1 = [0,1,2,3,4,5,6,7,8,9,10] # analysis window
list2 = [1,5,6,8,7,9,10,8,7,6,3] # number of responses per bin

#Convert data to R objects
set1 = robjects.IntVector(list1)
set2 = robjects.IntVector(list2)

makeDataFrame = robjects.r('''data.frame ''')
df = makeDataFrame(x = set1, y = set2)


# Create curve fitting function
curve_fit = robjects.r('''
curve_fit <- function(df, plot = FALSE){ control <- nls.control(maxiter = 1000, tol = 0.000100, minFactor = 1/2064,
                         printEval = FALSE, warnOnly = TRUE)
  
  fit <- nls(y ~ d+a*exp(-.5*((x-t0)/b)^2)+c*(x-t0), 
             data = df,
             start = list(a = 1, b = 10, t0 = 10, c = 1, d = 1),
             algorithm = "port",
             control = control)
  
  if (plot){
    fitFnc <- function(x) predict(fit, list(x=x))
    par(mfrow = c(1, 1))
    plot(df$x, df$y, xlim = c(0,45))
    curve(fitFnc, from=.5, to=45, add = TRUE)
  }

  return(list("params" = summary(fit), 
              "r2" = cor(predict(fit), df$y)^2))
              }''')

#run function on data
results = curve_fit(df, plot = True)

#Show Results
print('results', results)
print(type(results))

【Comments】:

    Tags: python r pandas rpy2


    【Solution 1】:

    The problem is in

    return(list("params" = summary(fit), "r2" = cor(predict(fit), df$y)^2))
    

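    The first item in the list, "params", is the summary table from R. Although it printed in Python as the data I wanted, it is a single R summary object rather than a table of values, so I could not break it apart or coerce it into a data frame. What I needed to return was a data frame, as in the code below; `coef(summary(fit))` extracts the coefficient table as a matrix.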

    return(data.frame(coef(summary(fit)), r2 = cor(predict(fit), df$y)^2))
    

    This returned a list of objects that I could then convert to a numpy array and manipulate in Python.
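
The question asked for a pandas data frame rather than a numpy array. A minimal sketch of that conversion follows, assuming rpy2 3.x with pandas installed and that the value coming back from R is still an rpy2 `DataFrame` object. As far as I understand, the global `numpy2ri.activate()` call converts results before they reach this point, so it would probably need to be left out. The helper name `r_dataframe_to_pandas` is hypothetical, not part of rpy2.

```python
def r_dataframe_to_pandas(r_df):
    """Convert an rpy2 R data.frame object into a pandas DataFrame.

    Assumes rpy2 3.x, pandas, and a working R installation.
    """
    # Imported inside the function so that defining it does not require R;
    # the imports only run when the function is called.
    import rpy2.robjects as ro
    from rpy2.robjects import pandas2ri
    from rpy2.robjects.conversion import localconverter

    # Combine rpy2's default conversion rules with the pandas-specific ones.
    converter = ro.default_converter + pandas2ri.converter

    # The context makes the combined rules active for nested conversions
    # (the individual columns) as well as for the top-level call.
    with localconverter(converter):
        return converter.rpy2py(r_df)


# Hypothetical usage with the objects from the script in this answer:
# results = curve_fit(df, plot=False)
# results_pd = r_dataframe_to_pandas(results)
# print(results_pd)
```

Scoping the conversion with `localconverter` keeps it local to this one call, so the rest of the script is not affected by a globally activated converter.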

    Here is the complete code.

    #import statements
    import rpy2
    print(rpy2.__version__)
    import rpy2.robjects as robjects
    from rpy2.robjects.packages import importr
    import rpy2.robjects.numpy2ri
    import numpy as np
    rpy2.robjects.numpy2ri.activate()
    base = importr('base')
    utils = importr('utils')
    
    
    
    
    #Sample data to analyze
    list1 = [0,1,2,3,4,5,6,7,8,9,10] # analysis window
    list2 = [1,5,6,8,7,9,10,8,7,6,3] # number of responses per bin
    
    #Convert data to R objects and place in data frame
    set1 = robjects.IntVector(list1)
    set2 = robjects.IntVector(list2)
    
    makeDataFrame = robjects.r('''data.frame ''')
    df = makeDataFrame(x = set1, y = set2)
    
    
    # Create curve fitting function in r
    curve_fit = robjects.r('''
    
    #Fit function
    curve_fit <- function(df, plot = FALSE){ control <- nls.control(maxiter = 1000, tol = 0.000100, minFactor = 1/2064,
                             printEval = FALSE, warnOnly = TRUE)
    #Specify formula to fit  
      fit <- nls(y ~ d+a*exp(-.5*((x-t0)/b)^2)+c*(x-t0), 
                 data = df,
                 start = list(a = 1, b = 10, t0 = 10, c = 1, d = 1),
                 algorithm = "port",
                 control = control)
                 
    # Create plot of curve 
      if (plot){
        fitFnc <- function(x) predict(fit, list(x=x))
        par(mfrow = c(1, 1))
        plot(df$x, df$y, xlim = c(0,45))
        curve(fitFnc, from=.5, to=45, add = TRUE)}
    
        #returns data in R dataframe
      return(data.frame(coef(summary(fit)), r2 = cor(predict(fit), df$y)^2))
    
                   }''')
    
    
    #run function on data
    results = curve_fit(df, plot = True)
    
    results = np.array(results) #convert to numpy array
    
    #Show Results
    print('results', results)
    print(type(results))
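
A note on the `r2` column: `cor(predict(fit), df$y)^2` is the squared Pearson correlation between the fitted and the observed values. The sketch below reproduces that quantity in plain Python so it can be checked on the Python side; the function name `squared_correlation` is an illustrative choice, not something from the original code.

```python
def squared_correlation(predicted, observed):
    """Squared Pearson correlation between two equal-length sequences.

    Mirrors R's cor(predicted, observed)^2 for numeric input.
    """
    n = len(predicted)
    if n != len(observed) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")

    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n

    # Sums of cross-products and squared deviations from the means.
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)

    if var_p == 0 or var_o == 0:
        raise ValueError("correlation is undefined for constant input")

    return (cov * cov) / (var_p * var_o)


# A perfectly linear relationship gives a value of 1 (up to rounding).
print(squared_correlation([1, 2, 3], [2, 4, 6]))
```

For a nonlinear fit this value can differ from a coefficient of determination computed as 1 - SS_res / SS_tot, so it is probably best read as a rough agreement measure.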
    

    【Comments】: