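【Question title】: How to run a subset of a data frame through an R function?
【Posted】: 2020-09-17 11:14:38
【Question description】:

I am trying to set up a function that runs a regression analysis on selected variables (data.var1, data.var2) from a specific subset of my main data frame. However, the function runs on the entire data frame rather than just the subset I want, regardless of whether I define the subset inside or outside the function.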

#Function subsetting data by temp and running regression
varReg21C <- function(data.var1,data.var2) {
  data21C <- subset(allPursuit,allPursuit$temp == 21)
  fitData <- lm(data.var1 ~ data.var2, data21C)
  regData <- summary(fitData)
  anovaData <- anova(fitData)
  reg21C <- list(fitData=fitData,regData=regData,anovaData=anovaData)
}

#OR

#Function running regression on data already in subset
data21C <- subset(allPursuit,allPursuit$temp == 21)
data21C
data25C <- subset(allPursuit,allPursuit$temp == 25)
data25C
data29C <- subset(allPursuit,allPursuit$temp == 29)
data29C

varReg21C <- function(data.var1,data.var2) {
  fitData <- lm(data.var1 ~ data.var2, data21C)
  regData <- summary(fitData)
  anovaData <- anova(fitData)
  reg21C <- list(fitData=fitData,regData=regData,anovaData=anovaData)
}

【Question discussion】:

    Tags: r function subset


    【Solution 1】:

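    You most likely have data.var1 and data.var2 floating around in your environment. The formula data.var1 ~ data.var2 does not name any column of data21C, so lm() falls back to objects it finds outside the data frame, and those are not subset. You need to define every input as an argument and avoid reading directly from the environment, so if we write it like this: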

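    # Fit lm(data.var1 ~ data.var2) on the rows of DataFrame where temp == TempChoice.
    # data.var1 and data.var2 are column names given as strings.
    varReg <- function(data.var1,data.var2,DataFrame,TempChoice) {
      data <- subset(DataFrame,temp == TempChoice)
      # Build the formula from the column names, e.g. "x ~ y"
      Form <- as.formula(paste(data.var1,"~",data.var2))
      fitData <- lm(Form, data)
      regData <- summary(fitData)
      anovaData <- anova(fitData)
      return( list(fitData=fitData,regData=regData,anovaData=anovaData))
    }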
    
    allPursuit = data.frame(x=runif(100),y=runif(100),z=runif(100),
    temp=sample(c(21,25,29),100,replace=TRUE))
    
    varReg("x","y",allPursuit,21)[[1]]
    
    Call:
    lm(formula = Form, data = data)
    
    Coefficients:
    (Intercept)            y  
         0.3126       0.1494 
    
    varReg("x","y",allPursuit,25)[[1]]
    
    Call:
    lm(formula = Form, data = data)
    
    Coefficients:
    (Intercept)            y  
        0.55069      0.01734  
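
To see why the original function ignored the subset, here is a minimal sketch on made-up data. The names `toy`, `sub21`, `bad` and `good` are hypothetical and only for illustration, and `reformulate()` is used here as an alternative to `paste()` for building the formula. When vectors are passed in, `lm()` cannot find columns called `v1` or `v2` in the data frame and silently uses the full-length arguments instead.

```r
# Hypothetical toy data standing in for allPursuit
set.seed(1)
toy <- data.frame(x = runif(30), y = runif(30),
                  temp = rep(c(21, 25, 29), each = 10))
sub21 <- subset(toy, temp == 21)

# Vectors passed in from outside: sub21 has no columns named v1 or v2,
# so lm() uses the full-length function arguments instead.
bad <- function(v1, v2) lm(v1 ~ v2, data = sub21)
fit_bad <- bad(toy$x, toy$y)

# Column names passed as strings: the formula refers to columns of sub21.
good <- function(v1, v2) lm(reformulate(v2, response = v1), data = sub21)
fit_good <- good("x", "y")

nobs(fit_bad)   # 30: every row of toy
nobs(fit_good)  # 10: only the rows with temp == 21
```

Comparing `nobs()` of the two fits is a quick way to check that a regression really used the subset.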
    

    【Discussion】:

    • Thank you so much for your help! I tried your function but ran into this error: > varReg(lunge.time,lunge.dist,allPursuit,21)[[1]] Error in paste(data.var1, "~", data.var2) : object 'lunge.time' not found even though you can see that "lunge.time" is there in my data frame: $ lunge.time : int 84 84 92 76 84 100 68 80 120 85...
    • It should be varReg("lunge.time","lunge.dist",allPursuit,21)[[1]]; note the double quotes ".
    • Ah, okay, that solved it. Thanks!
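
Since the question defines subsets for 21, 25 and 29, one possible follow-up is to call the function once per temperature with quoted column names. This is only a sketch: the data below are made up (the real allPursuit is not shown in the thread), `varReg` is a compact copy of the function from the answer, and `temps` and `results` are hypothetical names.

```r
# Compact copy of the varReg function from the answer
varReg <- function(data.var1, data.var2, DataFrame, TempChoice) {
  data <- subset(DataFrame, temp == TempChoice)
  Form <- as.formula(paste(data.var1, "~", data.var2))
  fitData <- lm(Form, data)
  list(fitData = fitData, regData = summary(fitData), anovaData = anova(fitData))
}

# Made-up data; only the column names follow the thread
set.seed(42)
allPursuit <- data.frame(lunge.time = runif(90), lunge.dist = runif(90),
                         temp = rep(c(21, 25, 29), times = 30))

# One regression per temperature, with column names passed as strings
temps <- c(21, 25, 29)
results <- lapply(temps, function(tc) {
  varReg("lunge.time", "lunge.dist", allPursuit, tc)
})
names(results) <- paste0("temp", temps)

# Coefficients of each fit, one column per temperature
sapply(results, function(r) coef(r$fitData))
```

Each element of `results` is the same list the function returns for a single temperature, so `results$temp21$regData` gives the summary for the 21 degree subset.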