【Question title】: Problems using dplyr in a function (group_by)
【Posted】: 2015-03-25 07:49:55
【Question】:

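I want to use dplyr for some data manipulation. Background: I have a survey weight and a bunch of variables (mostly Likert items). I want to sum the frequencies and percentages for each category, with and without the survey weight.

As an example, let us use just the frequencies for the variable gender. The result should look like this: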

gender  freq  freq.weighted
     1   292       922.2906
     2   279       964.7551
     9     6        21.7338

I will do this for many variables, so I decided to put the dplyr code inside a function; that way I only have to change the variable and type less.

library(dplyr)
library(lazyeval)  # provides interp()

# example data
gender<-c("2","2","1","2","2","2","2","2","2","2","2","2","1","1","2","2","2","2","2","2","1","2","2","2","2","2","2","2","2","2")
survey_weight<-c("2.368456","2.642901","2.926698","3.628653","3.247463","3.698195","2.776772","2.972387","2.686365","2.441820","3.494899","3.133106","3.253514","3.138839","3.430597","3.769577","3.367952","2.265350","2.686365","3.189538","3.029999","3.024567","2.972387","2.730978","4.074495","2.921552","3.769577","2.730978","3.247463","3.230097")
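# the weights above are stored as strings, so convert them to numeric before summing
test_dataframe<-data.frame(gender,survey_weight=as.numeric(survey_weight))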

#function
weighting.function<-function(dataframe,variable){
  test_weighted<- dataframe %>% 
    group_by_(variable) %>% 
    summarise_(interp(freq=count(~weight)),
               interp(freq_weighted=sum(~weight)))
  return(test_weighted)
}

result_dataframe<-weighting.function(test_dataframe,"gender")

#this second step was left out in this example:
#mutate_(perc=interp(~freq/sum(~freq)*100),perc_weighted=interp(~freq_weighted/sum(~freq_weighted)*100))

This results in the following error message:

Error in UseMethod("group_by_") : 
  no applicable method for 'group_by_' applied to an object of class "formula" 

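I have tried many different things. At first I used freq=n() to count the frequencies, but I always got an error (I checked: plyr was loaded before dplyr, not after, and it still did not work).

Any ideas? I have read the vignette on standard evaluation, but I keep running into problems and cannot find a solution.

【Comments】:

    Tags: r function plyr dplyr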


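    【Solution 1】:

    I think you have a few nested mistakes that are causing your problems. The biggest one is calling count() inside summarise_(): count() is a verb that operates on a whole data frame, not a summary function. I'm guessing you wanted n():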

    weighting.function <- function(dataframe, variable){
      dataframe %>% 
        group_by_(variable) %>% 
        summarise_(
          freq = ~n(),
          freq_weighted = ~sum(survey_weight)
        )
    }
    
    weighting.function(test_dataframe, ~gender)
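
The underscore verbs used above (group_by_(), summarise_()) have been deprecated since dplyr 0.7. As a rough sketch only, and not part of the original answer, the same function could be written with the `{{ }}` operator, assuming dplyr 1.0 or later. The name `weighting_function`, the extra `weight` argument, and the small five-row data frame are made up for illustration; the percentage columns follow the second step that the question left out.

```r
library(dplyr)

# Small illustrative data set (hypothetical values, numeric weights).
test_dataframe <- data.frame(
  gender = c("2", "2", "1", "2", "1"),
  survey_weight = c(2.5, 3.0, 1.5, 2.0, 1.0)
)

# `variable` and `weight` are passed as bare column names and
# forwarded to the dplyr verbs with {{ }}.
weighting_function <- function(dataframe, variable, weight) {
  dataframe %>%
    group_by({{ variable }}) %>%
    summarise(
      freq = n(),                          # unweighted count per group
      freq_weighted = sum({{ weight }}),   # sum of weights per group
      .groups = "drop"
    ) %>%
    mutate(
      perc = freq / sum(freq) * 100,
      perc_weighted = freq_weighted / sum(freq_weighted) * 100
    )
}

result <- weighting_function(test_dataframe, gender, survey_weight)
print(result)
```

With this form the caller writes `gender` unquoted, so no formula or string handling is needed at the call site.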
    

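    You also had some unnecessary uses of interp(). If you do use interp(), the call should look like freq = interp(~n()), i.e. the name goes outside the interp() call and the thing being interpolated starts with ~.

    【Discussion】: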
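
To make the shape of an interp() call concrete, here is a minimal sketch using only lazyeval. The variable `weight_col` and the two-row data frame `toy` are hypothetical; the point is that the formula goes inside interp() and a placeholder (`w` here) is replaced by a column name built from a string.

```r
library(lazyeval)

# Hypothetical: the column name arrives as a string, as it would inside a function.
weight_col <- "survey_weight"

# Only the formula goes inside interp(); `w` is a placeholder that is
# replaced by the symbol survey_weight.
f <- interp(~sum(w), w = as.name(weight_col))
print(f)

# Evaluate the interpolated formula against a toy data frame.
toy <- data.frame(survey_weight = c(1.5, 2.5))
total <- f_eval(f, toy)
print(total)
```

Inside summarise_(), such a formula would be supplied as a named argument, for example freq_weighted = f, which keeps the name outside the interp() call.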
