Setting up a conditional group_by: answers

[Question title]: Setting up a conditional group_by
[Posted]: 2020-04-07 22:26:43
[Question]:

I have a set of data that looks like this:

+----------+------------+-------+-------+
|  step1   |   step2    | step3 | step4 |
+----------+------------+-------+-------+
| Region 1 | District A | 1A    |   571 |
| Region 1 | District A | 1A    |   356 |
| Region 1 | District A | 1B    |   765 |
| Region 1 | District B | 1B    |   752 |
| Region 2 | District C | 2C    |   885 |
| Region 2 | District C | 2D    |    73 |
| Region 2 | District D | 2D    |   241 |
| Region 2 | District D | 2D    |   823 |
| Region 3 | District E | 3E    |   196 |
| Region 3 | District E | 3E    |   103 |
| Region 3 | District F | 3E    |   443 |
| Region 3 | District F | 3F    |   197 |
+----------+------------+-------+-------+

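I have set up the script below. As written, it uses selectizeGroupServer to link the filters for step1, step2, and step3 automatically (that is, if you filter on Region 1, only the related options remain available in Step2 and Step3).

The script below works if you just want the results returned in a straightforward way with group_by_all. On the initial run it shows a chart of all 12 values. If I filter by Region 1, it returns a chart of all four step4 numbers linked to Region 1.

However, I want to set it up so that when I select an option, the data is grouped by the next level down in the hierarchy. So if I filter by Region 1, it returns two columns: the total for District A (1692) and the total for District B (752). If I select both Region 1 and District A, it returns two columns: the aggregate for 1A (927) and the aggregate for the 1B rows linked to District A (765).

How can I set this up to achieve that?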

library(highcharter)
library(shiny)
library(shinyWidgets)
library(dplyr)

step1 <- c('Region 1', 'Region 1', 'Region 1', 'Region 1', 'Region 2', 'Region 2', 'Region 2', 'Region 2', 'Region 3', 'Region 3', 'Region 3', 'Region 3')
step2 <- c('District A', 'District A', 'District A', 'District B', 'District C', 'District C', 'District D', 'District D', 'District E', 'District E', 'District F', 'District F')
step3 <- c('1A', '1A', '1B', '1B', '2C', '2D', '2D', '2D', '3E', '3E', '3E', '3F')
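step4 <- c(571,356,765,752,885,73,241,823,196,103,443,197)

# Assemble the vectors into the data frame that the server code refers to as `df`
df <- data.frame(step1, step2, step3, step4, stringsAsFactors = FALSE)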

ui <- fluidPage(
  fluidRow(
    column(
      width = 5, offset = 1,
      panel(
        selectizeGroupUI(
          id = "foo",
          params = list(
            Step1 = list(inputId = "step1", title = "Step1:"),
            Step2 = list(inputId = "step2", title = "Step2:"),
            Step3 = list(inputId = "step3", title = "Step3:")
          ))
      ),
      highchartOutput(outputId = "table")
    )
  )
)

server <- function(input, output, session) {

  abc <- callModule(
    module = selectizeGroupServer,
    id = "foo",
    data = df,
    vars = c("step1", "step2", "step3")
  )

  output$table <- renderHighchart({

    bar <- abc()

    xyz <- bar %>% filter(is.null(input$step1) | step1 %in% input$step1,
                        is.null(input$step2) | step2 %in% input$step2,
                        is.null(input$step3) | step3 %in% input$step3) %>% group_by_all() %>% summarise(results = sum(step4))


    highchart() %>% hc_add_series(data = xyz, type = "column", hcaes(y = results),
                                  showInLegend = TRUE) %>% hc_add_theme(hc_theme_flat())


  })


}

shinyApp(ui, server)

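Thanks!

[Comments]:

@akrun I just edited the script so the data is easy to drop into a data frame, or however you think it is best set up.
Ah, yes, of course. Thanks for catching that. I have edited my original post.
Since you have already formatted it, that is no longer needed.
One thing I'd like to understand is how the inputs change here. In your case, you are doing the filter with |. If you need hierarchical grouping, it would be simpler to use library(data.table);dt1 <- as.data.table(df1); rollup(dt1, j = sum(step4), by = c("step1", "step2", "step3")) and then filter the result.
Also, one option is filter together with filter_at.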
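
As a rough illustration of the rollup suggestion in the comment above, the sketch below runs it on the sample data outside Shiny. The names dt1, totals, region1_districts, and district_a_step3 are just illustrative, and renaming the aggregate column to total is a choice made here, not part of the original comment. rollup fills the columns that were rolled up with NA, which is what the filters rely on.

```r
library(data.table)

dt1 <- data.table(
  step1 = rep(c("Region 1", "Region 2", "Region 3"), each = 4),
  step2 = c("District A", "District A", "District A", "District B",
            "District C", "District C", "District D", "District D",
            "District E", "District E", "District F", "District F"),
  step3 = c("1A", "1A", "1B", "1B", "2C", "2D", "2D", "2D", "3E", "3E", "3E", "3F"),
  step4 = c(571, 356, 765, 752, 885, 73, 241, 823, 196, 103, 443, 197)
)

# One pass computes sums at every level of the hierarchy:
# (step1, step2, step3), (step1, step2), (step1), and the grand total.
totals <- rollup(dt1, j = sum(step4), by = c("step1", "step2", "step3"))

# The unnamed j result is the last column; give it a readable name.
setnames(totals, ncol(totals), "total")

# Region 1 selected: keep the district-level subtotals (step3 rolled up to NA).
region1_districts <- totals[step1 == "Region 1" & !is.na(step2) & is.na(step3)]

# Region 1 and District A selected: keep the step3-level rows.
district_a_step3 <- totals[step1 == "Region 1" & step2 == "District A" & !is.na(step3)]

print(region1_districts)
print(district_a_step3)
```

Because every level is precomputed, the reactive part of the app would only need to pick the right slice of totals rather than re-aggregate.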

Tags: r shiny dplyr shiny-server shiny-reactivity

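[Solution 1]:

First, we need to work out which column to group by. Here I assume it is the first column that still has more than one distinct value after filtering. The rest of the code is very similar, except that group_by_all is replaced with group_by_at.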

output$table <- renderHighchart({

        bar <- abc()

        # find out which column to group by (first column with more than 1 distinct value)
        summ_column <- bar %>%
            summarise_all(~ length(unique(.))) %>% {colnames(.)[.>1]} %>% first()

        xyz <- bar %>% group_by_at(summ_column) %>% summarise(results = sum(step4))


        highchart() %>% hc_add_series(data = xyz, type = "column", hcaes(y = results),
                                      showInLegend = TRUE) %>% hc_add_theme(hc_theme_flat())


    })

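This will not work if you select more than one value for a single option, because that column then has more than one distinct value and is itself picked as the grouping column. The solution for that case should be very similar, though.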
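
One possible way to cover the multi-value case is to decide the grouping column from the selections rather than from the filtered data. The sketch below is only an illustration under that assumption: summarise_next_level and its selected argument are hypothetical names, and how the selections are read from the selectizeGroup module is not shown in the question, so here they are passed in as a plain named list. It needs dplyr 1.0 or later for across() and .groups.

```r
library(dplyr)

DF <- data.frame(
  step1 = rep(c("Region 1", "Region 2", "Region 3"), each = 4),
  step2 = c("District A", "District A", "District A", "District B",
            "District C", "District C", "District D", "District D",
            "District E", "District E", "District F", "District F"),
  step3 = c("1A", "1A", "1B", "1B", "2C", "2D", "2D", "2D", "3E", "3E", "3E", "3F"),
  step4 = c(571, 356, 765, 752, 885, 73, 241, 823, 196, 103, 443, 197),
  stringsAsFactors = FALSE
)

# Hypothetical helper: filter by the selections, then group by the level
# just below the deepest level that has a selection.
# `selected` is an assumed named list, e.g. list(step1 = c("Region 1", "Region 2")).
summarise_next_level <- function(data, selected,
                                 levels = c("step1", "step2", "step3")) {
  has_sel <- vapply(levels, function(lv) length(selected[[lv]]) > 0, logical(1))

  # Apply every non-empty selection as a filter.
  for (lv in levels[has_sel]) {
    data <- data[data[[lv]] %in% selected[[lv]], , drop = FALSE]
  }

  # No selection: group by the top level. Selection at the last level: stay there.
  deepest <- if (any(has_sel)) max(which(has_sel)) else 0L
  group_col <- levels[min(deepest + 1L, length(levels))]

  data %>%
    group_by(across(all_of(group_col))) %>%
    summarise(results = sum(step4), .groups = "drop")
}

two_regions <- summarise_next_level(DF, list(step1 = c("Region 1", "Region 2")))
district_a  <- summarise_next_level(DF, list(step1 = "Region 1", step2 = "District A"))
no_filter   <- summarise_next_level(DF, list())

print(two_regions)
print(district_a)
```

With two regions selected, the result is grouped by step2 across both regions instead of falling back to step1, which is the behaviour the data-driven rule above cannot provide.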

[Discussion]:

[Solution 2]:

It looks like you are looking for aggregate. Please check the following:

library(highcharter)
library(shiny)
library(shinyWidgets)
# library(dplyr)

DF <- data.frame(
  step1 = c('Region 1', 'Region 1', 'Region 1', 'Region 1', 'Region 2', 'Region 2', 'Region 2', 'Region 2', 'Region 3', 'Region 3', 'Region 3', 'Region 3'),
  step2 = c('District A', 'District A', 'District A', 'District B', 'District C', 'District C', 'District D', 'District D', 'District E', 'District E', 'District F', 'District F'),
  step3 = c('1A', '1A', '1B', '1B', '2C', '2D', '2D', '2D', '3E', '3E', '3E', '3F'),
  step4 = c(571,356,765,752,885,73,241,823,196,103,443,197),
  stringsAsFactors = FALSE)

ui <- fluidPage(
  fluidRow(
    column(
      width = 5, offset = 1,
      panel(
        selectizeGroupUI(
          id = "foo",
          params = list(
            Step1 = list(inputId = "step1", title = "Step1:"),
            Step2 = list(inputId = "step2", title = "Step2:"),
            Step3 = list(inputId = "step3", title = "Step3:")
          ))
      ),
      highchartOutput(outputId = "table")
    )
  )
)

server <- function(input, output, session) {

  abc <- callModule(
    module = selectizeGroupServer,
    id = "foo",
    data = DF,
    vars = c("step1", "step2", "step3")
  )

  output$table <- renderHighchart({
    req(abc())
    bar <- aggregate(step4 ~ step1+step2, abc(), sum)
    highchart() %>% hc_add_series(data = bar, type = "column", hcaes(y = step4), showInLegend = TRUE) %>% hc_add_theme(hc_theme_flat())
  })

}

shinyApp(ui, server)
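
To see what the aggregate call produces without running the app, here is a small base R sketch on the same data. Note that the formula step4 ~ step1+step2 always groups at the district level; grouping by step3 once a district is selected would need a different right-hand side. Building the formula with reformulate, and the names group_col, by_district, and by_step3, are assumptions added here for illustration and are not part of the answer.

```r
DF <- data.frame(
  step1 = rep(c("Region 1", "Region 2", "Region 3"), each = 4),
  step2 = c("District A", "District A", "District A", "District B",
            "District C", "District C", "District D", "District D",
            "District E", "District E", "District F", "District F"),
  step3 = c("1A", "1A", "1B", "1B", "2C", "2D", "2D", "2D", "3E", "3E", "3E", "3F"),
  step4 = c(571, 356, 765, 752, 885, 73, 241, 823, 196, 103, 443, 197),
  stringsAsFactors = FALSE
)

# Same call as in the answer, applied to the rows left after choosing Region 1.
region1 <- subset(DF, step1 == "Region 1")
by_district <- aggregate(step4 ~ step1 + step2, region1, sum)

# Illustrative variation: build the formula from a column name so the
# grouping level can change with the selection.
group_col <- "step3"
by_step3 <- aggregate(
  reformulate(group_col, response = "step4"),
  data = subset(region1, step2 == "District A"),
  FUN = sum
)

print(by_district)
print(by_step3)
```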

[Discussion]: