【Question title】: Group data by state and get percent with dplyr?
【Posted】: 2022-12-12 07:02:09
【Question】:

How do I group the customers in this data by state and get the percent of customers below using dplyr?: customer data

colsolarestates <- ColSolare %>% group_by(state) %>% summarise(state = n()) %>% mutate(percent = (state) / sum(state)*100)

【Comments】:

  • Percent of customers below what? Also, please don't post images. Just paste the output of dput(ColSolare) into the question (or dput(head(ColSolare, 30))).

Tags: dplyr


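【Solution 1】:

I assume "customers" refers to the customer_type variable. You can pass .drop = FALSE to group_by to get the percentage of each customer_type within each state while keeping combinations that have a count of zero. You can also use label_percent to append a % to each value.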

library(tidyverse)
library(scales)
set.seed(123)

# Example data: the 50 state abbreviations stacked 5 times (250 rows),
# each row given a randomly sampled customer_type
state <- data.frame(state = state.abb)
ColSolare <- rbind(state, state, state, state, state) %>%
  mutate(customer_type = sample(
    x = c("bar", "restaurant", "cafe"),
    size = 250,
    replace = TRUE,
    prob = c(.2, .6, .2)
  ))

colsolarestates <- ColSolare %>% 
  # Factors let .drop = FALSE keep state/type combinations with no rows
  mutate(state = factor(state), customer_type = factor(customer_type)) %>%
  group_by(state, customer_type, .drop = FALSE) %>% 
  summarise(n = n()) %>% 
  # summarise() drops the last grouping level, so sum(n) is taken within each state
  mutate(percent = n / sum(n) * 100) %>% 
  ungroup() %>%
  mutate(percentlabel = scales::label_percent(scale = 1)(percent))
#> `summarise()` has grouped output by 'state'. You can override using the
#> `.groups` argument.

head(colsolarestates)
#> # A tibble: 6 × 5
#>   state customer_type     n percent percentlabel
#>   <fct> <fct>         <int>   <dbl> <chr>       
#> 1 AK    bar               1      20 20%         
#> 2 AK    cafe              1      20 20%         
#> 3 AK    restaurant        3      60 60%         
#> 4 AL    bar               1      20 20%         
#> 5 AL    cafe              0       0 0%          
#> 6 AL    restaurant        4      80 80%
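
If the goal is instead each state's share of all customers, which is what the attempt in the question seems to be aiming for, a minimal sketch could look like the following. Note that the attempt stores the count under the name of the grouping column (state); here it is stored under a separate name, n. The customers data frame and its values are hypothetical toy data, not the asker's ColSolare.

```r
library(dplyr)

# Hypothetical toy data standing in for ColSolare (assumed columns: state, customer_type)
customers <- data.frame(
  state = c("CA", "CA", "CA", "NY", "TX"),
  customer_type = c("bar", "cafe", "restaurant", "bar", "cafe")
)

state_share <- customers %>%
  count(state, name = "n") %>%         # one row per state with its customer count
  mutate(percent = n / sum(n) * 100)   # each state's share of all customers, in percent

state_share
```

count(state) is shorthand for group_by(state) followed by summarise(n = n()), and it returns an ungrouped result, so sum(n) here runs over all states rather than within a group.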
