【Question title】: IF_ELSE statement not working as expected
【Posted】: 2020-03-05 22:22:27
【Question】:

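I am trying to create a new variable based on conditional evaluation of several other variables. I am using nested `if_else` statements, but only some of the conditions evaluate the way I want.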

Here is a dput of some sample data:

DF.1 <- structure(list(`Cultivation` = c("No", "No", "Yes", "Yes", "No",
"Yes", "No", "No", "No", "No", "Yes", "Yes"),
`Processing` = c("No", "No", "Yes", "Yes", "No", "No", "No", "No",
"No", "No", "No", "Yes"),
`Federal Sales` = c("No", "No", "Yes", "Yes", "Yes", "Yes", "No",
"No", "No", "No", "Yes", "Yes"),
`Cultivation Type` = c(NA, NA, "Standard", "Standard", NA, "Micro",
NA, NA, NA, NA, "Nursery", "Standard"),
`Processing Type` = c(NA, NA, "Standard", "Standard", NA, NA, NA,
NA, NA, NA, NA, "Standard"),
`Type` = c(NA, NA, "Standard", "Standard", NA, "Micro", NA, NA, NA,
NA, NA, "Standard")),
class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -12L))

Here is the code I am using:

DF.2 <- DF.1 %>%
  dplyr::mutate("Type" = if_else(
    str_detect(tolower(`Cultivation Type`), "micro") |
      str_detect(tolower(`Processing Type`), "micro"), "Micro",
    if_else(
      str_detect(tolower(`Cultivation Type`), "standard") |
        str_detect(tolower(`Processing Type`), "standard"), "Standard",
      if_else(str_detect(tolower(`Cultivation Type`), "nursery"),
              "Nursery", "Other"))))

The first two conditions work and I get a Type of "Standard" or "Micro", but "Nursery" and "Other" are never produced; I get NA instead.

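【Comments】:

  • Those columns contain NA, which needs to be handled.
  • If there are many values to replace, one option is a key/value dataset followed by a fuzzy join.
  • I think the NAs are probably the cause of my problem. Do you know why row 6 evaluates correctly to Micro even though one of its columns also contains NA? Also, is there a way to handle NA inside nested if_else statements?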
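
On the row 6 question above, a minimal sketch of what is likely going on, using small made-up vectors rather than the real data: R's `|` is three-valued, so `TRUE | NA` is `TRUE` while `FALSE | NA` is `NA`, and `if_else()` returns `NA` wherever its condition is `NA`. The names `cultivation`, `processing`, `cond` and `res_safe` are illustrative only. Wrapping the condition in `coalesce(..., FALSE)` is one possible way to keep a nested `if_else()` from stopping on `NA`.

```r
library(dplyr)
library(stringr)

# Illustrative stand-ins for three rows: Micro/NA, Nursery/NA, NA/NA
cultivation <- c("Micro", "Nursery", NA)
processing  <- c(NA, NA, NA_character_)

hit_c <- str_detect(tolower(cultivation), "micro")  # TRUE FALSE NA
hit_p <- str_detect(tolower(processing), "micro")   # NA NA NA

# TRUE | NA is TRUE, but FALSE | NA and NA | NA are NA
cond <- hit_c | hit_p                               # TRUE NA NA

# if_else() propagates an NA condition instead of taking the `false` branch
res <- if_else(cond, "Micro", "Other")              # "Micro" NA NA

# Treating NA as "no match" lets evaluation continue to the next branch
res_safe <- if_else(coalesce(cond, FALSE), "Micro", "Other")  # "Micro" "Other" "Other"
```

This matches what the question reports: the Micro row survives because its first test is already `TRUE`, while the Nursery row turns into `NA` at the very first condition and never reaches the inner `if_else()`.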

Tags: r if-statement dplyr


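【Solution 1】:

In your case it is better to use case_when instead of if_else. case_when treats an NA condition as not matched, so here every row whose conditions are all NA falls through to Other.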

library(dplyr)
library(stringr)

DF.2 <- DF.1 %>%
  mutate("Type" = case_when(
    str_detect(tolower(`Cultivation Type`),"micro") | str_detect(tolower(`Processing Type`), "micro") ~ "Micro",
    str_detect(tolower(`Cultivation Type`), "standard") | str_detect(tolower(`Processing Type`), "standard") ~ "Standard",
    str_detect(tolower(`Cultivation Type`), "nursery") ~ "Nursery",
    TRUE ~ "Other")
  )

Output:

> DF.2
# A tibble: 12 x 6
   Cultivation Processing `Federal Sales` `Cultivation Type` `Processing Type` Type    
   <chr>       <chr>      <chr>           <chr>              <chr>             <chr>   
 1 No          No         No              NA                 NA                Other   
 2 No          No         No              NA                 NA                Other   
 3 Yes         Yes        Yes             Standard           Standard          Standard
 4 Yes         Yes        Yes             Standard           Standard          Standard
 5 No          No         Yes             NA                 NA                Other   
 6 Yes         No         Yes             Micro              NA                Micro   
 7 No          No         No              NA                 NA                Other   
 8 No          No         No              NA                 NA                Other   
 9 No          No         No              NA                 NA                Other   
10 No          No         No              NA                 NA                Other   
11 Yes         No         Yes             Nursery            NA                Nursery 
12 Yes         Yes        Yes             Standard           Standard          Standard
> 
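
To make the difference concrete, here is a small side-by-side sketch on a hand-written logical vector (not the real data; `cond`, `out_case_when` and `out_if_else` are illustrative names). It only shows how the two functions treat an `NA` condition.

```r
library(dplyr)

cond <- c(TRUE, FALSE, NA)

# case_when(): an NA condition counts as "not matched", so the row
# falls through to the catch-all TRUE ~ branch
out_case_when <- case_when(
  cond ~ "Micro",
  TRUE ~ "Other"
)                                                   # "Micro" "Other" "Other"

# if_else(): an NA condition yields NA in the result
out_if_else <- if_else(cond, "Micro", "Other")      # "Micro" "Other" NA
```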

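【Comments】:

  • This is great. I have never used case_when, but it looks like exactly what I need. I will look at the documentation more. Thanks.
  • Glad to help. It really is very handy for multi-case conversions!
【Solution 2】:

We need to change the code a little so that each condition returns only TRUE/FALSE, because NA elements return NA, which can cause problems.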

library(dplyr)
library(stringr)

# Guard each str_detect() with its own !is.na() check: NA & FALSE is FALSE,
# so every condition is strictly TRUE/FALSE and a match in one column is
# not cancelled by an NA in the other column.
DF.1 %>%
  dplyr::mutate("Type" = if_else(
    (str_detect(tolower(`Cultivation Type`), "micro") & !is.na(`Cultivation Type`)) |
      (str_detect(tolower(`Processing Type`), "micro") & !is.na(`Processing Type`)), "Micro",
    if_else(
      (str_detect(tolower(`Cultivation Type`), "standard") & !is.na(`Cultivation Type`)) |
        (str_detect(tolower(`Processing Type`), "standard") & !is.na(`Processing Type`)), "Standard",
      if_else(str_detect(tolower(`Cultivation Type`), "nursery") & !is.na(`Cultivation Type`),
              "Nursery", "Other"))))
# A tibble: 12 x 6
#   Cultivation Processing `Federal Sales` `Cultivation Type` `Processing Type` Type    
#   <chr>       <chr>      <chr>           <chr>              <chr>             <chr>   
# 1 No          No         No              <NA>               <NA>              Other   
# 2 No          No         No              <NA>               <NA>              Other   
# 3 Yes         Yes        Yes             Standard           Standard          Standard
# 4 Yes         Yes        Yes             Standard           Standard          Standard
# 5 No          No         Yes             <NA>               <NA>              Other   
# 6 Yes         No         Yes             Micro              <NA>              Micro   
# 7 No          No         No              <NA>               <NA>              Other   
# 8 No          No         No              <NA>               <NA>              Other   
# 9 No          No         No              <NA>               <NA>              Other   
#10 No          No         No              <NA>               <NA>              Other   
#11 Yes         No         Yes             Nursery            <NA>              Nursery 
#12 Yes         Yes        Yes             Standard           Standard          Standard

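Alternatively, if we need to keep the same code as in the OP's post, replace the NAs in the 'Type' columns beforehand, then change the placeholder values back to NA after the transformation.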

library(dplyr)
library(stringr)
library(tidyr)  # for replace_na()

DF.1 %>%
  # temporarily swap NA for the placeholder 'new' in every column ending in 'Type'
  mutate_at(vars(ends_with('Type')), replace_na, 'new') %>%
  dplyr::mutate("Type" = if_else(
    str_detect(tolower(`Cultivation Type`), "micro") |
      str_detect(tolower(`Processing Type`), "micro"), "Micro",
    if_else(
      str_detect(tolower(`Cultivation Type`), "standard") |
        str_detect(tolower(`Processing Type`), "standard"), "Standard",
      if_else(str_detect(tolower(`Cultivation Type`), "nursery"),
              "Nursery", "Other")))) %>%
  # turn the placeholder back into NA
  mutate_at(vars(ends_with('Type')), na_if, 'new')
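
For readers on newer dplyr versions, the same placeholder round trip can probably be written with `across()` instead of `mutate_at()`. The sketch below uses a three-row made-up tibble called `demo` (not the OP's data) and a shortened two-branch `if_else()`; the placeholder string "new" is taken from the code above.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Made-up miniature of the two type columns (both character)
demo <- tibble(
  `Cultivation Type` = c("Micro", NA, "Nursery"),
  `Processing Type`  = c(NA, NA, NA_character_)
)

result <- demo %>%
  # NA -> "new" so that str_detect() never sees a missing value
  mutate(across(ends_with("Type"), ~ replace_na(.x, "new"))) %>%
  mutate(Type = if_else(
    str_detect(tolower(`Cultivation Type`), "micro") |
      str_detect(tolower(`Processing Type`), "micro"), "Micro",
    if_else(str_detect(tolower(`Cultivation Type`), "nursery"),
            "Nursery", "Other"))) %>%
  # "new" -> NA again, restoring the original missing values
  mutate(across(ends_with("Type"), ~ na_if(.x, "new")))

result$Type  # "Micro" "Other" "Nursery"
```

One caveat with this approach: the placeholder must not collide with a real value or with any of the patterns being searched for.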

If a simpler alternative is of interest, another option is to create a key/value dataset and then do a fuzzy join.
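
The answer does not spell out the key/value idea, so the following is only a rough, hand-rolled stand-in for it and not an actual fuzzy join from a package. The `keys` table and the `classify()` helper are hypothetical names introduced for illustration. As a simplification, every pattern is checked against both columns, whereas the original code checks "nursery" against `Cultivation Type` only.

```r
library(dplyr)
library(stringr)

# Hypothetical key/value table, listed in priority order
keys <- tibble(
  pattern = c("micro", "standard", "nursery"),
  label   = c("Micro", "Standard", "Nursery")
)

# Hypothetical helper: label each row by the highest-priority matching key
classify <- function(cultivation, processing, keys, default = "Other") {
  out <- rep(default, length(cultivation))
  # Walk the keys from lowest to highest priority so earlier keys overwrite later ones
  for (i in rev(seq_len(nrow(keys)))) {
    # `%in% TRUE` turns NA matches into FALSE
    hit <- str_detect(tolower(cultivation), keys$pattern[i]) %in% TRUE |
      str_detect(tolower(processing), keys$pattern[i]) %in% TRUE
    out[hit] <- keys$label[i]
  }
  out
}

labels <- classify(
  cultivation = c("Micro", NA, "Nursery", "Standard"),
  processing  = c(NA, NA, NA, "Micro"),
  keys        = keys
)
labels  # "Micro" "Other" "Nursery" "Micro"
```

With the real data this could be called inside `mutate()`, for example `mutate(Type = classify(`Cultivation Type`, `Processing Type`, keys))`. Adding a new category then only requires a new row in `keys`.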
