【Question title】: Merge two data frames in R on a key column whose values differ only in case
【Posted】: 2017-07-14 11:06:58
【Question】:

I have two data frames, but the values in the merge "by" column are in different cases, for example:

sn1capx1e0001 vs. SN1CAPX1E0001.

authors <- data.frame(
surname = I(c("Tukey", "Venables", "Tierney", "Ripley", "McNeil")),
nationality = c("US", "Australia", "US", "UK", "Australia"),
deceased = c("yes", rep("no", 4)))

books <- data.frame(
name = I(c("tukey", "venables", "tierney",
           "tipley", "ripley", "McNeil", "R Core")),
title = c("Exploratory Data Analysis",
          "Modern Applied Statistics ...",
          "LISP-STAT",
          "Spatial Statistics", "Stochastic Simulation",
          "Interactive Data Analysis",
          "An Introduction to R"),
other.author = c(NA, "Ripley", NA, NA, NA, NA,
                 "Venables & Smith"))
m1 <- merge(authors, books, by.x = "surname", by.y = "name")

which gives only one row:

surname nationality deceased                     title other.author
 McNeil   Australia       no Interactive Data Analysis           NA

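So I want to merge them case-insensitively. I could not get this to work with merge or join.

I have seen that the values can be matched with regular expressions in a loop.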

【Comments】:

    Tags: r


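    【Solution 1】:

    I found that this is simple.

    Use toupper() to convert both key columns to the same case before merging:

    authors$surname <- toupper(authors$surname)
    books$name <- toupper(books$name)
    

    Simple.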
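If the original spelling of the names should survive the merge, one option is to normalize a copy of each key rather than the key itself. The following is a minimal sketch using the sample data from the question; the helper column name `key` is an assumption made for illustration and does not come from the original answer.

```r
# Sample data from the question (plain character columns).
authors <- data.frame(
  surname = c("Tukey", "Venables", "Tierney", "Ripley", "McNeil"),
  nationality = c("US", "Australia", "US", "UK", "Australia"),
  deceased = c("yes", rep("no", 4)),
  stringsAsFactors = FALSE)

books <- data.frame(
  name = c("tukey", "venables", "tierney",
           "tipley", "ripley", "McNeil", "R Core"),
  title = c("Exploratory Data Analysis",
            "Modern Applied Statistics ...",
            "LISP-STAT",
            "Spatial Statistics", "Stochastic Simulation",
            "Interactive Data Analysis",
            "An Introduction to R"),
  other.author = c(NA, "Ripley", NA, NA, NA, NA,
                   "Venables & Smith"),
  stringsAsFactors = FALSE)

# Hypothetical helper column `key`: a lower-cased copy of each join column.
authors$key <- tolower(authors$surname)
books$key <- tolower(books$name)

# Inner join on the helper column; surname and name keep their original case.
m1 <- merge(authors, books, by = "key")
m1$key <- NULL  # drop the helper column once the merge is done

print(m1)
```

Because the join runs on `key`, both `surname` and `name` appear in the result unchanged, so the two spellings can still be compared afterwards.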

    【Comments】:

      【Solution 2】:

      Why not convert them to the same form?

      library(stringr)
      
      authors <- data.frame(
        surname = I(c("Tukey", "Venables", "Tierney", "Ripley", "McNeil")),
        nationality = c("US", "Australia", "US", "UK", "Australia"),
        deceased = c("yes", rep("no", 4)))
      
      books <- data.frame(
        name = I(c("tukey", "venables", "tierney",
                   "tipley", "ripley", "McNeil", "R Core")),
        title = c("Exploratory Data Analysis",
                  "Modern Applied Statistics ...",
                  "LISP-STAT",
                  "Spatial Statistics", "Stochastic Simulation",
                  "Interactive Data Analysis",
                  "An Introduction to R"),
        other.author = c(NA, "Ripley", NA, NA, NA, NA,
                         "Venables & Smith"))
      
      authors$surname <- str_to_title(authors$surname)
      books$name <- str_to_title(books$name)
      
      m1 <- merge(authors, books, by.x = "surname", by.y = "name")
      

      which gives

         surname nationality deceased                         title other.author
      1   Mcneil   Australia       no     Interactive Data Analysis         <NA>
      2   Ripley          UK       no         Stochastic Simulation         <NA>
      3  Tierney          US       no                     LISP-STAT         <NA>
      4    Tukey          US      yes     Exploratory Data Analysis         <NA>
      5 Venables   Australia       no Modern Applied Statistics ...       Ripley
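Note that the output above shows "Mcneil" rather than "McNeil", because str_to_title() lower-cases every letter after the first. A possible way around this, sketched here in base R under the assumption that surnames in `authors` are unique when compared case-insensitively, is to look up each book's author with match() on lower-cased keys and copy over the spelling used in `authors`. The object name `m2` and the extra `surname` column added to `books` are illustrative choices, not part of the original answer.

```r
# Sample data from the question (plain character columns).
authors <- data.frame(
  surname = c("Tukey", "Venables", "Tierney", "Ripley", "McNeil"),
  nationality = c("US", "Australia", "US", "UK", "Australia"),
  deceased = c("yes", rep("no", 4)),
  stringsAsFactors = FALSE)

books <- data.frame(
  name = c("tukey", "venables", "tierney",
           "tipley", "ripley", "McNeil", "R Core"),
  title = c("Exploratory Data Analysis",
            "Modern Applied Statistics ...",
            "LISP-STAT",
            "Spatial Statistics", "Stochastic Simulation",
            "Interactive Data Analysis",
            "An Introduction to R"),
  other.author = c(NA, "Ripley", NA, NA, NA, NA,
                   "Venables & Smith"),
  stringsAsFactors = FALSE)

# For each book, find the row of the author whose surname matches
# case-insensitively (NA when there is no such author).
idx <- match(tolower(books$name), tolower(authors$surname))

# Copy the spelling used in `authors`; unmatched books get NA.
books$surname <- authors$surname[idx]

# Inner join on the shared column; books with an NA surname are dropped
# because `authors` contains no NA surname to pair them with.
m2 <- merge(authors, books, by = "surname")

print(m2)
```

With this approach the `surname` column keeps the capitalization from `authors`, and the original `name` column from `books` is still available for inspection.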
      

      【Comments】:
