【Question title】: How to "translate" values of a vector into another vector in R
【Posted】: 2014-05-19 19:04:33
【Question description】:

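How can I translate the CLASS column to get a new column CLASS2, where "1" = "positive", "-1" = "negative", and "0" = "neutral"? I know this is a very basic question, and I think ifelse() could be used for it, but I just can't figure out how to use that function correctly.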

DATE <- c("01.01.2000","02.01.2000","03.01.2000","06.01.2000","07.01.2000","09.01.2000","10.01.2000","01.01.2000","02.01.2000","04.01.2000","06.01.2000","07.01.2000","09.01.2000","10.01.2000")
RET <- c(-2.0,1.1,3,1.4,-0.2, 0.6, 0.1, -0.21, -1.2, 0.9, 0.3, -0.1,0.3,-0.12)
CLASS <- c("1","-1","0","1","1","-1","0","1","-1","-1","1","0","0","0")
df <- data.frame(DATE, RET, CLASS)

df

The output should look like this:

DATE <- c("01.01.2000","02.01.2000","03.01.2000","06.01.2000","07.01.2000","09.01.2000","10.01.2000","01.01.2000","02.01.2000","04.01.2000","06.01.2000","07.01.2000","09.01.2000","10.01.2000")
RET <- c(-2.0,1.1,3,1.4,-0.2, 0.6, 0.1, -0.21, -1.2, 0.9, 0.3, -0.1,0.3,-0.12)
CLASS <- c("1","-1","0","1","1","-1","0","1","-1","-1","1","0","0","0")
CLASS2 <- c("positive", "negative", "neutral", "positive", "positive", "negative", "neutral", "positive", "negative", "negative", "positive", "neutral", "neutral", "neutral")
df <- data.frame(DATE, RET, CLASS, CLASS2)

df

#          DATE   RET CLASS   CLASS2
# 1  01.01.2000 -2.00     1 positive
# 2  02.01.2000  1.10    -1 negative
# 3  03.01.2000  3.00     0  neutral
# 4  06.01.2000  1.40     1 positive
# 5  07.01.2000 -0.20     1 positive
# 6  09.01.2000  0.60    -1 negative
# 7  10.01.2000  0.10     0  neutral
# 8  01.01.2000 -0.21     1 positive
# 9  02.01.2000 -1.20    -1 negative
# 10 04.01.2000  0.90    -1 negative
# 11 06.01.2000  0.30     1 positive
# 12 07.01.2000 -0.10     0  neutral
# 13 09.01.2000  0.30     0  neutral
# 14 10.01.2000 -0.12     0  neutral

Thanks!

【Comments】:

  • Are you really storing CLASS as a factor? Does it matter?

Tags: r if-statement dataframe character


【Solution 1】:

Here is a simple approach using a helper function and sapply:

translate <- function(x) {
  if (x == '1') {
    'positive'
  } else if (x == '-1') {
    'negative'
  } else {
    'neutral'
  }
}
df <- data.frame(DATE, RET, CLASS, CLASS2=sapply(CLASS, translate))

Or you can rewrite translate with ifelse to make it more compact:

translate <- function(x) {
  ifelse(x == '1', 'positive', ifelse(x == '-1', 'negative', 'neutral'))
}

Both of these produce the output you asked for. But there may be a better way.

...such as the one @joran suggested, if CLASS is a factor (which it probably is):

df$CLASS2 <- c('negative','neutral','positive')[df$CLASS]
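
The one-liner above works only when CLASS is a factor whose levels happen to be ordered "-1", "0", "1". Since R 4.0.0, data.frame() no longer converts strings to factors by default, and a plain character CLASS would be looked up by name and yield NA. A minimal sketch that makes the level order explicit rather than relying on those defaults (class_factor and class_labels are illustrative names, not part of the original answer):

```r
CLASS <- c("1", "-1", "0", "1", "1", "-1", "0")

# Fix the level order explicitly instead of relying on the default sort
# order or on data.frame() converting strings to factors.
class_factor <- factor(CLASS, levels = c("-1", "0", "1"))

# Indexing with a factor uses its integer codes (1, 2, 3), so the labels
# must be listed in the same order as the levels.
class_labels <- c("negative", "neutral", "positive")
CLASS2 <- class_labels[class_factor]
CLASS2
```

With the levels pinned this way, the result does not change if the data arrive as character strings or if the locale sorts the codes differently.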

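As @beginneR pointed out, you don't need a function for my first two proposals. But I like using functions for readability.

【Comments】:

  • If it's a factor (which it is by default), I would suggest c('negative','neutral','positive')[df$CLASS]
  • I was waiting for you to add that to your answer so I could upvote you. ;)
【Solution 2】:

Here is a general approach using match that scales to more levels: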

CLASS2 <- c('positive','negative','neutral')[ match(CLASS, c('1','-1','0') ) ]
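
As a rough sketch of how the match approach behaves with more levels, the lookup table can be kept in two parallel vectors. The names keys and values, the extra code "2", and the "unknown" label are assumptions made up for this illustration:

```r
CLASS <- c("1", "-1", "0", "1", "2")   # "2" has no translation on purpose

keys   <- c("1", "-1", "0")
values <- c("positive", "negative", "neutral")

# match() returns the position of each CLASS element in keys, or NA if absent
idx <- match(CLASS, keys)
CLASS2 <- values[idx]

# Optionally give unmatched codes an explicit label instead of NA
CLASS2[is.na(idx)] <- "unknown"
CLASS2
```

Adding a level then means appending one entry to each of keys and values, with no change to the lookup code.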

【Comments】:

【Solution 3】:

You don't even need to define a function and call sapply; just create a new column and apply ifelse directly to CLASS:

df$CLASS2 <- with(df, ifelse(CLASS == '1', 'positive', ifelse(CLASS == '-1', 'negative', 'neutral')))
    

【Comments】:

【Solution 4】:

dplyr::case_when is an option:

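library(dplyr)  # provides %>%, mutate() and case_when()

df %>%
  mutate(
    CLASS2 = case_when(
      # R coerces the numbers to "1", "0", "-1" when comparing with strings
      CLASS ==  1 ~ 'positive',
      CLASS ==  0 ~ 'neutral',
      CLASS == -1 ~ 'negative',
      TRUE ~ '?'  # fallback for any other value
    )
  )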
      

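Super readable, isn't it?

However, if you have more levels in CLASS, typing out all those CLASS == conditions gets tedious. In that case, IMHO, sapply really is the best option. Or purrr::map: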

> library(purrr)  # provides map() and re-exports the %>% pipe
> x <- c(-1, -1, 0, 1, -1) %>% as.character()
> x %>% map(~ list(`-1` = 'negative', `0` = 'neutral', `1` = 'positive')[[.x]]) %>% unlist()
[1] "negative" "negative" "neutral"  "positive" "negative"
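
The same dictionary-style lookup can be sketched in base R without purrr, since indexing a named vector is already vectorized. This assumes the codes are character strings (a factor would be indexed by its integer codes instead); lookup and result are illustrative names:

```r
x <- as.character(c(-1, -1, 0, 1, -1))

# Named character vector acting as a dictionary: names are the codes
lookup <- c("-1" = "negative", "0" = "neutral", "1" = "positive")

# Indexing by a character vector looks each code up by name;
# unname() drops the codes that would otherwise be carried along as names
result <- unname(lookup[x])
result
```

Codes missing from lookup come back as NA, so an unexpected value stays visible in the result.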
      

【Comments】:
