【Question title】: Convert time column in character format to manipulable time format in R
【Posted】: 2018-11-05 22:44:16
【Question】:

My question is about standardizing column b. I need to get this data into a format that is easier to plot.

a<- c("Jackson Brice / The Shocker","Flash Thompson", "Mr. Harrington","Mac Gargan","Betty Brant", "Ann Marie Hoag","Steve Rogers / Captain America", "Pepper Potts", "Karen") 
b<- c("2:30", "2:15", "2", "1:15", "1:15", "1", ":55",":45", "v")

ab <- cbind.data.frame(a,b)

                               a    b
1    Jackson Brice / The Shocker 2:30
2                 Flash Thompson 2:15
3                 Mr. Harrington    2
4                     Mac Gargan 1:15
5                    Betty Brant 1:15
6                 Ann Marie Hoag    1
7 Steve Rogers / Captain America    1
8                   Pepper Potts  :45
9                          Karen    v

Desired output:

                            a        b
1    Jackson Brice / The Shocker 00:02:30
2                 Flash Thompson 00:02:15
3                 Mr. Harrington 00:02:00
4                     Mac Gargan 00:01:15
5                    Betty Brant 00:01:15
6                 Ann Marie Hoag 00:01:00
7 Steve Rogers / Captain America 00:01:00
8                   Pepper Potts 00:00:45
9                          Karen 00:00:00

If possible, column b should be stored as a time type that can be manipulated.

【Comments】:

  • The row "Steve Rogers / Captain America 1" looks wrong. In the vector b you defined, that value is actually :55.

Tags: r datetime time timestamp lubridate


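【Solution 1】:

A solution can be built with tidyr::separate and tidyr::unite. The approach is to first replace any value containing alphabetic characters with 00:00:00 and append :00 to values that have no colon. Then split the value into 3 columns, use dplyr::mutate_at to zero-pad all 3 columns to two digits, and finally unite the three columns.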

library(tidyverse)

ab %>% mutate_if(is.factor, as.character) %>%  # Convert any factor column to character
  mutate(b = ifelse(grepl("[[:alpha:]]", b), "00:00:00", b)) %>%   # non-time text becomes zero
  mutate(b = ifelse(grepl(":", b), b, paste(b,"00",sep=":")) ) %>% # bare minutes get ":00"
  separate(b, into = c("b1", "b2", "b3"), sep = ":", fill="left", extra = "drop") %>%
  # Zero-pad each part to two digits (formula lambda; funs() is deprecated in dplyr >= 0.8)
  mutate_at(vars(starts_with("b")), 
      ~ sprintf("%02d", as.integer(ifelse(is.na(.) | . == "", 0, .)))) %>%
  unite("b", starts_with("b"), sep=":")

#                                a        b
# 1    Jackson Brice / The Shocker 00:02:30
# 2                 Flash Thompson 00:02:15
# 3                 Mr. Harrington 00:02:00
# 4                     Mac Gargan 00:01:15
# 5                    Betty Brant 00:01:15
# 6                 Ann Marie Hoag 00:01:00
# 7 Steve Rogers / Captain America 00:00:55
# 8                   Pepper Potts 00:00:45
# 9                          Karen 00:00:00

Data:

a<- c("Jackson Brice / The Shocker","Flash Thompson", "Mr. Harrington","Mac Gargan","Betty Brant",
 "Ann Marie Hoag","Steve Rogers / Captain America", "Pepper Potts", "Karen") 
b<- c("2:30", "2:15", "2", "1:15", "1:15", "1", ":55",":45", "v")

ab <- cbind.data.frame(a,b)
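
The column produced above is still character. As a minimal sketch of one way to make it manipulable, assuming the strings are already zero-padded "HH:MM:SS" values, base R's as.difftime can parse them into a time difference. The names b_chr, b_time and b_secs are illustrative and not part of the original answer.

```r
# Assumed input: zero-padded "HH:MM:SS" strings like those produced above
b_chr <- c("00:02:30", "00:02:15", "00:02:00", "00:00:55", "00:00:00")

# Parse into a difftime measured in seconds (base R, no extra packages)
b_time <- as.difftime(b_chr, format = "%H:%M:%S", units = "secs")

# Plain numeric seconds, convenient for a plot axis
b_secs <- as.numeric(b_time, units = "secs")

# The numeric values support arithmetic and comparison
total   <- sum(b_secs)
longest <- b_chr[which.max(b_secs)]
```

Numeric seconds are usually the simplest thing to hand to a plotting function; the difftime object keeps the unit attached if that is preferred.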

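【Comments】:

    【Solution 2】:

    I had to make some assumptions about what you are trying to do, such as the units and how to handle the character value, but hopefully this function gives you something to work with.

    The biggest challenge with times is that parsing them from text requires fairly explicit rules. As a result, I had to put several if statements in the function to make it work while keeping the time format as consistent as possible.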

    library(lubridate)
    library(data.table)  # provides rbindlist()
    
    formatTime <- function(x) {
    
        # Check for a ":" separator in the text
        if(grepl(":",x, fixed = TRUE)) {
    
            y <- unlist(strsplit(x,":", fixed = TRUE))
    
            # If there is no value before the ":" then put "00" in front of it
            if(y[1]=="") {
                z <- ms(paste("00",y[2],sep = ":"), quiet=TRUE)
            } else {
                z <- ms(paste(y,collapse = ":"), quiet=TRUE)
            }
        } else { 
    
            # If there is no ":" then append ":00" (the value is whole minutes)
            z <- ms(paste(x,"00",sep = ":"), quiet=TRUE)
        }
    
        # If it did not parse with ms, i.e. it was a character, then assign zero time "0:00"
        if(is.na(z)) z <- ms("0:00")
    
        # Converted to duration due to issues returning period with lapply.  
        # Make a data frame to return units and name with lapply.
        return(data.frame(time = as.duration(z)))
    }
    
    # Convert factor variable to character
    ab$b <- as.character(ab$b)
    
    ab <- cbind(ab,rbindlist(lapply(ab$b,formatTime)))
    

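    I first tried using periods, but the apply call did not return them correctly, so I converted to durations. This may display differently from your example, but it should work well for plotting.
    Let me know if I missed something you need and I will update the answer.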
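
The same parsing rules can also be written in vectorized form, without calling a function once per element. This is a sketch in base R under the same assumptions as above: values are minutes:seconds, a bare number means whole minutes, and any entry containing letters counts as zero. The helper name to_seconds is hypothetical and not part of the original answer.

```r
# Hypothetical helper: convert "M:SS"-style strings to numeric seconds
to_seconds <- function(x) {
  x <- as.character(x)
  # Entries containing letters carry no time information: treat as zero
  x[grepl("[[:alpha:]]", x)] <- "0:00"
  # A bare number such as "2" means whole minutes: append ":00"
  no_colon <- !grepl(":", x, fixed = TRUE)
  x[no_colon] <- paste0(x[no_colon], ":00")
  # Split into minutes and seconds; an empty minutes part (":55") becomes 0
  parts <- strsplit(x, ":", fixed = TRUE)
  mins <- vapply(parts, function(p) if (nzchar(p[1])) as.numeric(p[1]) else 0,
                 numeric(1))
  secs <- vapply(parts, function(p) as.numeric(p[2]), numeric(1))
  mins * 60 + secs
}

b <- c("2:30", "2:15", "2", "1:15", "1:15", "1", ":55", ":45", "v")
secs <- to_seconds(b)
```

The result is a plain numeric vector of seconds, which can be plotted directly or wrapped in a duration type if one is needed.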

    【Comments】:
