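【Question title】: Why does readr store date objects as integer values?
【Posted】: 2015-11-14 13:59:54
【Question】:

When I read a csv file with the readr package, Date objects are stored as integer values. By "stored as integer" I don't mean the class of the date column, but the underlying value R stores for each date. If one data frame stores its dates as numeric (double) and another stores them as integer, the dplyr join functions can't be used on them. I've included a reproducible example below. Is there anything I can do to prevent this behavior?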

library(readr)
library(dplyr)  # needed for full_join() further down

df1 <- data.frame(
  Date = as.Date(c("2012-11-02", "2012-11-04", "2012-11-07", "2012-11-09", "2012-11-11")),
  Text = c("Why", "Does", "This", "Happen", "?"),
  stringsAsFactors = FALSE
)
class(df1$Date)
# [1] "Date"
dput(df1$Date[1])
# structure(15646, class = "Date")

# Write to dummy csv
write.csv(df1, file = "dummy_csv.csv", row.names = FALSE)

# Read back in data using both read.csv and read_csv
df2 <- read.csv("dummy_csv.csv", as.is = TRUE, colClasses = c("Date", "character"))
df3 <- read_csv("dummy_csv.csv")

# Examine structure of date values
class(df2$Date)
# [1] "Date"
class(df3$Date)
# [1] "Date"

dput(df2$Date[1])
# structure(15646, class = "Date")
dput(df3$Date[1])
# structure(15646L, class = "Date")

# Try to join using dplyr joins
both <- full_join(df2, df3, by = c("Date"))
# Error: cannot join on columns 'Date' x 'Date': Can't join on 'Date' x 'Date' because of incompatible types (Date / Date)

# Base merge works
both2 <- merge(df2, df3, by = "Date")

# Converting a POSIXct object to Date also gives a Date stored as numeric (double)
temp_date <- as.Date(as.POSIXct("11OCT2012:19:00:00", format = "%d%b%Y:%H:%M:%S"))
dput(temp_date)
# structure(15624, class = "Date")

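Judging from this issue on the dplyr repo, Hadley seems to consider this a feature, but whenever Date values are stored differently you can't join on them, and I haven't found a way to convert an integer-backed Date object into a numeric one. Is there any way to stop the readr package from doing this, or to convert Date objects stored as integers into numeric values?

【Comments】:

    Tags: r dplyr readr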


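    【Solution 1】:

    According to the big man himself, this is a bug in dplyr, not in readr. He says it is fine for both numeric and integer values to be stored when a file is read, but dplyr should be able to handle the difference the way merge does.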
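Until the join handles both storage modes, one possible workaround is to rebuild the integer-backed Date on top of a double. The sketch below is only an illustration under stated assumptions: it mimics the readr result with the `dput()` output from the question rather than reading a file, and the names `d_double`, `d_integer` and `d_fixed` are hypothetical. It is not something the linked issue prescribes.

```r
# Two Date values that print identically but differ in storage mode.
d_double  <- as.Date("2012-11-02")              # base R: Date backed by a double
d_integer <- structure(15646L, class = "Date")  # mimics the readr result from the question

typeof(d_double)   # "double"
typeof(d_integer)  # "integer"

# Assumed workaround: drop to a plain double, then rebuild the Date.
d_fixed <- as.Date(as.numeric(d_integer), origin = "1970-01-01")

typeof(d_fixed)    # "double"
d_fixed == d_double
```

Applied to the example in the question, the same conversion would be run on `df3$Date` before calling `full_join()`, so that both key columns share one storage mode.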

    【Discussion】:
