Splitting Columns by Number of Characters [duplicate]: Answers

【Question title】: Splitting Columns by Number of Characters [duplicate]
【Posted】: 2016-07-07 18:28:59
【Question】:

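I have a column of dates in a data table, entered as 6-digit numbers: 201401, 201402, 201403, 201412, etc., where the first 4 digits are the year and the last two digits are the month.

I'm trying to split that column into two columns, one called "Year" and one called "Month". I've been messing around with strsplit(), but can't figure out how to make it split on a number of characters rather than a string pattern, i.e. between the 4th and 5th digit.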

【Comments】:

Tags: r

【Solution 1】:

Without any external packages, we can do this with substr:

transform(df1, Year = substr(dates, 1, 4), Month = substr(dates, 5, 6))
#    dates Year Month
#1  201401 2014    01
#2  201402 2014    02
#3  201403 2014    03
#4  201412 2014    12

We can then choose to drop or keep the original column.
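
As a minimal sketch of the "drop the column" option, the following assumes a data frame shaped like the one in the question; the sample values and the name split_df are made up for illustration. It also converts the pieces to integers, which is optional and loses the leading zero of the month.

```r
# Hypothetical sample data matching the question's description
df1 <- data.frame(dates = c(201401, 201402, 201403, 201412))

# substr() coerces the numeric column to character before slicing
split_df <- transform(df1,
                      Year  = as.integer(substr(dates, 1, 4)),
                      Month = as.integer(substr(dates, 5, 6)))

# Drop the original column once the pieces have been extracted
split_df$dates <- NULL

print(split_df)
```

Leaving out as.integer() keeps both new columns as strings such as "01".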

Or with sub:

cbind(df1, read.csv(text=sub('(.{4})(.{2})', "\\1,\\2", df1$dates), header=FALSE))
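
With header=FALSE and no further arguments, read.csv names the new columns V1 and V2 and parses them as integers, so "01" becomes 1. The sketch below is one possible variation, not part of the original answer: it passes col.names and colClasses to get named character columns. The sample data and the names csv_lines, parts and result are assumptions.

```r
# Hypothetical sample data matching the question's description
df1 <- data.frame(dates = c(201401, 201402, 201403, 201412))

# Insert a comma after the fourth character: "201401" -> "2014,01"
csv_lines <- sub("(.{4})(.{2})", "\\1,\\2", df1$dates)

# Parse the comma-separated strings, naming the columns and
# keeping them as character so the month's leading zero survives
parts <- read.csv(text = csv_lines, header = FALSE,
                  col.names = c("Year", "Month"),
                  colClasses = "character")

result <- cbind(df1, parts)
print(result)
```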

Or use a package-based solution:

library(tidyr)
extract(df1, dates, into = c("Year", "Month"), "(.{4})(.{2})", remove=FALSE)

Or with data.table:

library(data.table)
setDT(df1)[, tstrsplit(dates, "(?<=.{4})", perl = TRUE)]
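
The call above returns only the two split columns, with default names V1 and V2. A hedged variation, assuming data.table is installed and using made-up sample data in a table called dt, adds named columns to the table by reference with `:=`:

```r
library(data.table)

# Hypothetical sample data; dates stored as character for clarity
dt <- data.table(dates = c("201401", "201402", "201403", "201412"))

# Split after the fourth character and add both pieces by reference
dt[, c("Year", "Month") := tstrsplit(dates, "(?<=.{4})", perl = TRUE)]

print(dt)
```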

【Comments】:

【Solution 2】:

tidyr::separate can take an integer for its sep argument, in which case it splits at that character position:

library(tidyr)

df <- data.frame(date = c(201401, 201402, 201403, 201412))

df %>% separate(date, into = c('year', 'month'), sep = 4)
#>   year month
#> 1 2014    01
#> 2 2014    02
#> 3 2014    03
#> 4 2014    12

Note that the new columns are character; add convert = TRUE to coerce them back to numeric.
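
A short sketch of the convert = TRUE variant, assuming tidyr is installed and reusing the sample data from this answer; the name df_num is made up here. After conversion the month no longer has a leading zero.

```r
library(tidyr)

df <- data.frame(date = c(201401, 201402, 201403, 201412))

# convert = TRUE runs type conversion on the new columns,
# so year and month come back as numbers instead of strings
df_num <- separate(df, date, into = c("year", "month"),
                   sep = 4, convert = TRUE)

str(df_num)
```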

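【Comments】:

This code only prints the data frame to the terminal. I tried using the following to create a new data frame with the split columns: df2
Well, if you want to keep it, assign it to a variable.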
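
To illustrate the point about assignment, here is a minimal base R sketch. The sample data is made up, and df2 is only taken from the comment as the name of the new data frame. The same pattern applies to the tidyr and data.table calls above.

```r
# Hypothetical sample data matching the question's description
df1 <- data.frame(dates = c(201401, 201402, 201403, 201412))

# Without assignment the result is printed and then discarded;
# df1 itself is left unchanged
transform(df1, Year = substr(dates, 1, 4), Month = substr(dates, 5, 6))

# Assigning the result keeps the split columns in a new data frame
df2 <- transform(df1, Year = substr(dates, 1, 4), Month = substr(dates, 5, 6))

print(names(df1))
print(names(df2))
```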