【Posted】: 2019-04-10 09:04:50
【Question】:
I have a data.table:
head(LocalCodes, n= 20)
Local Codes
1: Crane, Indiana 0189
2: Rutland, Vermont 0401
3: NA 5003
4: Naval Air Station Patuxent River, Maryland 5001
5: Williamsburg, Virginia 7408
6: District of Columbia, District of Columbia 0132
7: Newport, Rhode Island 1702
8: NA 1805
9: NA 5306
10: Washington DC, District of Columbia / Kansas City, Missouri 2210
11: Kansas City, Missouri 0503
12: Arlington, Virginia 0501
13: Phoenix, Arizona 0301
14: Washington DC, District of Columbia 0132
15: NA 5001
16: Collbran, Colorado 0303
17: Washington DC, District of Columbia / Norfolk, Virginia 1102
18: Minot, North Dakota 1802
19: Washington DC, District of Columbia 2005
20: Pine Knot, Kentucky 4749
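I am trying to use Good <- LocalCodes[ , list( LocalCodes$Local <- unlist( strsplit( LocalCodes$Local , " / " ) ) , by=LocalCodes$Codes)]
to split Local on " / " while keeping the same Codes in a new data.table.
I keep getting the error Error in strsplit(LocalCodes$Local, " / ") : non-character argument
I did try adding as.character(LocalCodes$Local) to Good to get rid of the error, but then the data.table does not come out right. It splits Local, but the Codes no longer line up, because Local is now a character vector.
Is there a way to split Local and keep each Codes value on the correct Local?
Example: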
Local Codes
8: NA 1805
9: NA 5306
10: Kansas City, Missouri 2210
11: Washington DC, District of Columbia 2210
12: Kansas City, Missouri 0503
13: Arlington, Virginia 0501
14: Phoenix, Arizona 0301
15: Washington DC, District of Columbia 0132
16: NA 5001
17: Collbran, Colorado 0303
18: Norfolk, Virginia 1102
19: Washington DC, District of Columbia 1102
Using: plyr, dplyr, data.table
Edit: here is the dput output:
dput(head(LocalCodes, n= 20))
structure(list(Local = list("Crane, Indiana", "Rutland, Vermont",
"NA", "Naval Air Station Patuxent River, Maryland", "Williamsburg, Virginia",
"District of Columbia, District of Columbia", "Newport, Rhode Island",
"NA", "NA", "Washington DC, District of Columbia / Kansas City, Missouri",
"Kansas City, Missouri", "Arlington, Virginia", "Phoenix, Arizona",
"Washington DC, District of Columbia", "NA", "Collbran, Colorado",
"Washington DC, District of Columbia / Norfolk, Virginia",
"Minot, North Dakota", "Washington DC, District of Columbia",
"Pine Knot, Kentucky"), Codes = list("0189", "0401", "5003",
"5001", "7408", "0132", "1702", "1805", "5306", "2210", "0503",
"0501", "0301", "0132", "5001", "0303", "1102", "1802", "2005",
"4749")), class = c("data.table", "data.frame"), row.names = c(NA,
-20L))
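【Comments】:
-
If you had posted
dput( head(LocalCodes, n= 20) ) instead of the console representation, people would be able to rebuild the object much more easily. As it stands, I would need to run read.fwf after working out the column spacing (I find that painful, so I am not doing it.) -
I have added the
dput output. -
My answer did not work where more than one item contained " / ". I worked out a strategy that handles a variant of your data.table object, but along the way found that your structure is unfortunately non-standard. A typical data.table is not a list of lists. That structure is notorious for messing up data.frame operations, and it evidently messes up data.table operations too. You should first search SO for ways to repair a malformed data.table object.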
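For what it is worth, here is a minimal sketch of one way to act on the comment above. It is not a tested answer from the thread. It assumes every cell of the two list columns holds exactly one string, as in the dput output, and it uses a shortened stand-in for LocalCodes. The names flat and parts are made up for illustration.

```r
library(data.table)

# Shortened stand-in for the question's data: both columns are list columns,
# which is why strsplit() reports "non-character argument".
LocalCodes <- data.table(
  Local = list("Crane, Indiana",
               "NA",
               "Washington DC, District of Columbia / Kansas City, Missouri",
               "Washington DC, District of Columbia / Norfolk, Virginia"),
  Codes = list("0189", "5003", "2210", "1102")
)

# Step 1: flatten the list columns into ordinary character vectors.
# ASSUMPTION: each list element is a single string, so unlist() keeps one
# value per row and the two columns stay aligned.
flat <- data.table(Local = unlist(LocalCodes$Local),
                   Codes = unlist(LocalCodes$Codes))

# Step 2: split every Local on the literal separator " / ".
# parts is a list with one character vector per original row.
parts <- strsplit(flat$Local, " / ", fixed = TRUE)

# Step 3: repeat each code once per piece, so every piece keeps its own code.
Good <- data.table(Local = unlist(parts),
                   Codes = rep(flat$Codes, lengths(parts)))

print(Good)
```

Pieces from the same original row come out in the order they appeared in the string, so if the order shown in the question's example matters, a separate sorting step would be needed. Repeating Codes with lengths(parts) rather than grouping with by = Codes also keeps rows that share a code (for example 0132 and 5001 in the full data) in their original positions.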
Tags: r dplyr data.table