【Posted】: 2021-01-08 16:21:41
【Question】:
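I have the following unstructured ticketing dataset containing work-note updates. Each ticket has multiple work notes, and each note starts with a timestamp. I need to split the Worknotes column so that every row holds one timestamped entry together with its corresponding updates, as shown under Expected Output.
I.NO Ticket No: Worknotes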
0 198822 2015-06-19 01:57:11 -Account Service
1 198822 Event closed
2 198822 Acknowledged
3 198822 2015-06-19 01:58:33- Lawrence David
4 198822 Data unavialable and hence ticket closed
5 198824 2015-06-19 02:07:01- Account Service
6 198824 User requested for database information
7 198824 2015-06-19 02:07:34- Cecilia Trandau
8 198824 Backup in progress. Under discusion
9 198824 2015-06-20 02:07:01- Account Service
10 198824 Auto closed
########## Edited **Output of dput**
structure(list(I.NO = c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10), `Ticket No:` = c(198822,
198822, 198822, 198822, 198822, 198824, 198824, 198824, 198824,
198824, 198824), Worknotes = c("2015-06-19 01:57:11 -Account Service",
"Event closed", "Acknowledged", "2015-06-19 01:58:33- Lawrence David",
"Data unavialable and hence ticket closed", "2015-06-19 02:07:01- Account Service",
"User requested for database information", "2015-06-19 02:07:34- Cecilia Trandau",
"Backup in progress. Under discusion", "2015-06-20 02:07:01- Account Service",
"Auto closed")), row.names = c(NA, -11L), class = c("tbl_df",
"tbl", "data.frame"))
# A tibble: 6 x 3
I.NO `Ticket No:` Worknotes
<dbl> <dbl> <chr>
1 0 198822 2015-06-19 01:57:11 -Account Service
2 1 198822 Event closed
3 2 198822 Acknowledged
4 3 198822 2015-06-19 01:58:33- Lawrence David
5 4 198822 Data unavialable and hence ticket closed
6 5 198824 2015-06-19 02:07:01- Account Service
###########################
**Expected Output**
**Ticket No:** **Worknotes**
198822 2015-06-19 01:57:11 -Account Service
Event closed
Acknowledged
198822 2015-06-19 01:58:33- Lawrence David
Data unavailable and hence ticket closed
198824 2015-06-19 02:07:01- Account Service
User requested for database information
198824 2015-06-19 02:07:34- Cecilia Trandau
Backup in progress. Under discusion
198824 2015-06-20 02:07:01- Account Service
Auto closed
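One way to read the transformation above is "start a new group at every row that begins with a timestamp, then collapse each group into a single cell". The following is only a minimal base R sketch of that reading, not a confirmed solution. The names `tickets`, `ticket_no`, `is_header` and `update_id` are hypothetical, the `Ticket No:` column is renamed for convenience, and it assumes that the first row of the data (and the first row of every ticket) is a timestamp row.

```r
# Rebuild the sample data from the dput output (column renamed to ticket_no).
tickets <- data.frame(
  ticket_no = c(rep(198822, 5), rep(198824, 6)),
  Worknotes = c(
    "2015-06-19 01:57:11 -Account Service",
    "Event closed",
    "Acknowledged",
    "2015-06-19 01:58:33- Lawrence David",
    "Data unavialable and hence ticket closed",
    "2015-06-19 02:07:01- Account Service",
    "User requested for database information",
    "2015-06-19 02:07:34- Cecilia Trandau",
    "Backup in progress. Under discusion",
    "2015-06-20 02:07:01- Account Service",
    "Auto closed"
  ),
  stringsAsFactors = FALSE
)

# Assumption: a row opens a new update when it begins with "YYYY-MM-DD HH:MM:SS".
is_header <- grepl(
  "^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}",
  tickets$Worknotes
)

# A running count of header rows labels every row with the update it belongs to.
update_id <- cumsum(is_header)

# Collapse the rows of each update into one multi-line string and keep the
# ticket number found on the update's header row.
result <- data.frame(
  ticket_no = tickets$ticket_no[is_header],
  Worknotes = as.vector(
    tapply(tickets$Worknotes, update_id, paste, collapse = "\n")
  ),
  stringsAsFactors = FALSE
)

# Show each collapsed update under its ticket number.
for (i in seq_len(nrow(result))) {
  cat(result$ticket_no[i], "\n", result$Worknotes[i], "\n\n", sep = "")
}
```

If the continuation lines should stay as separate elements rather than one newline-joined string, `split(tickets$Worknotes, update_id)` would give a list with one character vector per update instead.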
【Discussion】:
-
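Can you explain the data structure of your expected output? For each ticket number that can hold multiple values, should the work notes column contain a vector? A list? Or should the table have three rows for ticket 198822, two of which have no entry at all in the "Ticket No:" column?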
-
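Hi Akshi, please don't post basically the same question twice. As I commented last time, it is unclear how your data is formatted in R, which makes it hard for us to help. Please edit your question to include the output of head(dput(data)), replacing data with the name of your data object.
-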
I have added the dput output. Thanks!