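[Posted]: 2020-02-01 03:04:24
[Question]:
I am trying to read in and work with a badly formatted debug log. There is no consistent delimiter, and the line breaks do not appear to be encoded either.
What I would like to do is read in and parse the data so that each row starts at a date (YYYY-MM-DD format).
I am trying to work within the tidyverse, but I can't seem to get anything that parses the file correctly.
Is there a way to force rows to be split on a date pattern?
None of these work: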
library(tidyverse)
Log_File <- read.table("Example.txt", header = F, fill = T, skip = 1, allowEscapes = TRUE)
Log_File <- read_delim("Example.txt", col_names = F, delim = " ", n_max = 2)
Log_File <- read_lines("Example.txt", skip = 1, n_max = -1L, na = character(),
locale = default_locale(), progress = interactive())
> Log_File
V1 V2 V3 V4 V5 V6 V7
1 2019-09-20 14:06:18.952 [Error] [main] > CloudStorageExtension.swift[line:38]-downloadData(node:storageObj:value:): Error
2 2019-09-20 14:06:18.953 [Error] [main] > AlertService.swift[line:310]-retrieveProfileName(): Unable
3 error : {
4 code : 404,
5 message : Not Found. Could not get object ,
6 status : GET_OBJECT
7 }
8 }, bucket=integration-c5068.appspot.com, data=<7b0a2020 22657272 6f72223a 207b0a20 20202022
9 74206765 74206f62 6a656374 222c0a20 20202022 73746174 7573223a
10 ResponseErrorDomain=com.google.HTTPStatus, ResponseErrorCode=404}
11 2019-09-20 14:06:18.953 [Error] [main] > AlertService.swift[line:314]-retrieveProfileName(): AlertSettings
12 error : {
13 code : 404,
14 message : Not Found. Could not get object ,
15 status : GET_OBJECT
16 }
17 }, bucket=integration-c5068.appspot.com, data=<7b0a2020 22657272 6f72223a 207b0a20 20202022
18 74206765 74206f62 6a656374 222c0a20 20202022 73746174 7573223a
19 ResponseErrorDomain=com.google.HTTPStatus, ResponseErrorCode=404}
20 2019-09-20 14:06:18.957 [Error] [main] > CloudStorageExtension.swift[line:38]-downloadData(node:storageObj:value:): Error
I know that linking to a text file is frowned upon, so here is some of the raw text; hopefully this works:
2019-09-20 14:06:18.952 [Error] [main] > CloudStorageExtension.swift[line:38]-downloadData(node:storageObj:value:): Error occurs when download filestorage data with description: Object App/Data/Users/U0bGtkevMkc8Z94KFIoYSKy87sS2/Modes/RealMode/Alert/Data does not exist.
2019-09-20 14:06:18.953 [Error] [main] > AlertService.swift[line:310]-retrieveProfileName(): Unable to get AlertSettings Name: Error Domain=FIRStorageErrorDomain Code=-13010 "Object App/Data/Users/U0bGtkevMkc8Z94KFIoYSKy87sS2/Modes/RealMode/Alert/Data does not exist." UserInfo={object=App/Data/Users/U0bGtkevMkc8Z94KFIoYSKy87sS2/Modes/RealMode/Alert/Data, ResponseBody={
"error": {
"code": 404,
"message": "Not Found. Could not get object",
"status": "GET_OBJECT"
}
}, bucket=integration-c5068.appspot.com, data=<7b0a2020 22657272 6f72223a 207b0a20 20202022 636f6465 223a2034 30342c0a 20202020 226d6573 73616765 223a2022 4e6f7420 466f756e 642e2020 436f756c 64206e6f 74206765 74206f62 6a656374 222c0a20 20202022 73746174 7573223a 20224745 545f4f42 4a454354 220a2020 7d0a7d>, data_content_type=application/json; charset=UTF-8, NSLocalizedDescription=Object App/Data/Users/U0bGtkevMkc8Z94KFIoYSKy87sS2/Modes/RealMode/Alert/Data does not exist., ResponseErrorDomain=com.google.HTTPStatus, ResponseErrorCode=404}
2019-09-20 14:06:18.953 [Error] [main] > AlertService.swift[line:314]-retrieveProfileName(): AlertSettings Name object missing: Error Domain=FIRStorageErrorDomain Code=-13010 "Object App/Data/Users/U0bGtkevMkc8Z94KFIoYSKy87sS2/Modes/RealMode/Alert/Data does not exist." UserInfo={object=App/Data/Users/U0bGtkevMkc8Z94KFIoYSKy87sS2/Modes/RealMode/Alert/Data, ResponseBody={
"error": {
"code": 404,
"message": "Not Found. Could not get object",
"status": "GET_OBJECT"
}
}, bucket=integration-c5068.appspot.com, data=<7b0a2020 22657272 6f72223a 207b0a20 20202022 636f6465 223a2034 30342c0a 20202020 226d6573 73616765 223a2022 4e6f7420 466f756e 642e2020 436f756c 64206e6f 74206765 74206f62 6a656374 222c0a20 20202022 73746174 7573223a 20224745 545f4f42 4a454354 220a2020 7d0a7d>, data_content_type=application/json; charset=UTF-8, NSLocalizedDescription=Object App/Data/Users/U0bGtkevMkc8Z94KFIoYSKy87sS2/Modes/RealMode/Alert/Data does not exist., ResponseErrorDomain=com.google.HTTPStatus, ResponseErrorCode=404}
2019-09-20 14:06:18.957 [Error] [main] > CloudStorageExtension.swift[line:38]-downloadData(node:storageObj:value:): Error occurs when download filestorage data with description: Object App/Data/Users/U0bGtkevMkc8Z94KFIoYSKy87sS2/Modes/RealMode/Alert/Data does not exist.
Here is a dput of the data as read in:
Log_File <- read_delim("Example.txt", col_names = F, delim = " ")
Data <- structure(list(X1 = c("2019-09-20", "2019-09-20", "error\": {\n \"code\": 404,\n \"message\": \"Not Found. Could not get object\",\n \"status\": \"GET_OBJECT",
" }", "},", "2019-09-20", "error\": {\n \"code\": 404,\n \"message\": \"Not Found. Could not get object\",\n \"status\": \"GET_OBJECT",
" }", "},", "2019-09-20"), X2 = c("14:06:18.952", "14:06:18.953",
NA, NA, "bucket=integration-c5068.appspot.com,", "14:06:18.953",
NA, NA, "bucket=integration-c5068.appspot.com,", "14:06:18.957"
), X3 = c("[Error]", "[Error]", NA, NA, "data=<7b0a2020", "[Error]",
NA, NA, "data=<7b0a2020", "[Error]"), X4 = c("[main]", "[main]",
NA, NA, "22657272", "[main]", NA, NA, "22657272", "[main]"),
X5 = c(">", ">", NA, NA, "6f72223a", ">", NA, NA, "6f72223a",
">"), X6 = c("CloudStorageExtension.swift[line:38]-downloadData(node:storageObj:value:):",
"AlertService.swift[line:310]-retrieveProfileName():", NA,
NA, "207b0a20", "AlertService.swift[line:314]-retrieveProfileName():",
NA, NA, "207b0a20", "CloudStorageExtension.swift[line:38]-downloadData(node:storageObj:value:):"
), X7 = c("Error", "Unable", NA, NA, "20202022", "AlertSettings",
NA, NA, "20202022", "Error"), X8 = c("occurs", "to", NA,
NA, "636f6465", "Name", NA, NA, "636f6465", "occurs"), X9 = c("when",
"get", NA, NA, "223a2034", "object", NA, NA, "223a2034",
"when"), X10 = c("download", "AlertSettings", NA, NA, "30342c0a",
"missing:", NA, NA, "30342c0a", "download"), X11 = c("filestorage",
"Name:", NA, NA, "20202020", "Error", NA, NA, "20202020",
"filestorage"), X12 = c("data", "Error", NA, NA, "226d6573",
"Domain=FIRStorageErrorDomain", NA, NA, "226d6573", "data"
), X13 = c("with", "Domain=FIRStorageErrorDomain", NA, NA,
"73616765", "Code=-13010", NA, NA, "73616765", "with"), X14 = c("description:",
"Code=-13010", NA, NA, "223a2022", "Object App/Data/Users/U0bGtkevMkc8Z94KFIoYSKy87sS2/Modes/RealMode/Alert/Data does not exist.",
NA, NA, "223a2022", "description:"), X15 = c("Object", "Object App/Data/Users/U0bGtkevMkc8Z94KFIoYSKy87sS2/Modes/RealMode/Alert/Data does not exist.",
NA, NA, "4e6f7420", "UserInfo={object=App/Data/Users/U0bGtkevMkc8Z94KFIoYSKy87sS2/Modes/RealMode/Alert/Data,",
NA, NA, "4e6f7420", "Object"), X16 = c("App/Data/Users/U0bGtkevMkc8Z94KFIoYSKy87sS2/Modes/RealMode/Alert/Data",
"UserInfo={object=App/Data/Users/U0bGtkevMkc8Z94KFIoYSKy87sS2/Modes/RealMode/Alert/Data,",
NA, NA, "466f756e", "ResponseBody={", NA, NA, "466f756e",
"App/Data/Users/U0bGtkevMkc8Z94KFIoYSKy87sS2/Modes/RealMode/Alert/Data"
), X17 = c("does", "ResponseBody={", NA, NA, "642e2020",
NA, NA, NA, "642e2020", "does"), X18 = c("not", NA, NA, NA,
"436f756c", NA, NA, NA, "436f756c", "not"), X19 = c("exist.",
NA, NA, NA, "64206e6f", NA, NA, NA, "64206e6f", "exist.")), class = c("spec_tbl_df",
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -10L), problems = structure(list(
row = c(3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 3L, 4L,
5L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 6L, 7L, 8L, 9L
), col = c("X1", "X1", "X1", "X1", "X1", "X1", "X1", "X1",
"X1", "X1", NA, NA, NA, NA, "X1", "X1", "X1", "X1", "X1",
"X1", "X1", "X1", "X1", "X1", NA, NA, NA, NA), expected = c("delimiter or quote",
"delimiter or quote", "delimiter or quote", "delimiter or quote",
"delimiter or quote", "delimiter or quote", "delimiter or quote",
"delimiter or quote", "delimiter or quote", "delimiter or quote",
"19 columns", "19 columns", "19 columns", "19 columns", "delimiter or quote",
"delimiter or quote", "delimiter or quote", "delimiter or quote",
"delimiter or quote", "delimiter or quote", "delimiter or quote",
"delimiter or quote", "delimiter or quote", "delimiter or quote",
"19 columns", "19 columns", "19 columns", "19 columns"),
actual = c(":", "c", ":", "m", ":", "N", ",", "s", ":", "G",
"17 columns", "1 columns", "1 columns", "40 columns", ":",
"c", ":", "m", ":", "N", ",", "s", ":", "G", "16 columns",
"1 columns", "1 columns", "40 columns"), file = c("'Example.txt'",
"'Example.txt'", "'Example.txt'", "'Example.txt'", "'Example.txt'",
"'Example.txt'", "'Example.txt'", "'Example.txt'", "'Example.txt'",
"'Example.txt'", "'Example.txt'", "'Example.txt'", "'Example.txt'",
"'Example.txt'", "'Example.txt'", "'Example.txt'", "'Example.txt'",
"'Example.txt'", "'Example.txt'", "'Example.txt'", "'Example.txt'",
"'Example.txt'", "'Example.txt'", "'Example.txt'", "'Example.txt'",
"'Example.txt'", "'Example.txt'", "'Example.txt'")), row.names = c(NA,
-28L), class = c("tbl_df", "tbl", "data.frame")), spec = structure(list(
cols = list(X1 = structure(list(), class = c("collector_character",
"collector")), X2 = structure(list(), class = c("collector_character",
"collector")), X3 = structure(list(), class = c("collector_character",
"collector")), X4 = structure(list(), class = c("collector_character",
"collector")), X5 = structure(list(), class = c("collector_character",
"collector")), X6 = structure(list(), class = c("collector_character",
"collector")), X7 = structure(list(), class = c("collector_character",
"collector")), X8 = structure(list(), class = c("collector_character",
"collector")), X9 = structure(list(), class = c("collector_character",
"collector")), X10 = structure(list(), class = c("collector_character",
"collector")), X11 = structure(list(), class = c("collector_character",
"collector")), X12 = structure(list(), class = c("collector_character",
"collector")), X13 = structure(list(), class = c("collector_character",
"collector")), X14 = structure(list(), class = c("collector_character",
"collector")), X15 = structure(list(), class = c("collector_character",
"collector")), X16 = structure(list(), class = c("collector_character",
"collector")), X17 = structure(list(), class = c("collector_character",
"collector")), X18 = structure(list(), class = c("collector_character",
"collector")), X19 = structure(list(), class = c("collector_character",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector")), skip = 0), class = "col_spec"))
Any suggestions for appending lines that do not start with a date to the previous line/row?
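To make the goal concrete, here is a minimal sketch of the kind of result I am after, not a tested solution for the full file. It uses base R only and assumes the lines have already been read as a character vector (for example with `read_lines("Example.txt")`). The helper name `merge_log_lines`, the `entry` column, and the abbreviated sample lines are made up for illustration. It also assumes every record starts with a 23-character timestamp and that anything before the first dated line can be dropped.

```r
# Hypothetical helper: glue continuation lines onto the preceding dated line.
merge_log_lines <- function(lines, sep = " ") {
  # TRUE where a line begins with a YYYY-MM-DD date, i.e. starts a new record
  starts_record <- grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2} ", lines)

  # Running count of record starts: every continuation line inherits the
  # id of the most recent dated line above it
  record_id <- cumsum(starts_record)

  # Assumption: lines before the first dated line (id 0) are header noise
  keep <- record_id > 0

  # Collapse each group of lines into a single string
  merged <- unname(vapply(
    split(lines[keep], record_id[keep]),
    paste, character(1), collapse = sep
  ))

  # Assumption: the timestamp is always the first 23 characters
  data.frame(
    timestamp = substr(merged, 1, 23),
    entry = trimws(substring(merged, 24)),
    stringsAsFactors = FALSE
  )
}

# Abbreviated stand-in for the raw text shown above
example_lines <- c(
  "2019-09-20 14:06:18.952 [Error] [main] > CloudStorageExtension.swift[line:38]-downloadData(node:storageObj:value:): Error occurs",
  "2019-09-20 14:06:18.953 [Error] [main] > AlertService.swift[line:310]-retrieveProfileName(): Unable to get AlertSettings Name: ResponseBody={",
  "  \"error\": {",
  "    \"code\": 404",
  "  }",
  "}, ResponseErrorCode=404}",
  "2019-09-20 14:06:18.957 [Error] [main] > CloudStorageExtension.swift[line:38]-downloadData(node:storageObj:value:): Error occurs"
)

log_df <- merge_log_lines(example_lines)
nrow(log_df)  # 3: one row per dated entry
```

The `cumsum()` over the "starts with a date" flag is what assigns each undated line to the entry above it; the remaining fields (`[Error]`, `[main]`, and so on) could then be split out of `entry` in a later step.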
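[Comments]:
-
It might help to see the first few lines of the file in its raw state rather than after it has been read in.
-
Thanks! I've updated the question with some of the raw text.
Tags: r tidyverse data-cleaning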