【Question title】: R read.table with multiple words in a column [duplicate]
【Posted】: 2016-02-24 09:59:04
【Question】:

I have a log file of this kind to process in R:

2015-11-23 11:51:02,082  INFO      FrameworkApplication - ****** Start entimICE Application Command Line Parameters ******
2015-11-23 11:51:02,082  INFO      FrameworkApplication - ****** Config-File: E:/Program Files (x86)/conf/storages.dsconfig
2015-11-23 11:51:02,082  INFO      FrameworkApplication - ****** Datasource: datasource
2015-11-23 11:51:02,082  INFO      FrameworkApplication - ****** Application: App
2015-11-23 11:51:02,082  INFO      FrameworkApplication - ****** Ignore : false
2015-11-23 11:51:02,082  INFO      FrameworkApplication - ****** Plugin: com.plug
2015-11-23 11:51:02,082  INFO      FrameworkApplication - ****** Logging: E:/Program Files (x86)/conf/log4j.properties
2015-11-23 11:51:02,082  INFO      FrameworkApplication - ****** End Application Command Line Parameters ******
2015-11-23 11:51:02,129  INFO               BaseRuntime - Runtime created in mode: RichClient

I tried to load it into a data frame with read.table, but it puts every word in its own column. I want a data frame with 5 columns:

date        time           type  element              text
2015-11-23  11:51:02,082   INFO  FrameworkApplication - ****** Start entimICE Application Command Line Parameters ******

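The problem is that my field separator is a space, which is also the separator between words that I do not want split into different fields.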
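To illustrate the issue, here is a small sketch using two lines from the log above. Base R's count.fields reports how many whitespace-separated fields each line contains, and the count changes from line to line. The variable names are only illustrative.

```r
# Two sample lines copied from the log above.
sample_lines <- c(
  "2015-11-23 11:51:02,082  INFO      FrameworkApplication - ****** Ignore : false",
  "2015-11-23 11:51:02,129  INFO               BaseRuntime - Runtime created in mode: RichClient"
)

# count.fields() splits on whitespace when sep = "", the same default
# that read.table() uses.
con <- textConnection(sample_lines)
n_fields <- count.fields(con, sep = "")
close(con)

# The lines do not agree on the number of fields, so splitting on
# whitespace alone cannot produce a fixed set of 5 columns.
print(n_fields)
```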

Is this possible with read.table or scan, or should I write my own function?

Thanks,

【Comments】:

  • Yes, but the last field has variable length, which also causes problems, and I did not see any argument that would help in this case.

Tags: r data-mining logfile


【Solution 1】:

@ma33kael Did you try the solution from the duplicate? It works as expected:

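library(readr)
# Widths cover date, time and type; the last column holds the rest of the line.
# Current readr documentation uses NA as the final width for a ragged last column.
a <- read_fwf(text, fwf_widths(c(10,13,6,1)))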

This gives you:

          X1           X2   X3                                                                                       X4
1 2015-11-23 11:51:02,082 INFO  FrameworkApplication - ****** Start entimICE Application Command Line Parameters ******
2 2015-11-23 11:51:02,082 INFO FrameworkApplication - ****** Config-File: E:/Program Files (x86)/conf/storages.dsconfig
3 2015-11-23 11:51:02,082 INFO                                     FrameworkApplication - ****** Datasource: datasource
4 2015-11-23 11:51:02,082 INFO                                           FrameworkApplication - ****** Application: App
5 2015-11-23 11:51:02,082 INFO                                             FrameworkApplication - ****** Ignore : false
6 2015-11-23 11:51:02,082 INFO                                           FrameworkApplication - ****** Plugin: com.plug
7 2015-11-23 11:51:02,082 INFO      FrameworkApplication - ****** Logging: E:/Program Files (x86)/conf/log4j.properties
8 2015-11-23 11:51:02,082 INFO             FrameworkApplication - ****** End Application Command Line Parameters ******
9 2015-11-23 11:51:02,129 INFO                                        BaseRuntime - Runtime created in mode: RichClient

Data:

text <- "2015-11-23 11:51:02,082  INFO      FrameworkApplication - ****** Start entimICE Application Command Line Parameters ******
2015-11-23 11:51:02,082  INFO      FrameworkApplication - ****** Config-File: E:/Program Files (x86)/conf/storages.dsconfig
2015-11-23 11:51:02,082  INFO      FrameworkApplication - ****** Datasource: datasource
2015-11-23 11:51:02,082  INFO      FrameworkApplication - ****** Application: App
2015-11-23 11:51:02,082  INFO      FrameworkApplication - ****** Ignore : false
2015-11-23 11:51:02,082  INFO      FrameworkApplication - ****** Plugin: com.plug
2015-11-23 11:51:02,082  INFO      FrameworkApplication - ****** Logging: E:/Program Files (x86)/conf/log4j.properties
2015-11-23 11:51:02,082  INFO      FrameworkApplication - ****** End Application Command Line Parameters ******
2015-11-23 11:51:02,129  INFO               BaseRuntime - Runtime created in mode: RichClient"
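
The read_fwf call above returns four columns, with the element name and the message sharing the last one. If five columns are needed, as in the question, one option is a regular expression with one capture group per field. This is a minimal base R sketch, not part of the original answer. The names log_lines, pattern and log_df are illustrative, and it assumes every line follows the layout shown and that element names contain no spaces.

```r
# Two sample lines copied from the log in the question.
log_lines <- c(
  "2015-11-23 11:51:02,082  INFO      FrameworkApplication - ****** Datasource: datasource",
  "2015-11-23 11:51:02,129  INFO               BaseRuntime - Runtime created in mode: RichClient"
)

# One capture group per desired column: date, time, type, element, text.
# The first four fields contain no whitespace; the last takes the rest of the line.
pattern <- "^(\\S+)\\s+(\\S+)\\s+(\\S+)\\s+(\\S+)\\s+(.*)$"

# For each line, regmatches() returns the full match followed by the five groups.
parts <- regmatches(log_lines, regexec(pattern, log_lines))

# Drop the full match (first element) and bind the groups row by row.
log_df <- as.data.frame(do.call(rbind, lapply(parts, `[`, -1)),
                        stringsAsFactors = FALSE)
names(log_df) <- c("date", "time", "type", "element", "text")

print(log_df)
```

A line that does not match the pattern yields an empty match here, so real log files with stack traces or continuation lines would need to be filtered first.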

【Comments】:

  • Yes, but I had not noticed that the width could be set to 1. Thank you very much.