【Question title】: Regex split extended CSV notation
【Posted】: 2012-08-15 20:17:09
【Question】:

I have a custom transport format that packs data into the following form:

[a:000,"name","field","field","field"]

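I am trying to split each line to get the first character after the opening bracket plus all of the CSV values: a, 000, "name", "field", "field", and so on.

I cobbled together: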

[^?,:\[\]]

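This splits out every individual character rather than the colon/comma-delimited fields. I also know it will not cope with commas inside quotes, so it is clearly rubbish!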
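To illustrate why the character class above returns single characters, here is a minimal sketch. It uses Python's `re` module as a stand-in for NSRegularExpression (an assumption made for illustration; the thread itself targets Objective-C), and the variable names are hypothetical. The class matches exactly one character per match; adding a `+` quantifier turns each match into a whole run of non-delimiter characters.

```python
import re

line = '[a:000,"name","field","field","field"]'

# The character class exactly as posted: it matches ONE character at a
# time, so every non-delimiter character comes back as its own match.
single_chars = re.findall(r'[^?,:\[\]]', line)

# With a "+" quantifier each match is a run of such characters, i.e. one
# match per colon/comma-delimited field (the quotes are still attached).
fields = re.findall(r'[^?,:\[\]]+', line)

# Strip the surrounding double quotes. This only works while no field
# contains an embedded comma or colon.
values = [f.strip('"') for f in fields]
print(values)
```

This is only adequate for the comma-free data described in the question; a quoted comma would still split a field in two.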

Embedded commas are not really a big problem: we control the data at both ends, so I can escape them.

Thanks for any insight!

【Question comments】:

  • What programming language are you using?
  • This is Objective-C, using NSRegularExpression.

Tags: regex csv data-transfer


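【Solution 1】:

Rather than trying to split on several characters and ignore some of them, try matching exactly what you want to extract. Since you did not specify an implementation language, I am posting this as Perl, but you can apply it to any regex flavor that supports lookbehind and lookahead.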

while ($subject =~ m/(\w+(?=:)|(?<=:)\d+|(?<=,")[^"]*?(?="))/g) {
    # matched text = $&
}

Explanation:

# (\w+(?=:)|(?<=:)\d+|(?<=,")[^"]*?(?="))
# 
# Match the regular expression below and capture its match into backreference number 1 «(\w+(?=:)|(?<=:)\d+|(?<=,")[^"]*?(?="))»
# Match either the regular expression below (attempting the next alternative only if this one fails) «\w+(?=:)»
# Match a single character that is a “word character” (letters, digits, and underscores) «\w+»
# Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
# Assert that the regex below can be matched, starting at this position (positive lookahead) «(?=:)»
# Match the character “:” literally «:»
# Or match regular expression number 2 below (attempting the next alternative only if this one fails) «(?<=:)\d+»
# Assert that the regex below can be matched, with the match ending at this position (positive lookbehind) «(?<=:)»
# Match the character “:” literally «:»
# Match a single digit 0..9 «\d+»
# Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
# Or match regular expression number 3 below (the entire group fails if this one fails to match) «(?<=,")[^"]*?(?=")»
# Assert that the regex below can be matched, with the match ending at this position (positive lookbehind) «(?<=,")»
# Match the characters “,"” literally «,"»
# Match any character that is NOT a “"” «[^"]*?»
# Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
# Assert that the regex below can be matched, starting at this position (positive lookahead) «(?=")»
# Match the character “"” literally «"»
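As a rough cross-check of the walkthrough above, the same pattern can be run in another flavor that supports lookaround. The sketch below uses Python's `re` module, which is an assumption made for illustration (the answer is written in Perl), and `extract_fields` is a hypothetical helper name. Both lookbehinds are fixed-width, so the pattern is accepted unchanged.

```python
import re

# The pattern from the answer, copied verbatim into a raw string.
PATTERN = re.compile(r'(\w+(?=:)|(?<=:)\d+|(?<=,")[^"]*?(?="))')

def extract_fields(line):
    """Return every match of the pattern in `line`, in order.

    findall() plays the role of Perl's /g loop here: it collects all
    non-overlapping matches instead of stopping at the first one.
    """
    return PATTERN.findall(line)

print(extract_fields('[a:000,"name","field","field","field"]'))
```

Note that the three alternatives are tied to this exact layout: a word before the colon, digits after it, and double-quoted fields preceded by a comma. Data that departs from that layout would need a different pattern.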


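【Discussion】:

  • I love a one-line regex that comes with 18 lines of documentation!
  • Awesome, thank you very much. This is for Objective-C. I will update if there are any problems!
  • Does anyone have suggestions on how to modify the Perl regex for use on iOS? I am not very familiar with regexes in Perl, and after trying a few permutations I am still banging my head!
  • Answering my own query: since Obj-C does not use inline regex control parameters such as "/g", it is just a case of removing them and escaping all the backslashes. NSString * const REGEX_CSV_SPLIT = @"(\\w+(?=:)|(?
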
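The porting step described in the last comment (drop the Perl delimiters and flags, then double the backslashes inside a string literal) can be sketched by analogy. The example below uses Python rather than Objective-C, which is an assumption made for illustration; it starts from the pattern in the answer, not from the commenter's truncated constant, and the names `RAW` and `ESCAPED` are hypothetical.

```python
import re

# The pattern as it appears between Perl's m/ ... /g delimiters,
# written as a raw string so backslashes are taken literally.
RAW = r'(\w+(?=:)|(?<=:)\d+|(?<=,")[^"]*?(?="))'

# The same pattern in an ordinary string literal: every backslash is
# doubled, and the double quotes are escaped because they would
# otherwise end the literal.
ESCAPED = "(\\w+(?=:)|(?<=:)\\d+|(?<=,\")[^\"]*?(?=\"))"

# Both literals denote the identical pattern text.
assert RAW == ESCAPED

# There is no /g flag to carry over: asking for all matches is a
# property of the call (findall here), not of the pattern.
matches = re.findall(ESCAPED, '[a:000,"name","field"]')
print(matches)
```

In NSRegularExpression the "all matches" role would likewise fall to the matching method (for example `matchesInString:options:range:`) rather than to a flag inside the pattern.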
【Solution 2】:

You can certainly do this with a regex, but the right tool is most likely a CSV parser. You could try this one, written for Objective-C by Dave DeLong:

https://github.com/davedelong/CHCSVParser
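To show what the parser-based approach looks like in practice, here is a minimal sketch using Python's standard `csv` module. This is an assumption made for illustration and does not use CHCSVParser or its API; `parse_record` is a hypothetical helper. It strips the square brackets, lets the CSV reader handle the quoting, and then splits the first cell on the colon.

```python
import csv

def parse_record(line):
    """Parse one '[a:000,"x","y"]' record into a flat list of strings."""
    inner = line.strip()
    if not (inner.startswith('[') and inner.endswith(']')):
        raise ValueError('record must be wrapped in square brackets')
    inner = inner[1:-1]

    # The CSV reader removes the quotes and keeps quoted commas intact.
    row = next(csv.reader([inner]))

    # The first cell still holds "a:000"; split it on the first colon.
    head, _, number = row[0].partition(':')
    return [head, number] + row[1:]

print(parse_record('[a:000,"name","field","field","field"]'))
print(parse_record('[b:42,"Doe, Jane","x"]'))
```

Unlike the regex approach, this keeps a comma inside a quoted field as part of that field, which is the case the question set aside.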

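【Discussion】:

  • This seems better suited as a comment, since it contains only an external link.
  • Yes, but I cannot add comments because I lack the reputation.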