【Question title】: Split a String without removing the delimiter in Swift
【Posted】: 2021-12-07 12:42:54
【Question】:

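This may be a duplicate. I couldn't find an answer for Swift, so I'm not sure.

componentsSeparatedByCharactersInSet removes the delimiters. If you split on only one possible character, it is easy to add it back. But what if you have a set of them?

Is there another way to split?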
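
To make the problem concrete, here is a minimal sketch of the behaviour described above, using the Swift 3+ spelling of the same Foundation method, components(separatedBy:). The sample text and character set are assumptions chosen for illustration.

```swift
import Foundation

let text = "Hello, world! How are you?"

// Every character in the set acts as a delimiter and is dropped from the
// result, so there is no way to tell afterwards which one caused each split.
let parts = text.components(separatedBy: CharacterSet(charactersIn: ",!?"))

print(parts)  // ["Hello", " world", " How are you", ""]
```

The trailing empty string appears because the text ends with a delimiter and this method does not omit empty components.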

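【Comments】:

  • Yes, but if the set looks like " '?!:;" that produces interesting strings, which is not what I actually need. (In reply to "what if you just concatenate the delimiters".)

Tags: swift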


【Solution 1】:

Swift 3 and 4 version

extension Collection {
    func splitAt(isSplit: (Iterator.Element) throws -> Bool) rethrows -> [SubSequence] {
        var p = self.startIndex
        var result:[SubSequence] = try self.indices.flatMap {
            i in
            guard try isSplit(self[i]) else {
                return nil
            }
            defer {
                p = self.index(after: i)
            }
            return self[p...i]
        }
        if p != self.endIndex {
            result.append(suffix(from: p))
        }
        return result
    }
}

Thanks to Oisdk for getting me thinking.

【Discussion】:

  • Swift 4.1 and later need compactMap instead of flatMap.
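
Following the comment above, a sketch of the same extension for Swift 4.1 or later might look like this. It assumes nothing beyond the standard library; the local name pieceStart and the sample string are illustrative choices, not part of the original answer. Because String is itself a Collection in Swift 4 and later, it can be called on a string directly.

```swift
extension Collection {
    /// Splits after every element matching `isSplit`, keeping that element
    /// at the end of the piece it terminates.
    func splitAt(isSplit: (Element) throws -> Bool) rethrows -> [SubSequence] {
        var pieceStart = startIndex
        var result: [SubSequence] = try indices.compactMap { i in
            guard try isSplit(self[i]) else { return nil }
            // Move the start past the delimiter once the slice has been built.
            defer { pieceStart = index(after: i) }
            return self[pieceStart...i]
        }
        // Keep whatever follows the last delimiter, if anything.
        if pieceStart != endIndex {
            result.append(self[pieceStart...])
        }
        return result
    }
}

let delimiters: Set<Character> = ["!", ".", ",", ":"]
let pieces = "Hello, world. Bye"
    .splitAt(isSplit: { delimiters.contains($0) })
    .map { String($0) }

print(pieces)  // ["Hello,", " world.", " Bye"]
```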
【Solution 2】:

This approach works on CollectionTypes rather than Strings (the code below uses Swift 2 syntax), but it should be easy to adapt:

extension CollectionType {
  func splitAt(@noescape isSplit: Generator.Element throws -> Bool) rethrows ->  [SubSequence] {
    var p = startIndex
    return try indices
      .filter { i in try isSplit(self[i]) }
      .map { i in
        defer { p = i }
        return self[p..<i]
      } + [suffixFrom(p)]
  }
}

extension CollectionType where Generator.Element : Equatable {
  func splitAt(splitter: Generator.Element) -> [SubSequence] {
    return splitAt { el in el == splitter }
  }
}

You can use it like this:

let sentence = "Hello, my name is oisdk. This should split: but only at punctuation!"

let puncSet = Set("!.,:".characters)

sentence
  .characters
  .splitAt(puncSet.contains)
  .map(String.init)

// ["Hello", ", my name is oisdk", ". This should split", ": but only at punctuation", "!"]
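
On current Swift, the same split-before-the-delimiter idea might be sketched as follows. The method name splitBefore and the local names are hypothetical, chosen so this does not clash with the standard library's split; the filter-then-map structure mirrors the code above.

```swift
extension Collection {
    /// Splits before every element matching `isSplit`, so each delimiter
    /// starts the piece that follows it.
    func splitBefore(_ isSplit: (Element) throws -> Bool) rethrows -> [SubSequence] {
        var pieceStart = startIndex
        // Indices of all delimiter elements, in order.
        let splitPoints = try indices.filter { try isSplit(self[$0]) }
        var result: [SubSequence] = splitPoints.map { i in
            defer { pieceStart = i }
            return self[pieceStart..<i]
        }
        // The last piece runs from the final delimiter to the end.
        result.append(self[pieceStart...])
        return result
    }
}

let punctuation: Set<Character> = ["!", ".", ",", ":"]
let beforePieces = "a,b.c"
    .splitBefore { punctuation.contains($0) }
    .map { String($0) }

print(beforePieces)  // ["a", ",b", ".c"]
```

As in the original, a collection that begins with a delimiter yields an empty first piece.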

Alternatively, this version uses a for loop and splits after the delimiter:

extension CollectionType {
  func splitAt(@noescape isSplit: Generator.Element throws -> Bool) rethrows ->  [SubSequence] {
    var p = startIndex
    var result: [SubSequence] = []
    for i in indices where try isSplit(self[i]) {
      result.append(self[p...i])
      p = i.successor()
    }
    if p != endIndex { result.append(suffixFrom(p)) }
    return result
  }
}


extension CollectionType where Generator.Element : Equatable {
  func splitAt(splitter: Generator.Element) -> [SubSequence] {
    return splitAt { el in el == splitter }
  }
}

let sentence = "Hello, my name is oisdk. This should split: but only at punctuation!"

let puncSet = Set("!.,:".characters)

sentence
  .characters
  .splitAt(puncSet.contains)
  .map(String.init)

// ["Hello,", " my name is oisdk.", " This should split:", " but only at punctuation!"]
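
Since these extensions are written against collections in general, they are not limited to text. A hedged sketch of the for-loop version and its Equatable overload in current Swift, applied to an array of integers, could look like this; splitAfter is a hypothetical name and the sample data is an assumption.

```swift
extension Collection {
    /// Splits after every element matching `isSplit`, keeping the delimiter
    /// at the end of each piece.
    func splitAfter(_ isSplit: (Element) throws -> Bool) rethrows -> [SubSequence] {
        var pieceStart = startIndex
        var result: [SubSequence] = []
        for i in indices {
            guard try isSplit(self[i]) else { continue }
            result.append(self[pieceStart...i])
            pieceStart = index(after: i)
        }
        if pieceStart != endIndex { result.append(self[pieceStart...]) }
        return result
    }
}

extension Collection where Element: Equatable {
    /// Convenience overload for a single delimiter value.
    func splitAfter(_ splitter: Element) -> [SubSequence] {
        return splitAfter { $0 == splitter }
    }
}

let numbers = [1, 2, 0, 3, 0, 4]
let groups = numbers.splitAfter(0).map { Array($0) }

print(groups)  // [[1, 2, 0], [3, 0], [4]]
```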

Or, if you want to cram as many Swift features as possible into one function (defer, throws, a protocol extension, an evil flatMap, guard, and Optionals):

extension CollectionType {
  func splitAt(@noescape isSplit: Generator.Element throws -> Bool) rethrows -> [SubSequence] {
    var p = startIndex
    var result: [SubSequence] = try indices.flatMap { i in
      guard try isSplit(self[i]) else { return nil }
      defer { p = i.successor() }
      return self[p...i]
    }
    if p != endIndex { result.append(suffixFrom(p)) }
    return result
  }
}

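【Discussion】:

  • Perfect, I especially like the last one ;)
  • extension Collection { func split(at isSplit: (Element) throws -> Bool) rethrows -> [SubSequence] { var p = startIndex; return try indices.compactMap { i in guard try isSplit(self[i]) else { return nil }; defer { p = index(after: i) }; return self[p...i] } + (p != endIndex ? [suffix(from: p)] : []) } }
【Solution 3】:

I came here looking for an answer to this question. I didn't find what I was looking for and ended up building it by calling .split(...) repeatedly. It isn't elegant, but it lets you choose which delimiters are kept and which are dropped. There is probably a way to avoid the conversions between String and Substring; does anyone know?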

var input = """
    {All those moments will be (lost in time)},
    like tears [in rain](. ([(Time to)] die))
"""
var separator: Character = "!"
var output: [String] = []
repeat {
    let tokens = input.split(
        maxSplits: 1,
        omittingEmptySubsequences: false,
        whereSeparator: {
            switch $0 {
                case "{", "}", "(", ")", "[", "]": // preserve
                    separator = $0; return true
                case " ", "\n", ",", ".":          // omit
                    separator = " "; return true
                default:
                    return false
            }
        }
    )
    if tokens[0] != ""  { 
        output.append(String(tokens[0])) 
    }
    guard tokens.count == 2 else { break }
    if separator != " " { 
        output.append(String(separator)) 
    }
    input = String(tokens[1])
} while true

for token in output { print("\(token)") }

In the code above, the delimiters are not stored in an actual Set. I didn't need that, but if you do, just add these declarations:

let preservedDelimiters: Set<Character> = [ "{", "}", "(", ")", "[", "]" ]
let omittedDelimiters: Set<Character> = [ " ", "\n", ",", "." ]

and replace the whereSeparator closure with:

whereSeparator: {
    if preservedDelimiters.contains($0) {
        separator = $0
        return true
    } else if omittedDelimiters.contains($0) {
        separator = " "
        return true
    } else {
        return false
    }
}
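
On the question of avoiding the String and Substring conversions: one possible approach, offered only as a sketch, is to walk the string's indices once and collect Substring slices of the original string. The function name tokenize and its parameter labels are hypothetical, and the sample input is an assumption.

```swift
/// Tokenizes `text` in one pass: delimiters in `preserved` become tokens of
/// their own, delimiters in `omitted` are dropped, and every other run of
/// characters is returned as a single token.
func tokenize(_ text: String,
              preserving preserved: Set<Character>,
              omitting omitted: Set<Character>) -> [Substring] {
    var tokens: [Substring] = []
    var tokenStart = text.startIndex
    for i in text.indices {
        let character = text[i]
        let isPreserved = preserved.contains(character)
        guard isPreserved || omitted.contains(character) else { continue }
        // Flush the run of ordinary characters that ends at this delimiter.
        if tokenStart < i { tokens.append(text[tokenStart..<i]) }
        // A preserved delimiter becomes a one-character token.
        if isPreserved { tokens.append(text[i...i]) }
        tokenStart = text.index(after: i)
    }
    // Keep any trailing run after the last delimiter.
    if tokenStart < text.endIndex { tokens.append(text[tokenStart...]) }
    return tokens
}

let tokens = tokenize("like tears [in rain].",
                      preserving: ["{", "}", "(", ")", "[", "]"],
                      omitting: [" ", "\n", ",", "."])

print(tokens)  // ["like", "tears", "[", "in", "rain", "]"]
```

The slices share storage with the input string, so nothing is copied until a token is explicitly converted with String(token).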
