Transforming an iterator into an iterator of chunks of duplicates: answers

【Question title】: Transforming an iterator into an iterator of chunks of duplicates
【Posted】: 2020-05-18 09:52:50
【Question】:

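Suppose I am writing a function foo: Iterator[A] => Iterator[List[A]] that transforms a given iterator into an iterator of chunks of duplicates, where each chunk is a run of consecutive equal elements: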

def foo[A](it: Iterator[A]): Iterator[List[A]] = ???
foo("abbbcbbe".iterator).toList.map(_.mkString) // List("a", "bbb", "c", "bb", "e")

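To implement foo, I want to reuse the function splitDupes: Iterator[A] => (List[A], Iterator[A]), which splits an iterator into a prefix of duplicates and the rest (many thanks to Kolmar for suggesting it):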

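def splitDupes[A](it: Iterator[A]): (List[A], Iterator[A]) = {
  if (it.isEmpty) {
    // Nothing left: empty prefix, empty remainder
    (Nil, Iterator.empty)
  } else {
    val head = it.next()
    // span yields the elements equal to head and an iterator over what follows
    val (dupes, rest) = it.span(_ == head)
    // dupes is materialized here, so rest can be consumed safely afterwards
    (head +: dupes.toList, rest)
  }
}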
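
To make the contract of splitDupes concrete, here is a small usage sketch. It restates the function so the snippet runs on its own; the wrapper object name SplitDupesDemo and the sample input "aabcc" are assumptions made for illustration and are not part of the original question.

```scala
object SplitDupesDemo {
  // Same logic as splitDupes above, repeated so the snippet is self-contained.
  def splitDupes[A](it: Iterator[A]): (List[A], Iterator[A]) =
    if (it.isEmpty) (Nil, Iterator.empty)
    else {
      val head = it.next()
      val (dupes, rest) = it.span(_ == head)
      (head +: dupes.toList, rest)
    }

  def main(args: Array[String]): Unit = {
    val (prefix, rest) = splitDupes("aabcc".iterator)
    println(prefix.mkString) // the leading run of equal elements: aa
    println(rest.mkString)   // everything after that run: bcc
  }
}
```

Note that the original iterator should not be touched again after the call; only the returned prefix and remainder are safe to use.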

Now I write foo using splitDupes like this:

def foo[A](it: Iterator[A]): Iterator[List[A]] = {
   if (it.isEmpty) {
     Iterator.empty
   } else {
     val (xs, ys) = Iterator.iterate(splitDupes(it))(x => splitDupes(x._2)).span(_._2.nonEmpty)
     (if (ys.hasNext) xs ++ Iterator(ys.next()) else xs).map(_._1)
   }
}

This implementation seems to work, but it looks complicated and clumsy.
How would you improve the foo implementation above?

【Comments】:

Tags: scala iterator

【Solution 1】:

You can do it like this:

def foo[A](it: Iterator[A]): Iterator[List[A]] = {
  Iterator.iterate(splitDupes(it))(x => splitDupes(x._2))
    .map(_._1)
    .takeWhile(_.nonEmpty)
}

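The empty case is already handled in splitDupes. You can safely keep calling splitDupes until it reaches that empty case (that is, until it starts returning Nil as the first element of the tuple).

This works correctly in all of the following cases: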

scala> foo("abbbcbbe".iterator).toList.map(_.mkString)
res1: List[String] = List(a, bbb, c, bb, e)

scala> foo("".iterator).toList.map(_.mkString)
res2: List[String] = List()

scala> foo("a".iterator).toList.map(_.mkString)
res3: List[String] = List(a) 

scala> foo("aaa".iterator).toList.map(_.mkString)
res4: List[String] = List(aaa)

scala> foo("abc".iterator).toList.map(_.mkString)
res5: List[String] = List(a, b, c)
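
The stopping rule described above (keep splitting until splitDupes returns Nil) can also be written with Iterator.unfold, which is available from Scala 2.13 onward. The following is only a sketch of that alternative, not the answer's own code; the names UnfoldChunksDemo and chunksOfDupes are hypothetical, and splitDupes is repeated so the snippet is self-contained.

```scala
object UnfoldChunksDemo {
  // Same logic as splitDupes from the question.
  def splitDupes[A](it: Iterator[A]): (List[A], Iterator[A]) =
    if (it.isEmpty) (Nil, Iterator.empty)
    else {
      val head = it.next()
      val (dupes, rest) = it.span(_ == head)
      (head +: dupes.toList, rest)
    }

  // Hypothetical alternative to foo: the state is the remaining iterator,
  // and returning None ends the sequence once no chunk is left.
  def chunksOfDupes[A](it: Iterator[A]): Iterator[List[A]] =
    Iterator.unfold(it) { remaining =>
      val (dupes, rest) = splitDupes(remaining)
      if (dupes.isEmpty) None else Some((dupes, rest))
    }

  def main(args: Array[String]): Unit = {
    // Prints List(a, bbb, c, bb, e)
    println(chunksOfDupes("abbbcbbe".iterator).map(_.mkString).toList)
    // Chunks are produced on demand, so an endless input is fine
    // as long as only finitely many chunks are requested.
    println(chunksOfDupes(Iterator.from(0).map(_ / 3)).take(3).toList)
  }
}
```

Compared with iterate followed by map and takeWhile, unfold keeps the termination condition next to the step that produces each chunk, at the cost of requiring Scala 2.13 or later.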

【Comments】: