【Posted】:2016-05-17 07:49:26
【Question】:
Running
val animals = sc.parallelize(List("cat", "dog", "tiger", "lion", "gnu", "crocodile", "ant", "whale", "dolphin", "spider"), 3)
animals.foreachPartition(x => println(x.mkString(", ") + " are animals"))
in spark-shell returns
lion, gnu, crocodile are animals
cat, dog, tiger are animals
ant, whale, dolphin, spider are animals
But if I run it in Jupyter with the Apache Toree Spark kernel, I get no output at all. The terminal from which I started Jupyter prints
animals: org.apache.spark.rdd.RDD[String] = ParallelCollectionRDD[27] at parallelize at <console>:20
16/05/17 09:33:32 [WARN] o.a.t.k.p.v.s.KernelOutputStream - Suppressing empty output: ''
How can I get Jupyter to print the animals with foreach the same way spark-shell does?
【Comments】:
Tags: scala apache-spark jupyter