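[Posted]: 2020-03-15 19:22:26
[Problem description]:
I want to perform a lookup between a Map[String,List[scala.util.matching.Regex]] and a DataFrame column. If any regex in one of the List[scala.util.matching.Regex] values matches the column value, the lookup should return the corresponding key from the Map[String,List[scala.util.matching.Regex]].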
Map[String,List[scala.util.matching.Regex]] = Map(m1 -> List(rule1, rule2), m2 -> List(rule3), m3 -> List(rule6))
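I want to iterate over each list of regexes and match them against the DataFrame column value. It would be better if the regex matching could run in parallel rather than sequentially.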
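The lookup described above can be sketched in plain Scala, before any Spark is involved. This is only a minimal illustration: the object name MerchantLookup, the function name findMerchant, and the literal patterns "rule1" ... "rule6" are assumptions standing in for the real rules, and a match is taken to mean "the regex is found anywhere in the string".

```scala
import scala.util.matching.Regex

object MerchantLookup {
  // Sample rules mirroring the question; the literal patterns are placeholders.
  val rules: Map[String, List[Regex]] = Map(
    "m1" -> List("rule1".r, "rule2".r),
    "m2" -> List("rule3".r),
    "m3" -> List("rule6".r)
  )

  // Returns the first key whose list contains a regex found anywhere in `desc`,
  // or None when no regex matches.
  def findMerchant(rules: Map[String, List[Regex]], desc: String): Option[String] =
    rules.collectFirst {
      case (merchant, patterns) if patterns.exists(_.findFirstIn(desc).isDefined) => merchant
    }

  def main(args: Array[String]): Unit = {
    println(findMerchant(rules, "STRING MATCHES SSS rule2")) // Some(m1)
    println(findMerchant(rules, "no rule here"))             // None
  }
}
```

If one description could match rules from more than one key, the key returned depends on the Map's iteration order, so an ordered collection such as a Seq of pairs may be safer in that case. On the parallelism point: Spark already processes rows in parallel across partitions, so checking the regexes sequentially within a single row is usually not the bottleneck.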
Input DataFrame:
+------------------------+
|desc |
+------------------------+
|STRING MATCHES SSS rule1|
|STRING MATCHES SSS rule1|
|STRING MATCHES SSS rule1|
|STRING MATCHES SSS rule2|
|STRING MATCHES SSS rule2|
|STRING MATCHES SSS rule3|
|STRING MATCHES SSS rule3|
|STRING MATCHES SSS rule6|
+------------------------+
Expected output:
+-------------------+------------------------+
|merchant |desc |
+-------------------+------------------------+
|m1 |STRING MATCHES SSS rule1|
|m1 |STRING MATCHES SSS rule1|
|m1 |STRING MATCHES SSS rule1|
|m1 |STRING MATCHES SSS rule2|
|m1 |STRING MATCHES SSS rule2|
|m2 |STRING MATCHES SSS rule3|
|m2 |STRING MATCHES SSS rule3|
|m3 |STRING MATCHES SSS rule6|
+-------------------+------------------------+
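One possible way to get from the input to the expected output is sketched below, using only built-in column functions (rlike, when, coalesce) instead of a UDF. It is not the accepted answer, just an illustration under assumptions: the names MerchantTagging, combined and merchantColumn are hypothetical, the patterns "rule1" ... "rule6" are placeholders, the rules map is non-empty, and the patterns are valid Java regexes (which is what both scala.util.matching.Regex and Spark's rlike use).

```scala
import org.apache.spark.sql.{Column, SparkSession}
import org.apache.spark.sql.functions.{coalesce, col, lit, when}
import scala.util.matching.Regex

object MerchantTagging {
  // Hypothetical helper: folds one merchant's rules into a single alternation
  // pattern, so each merchant costs one regex evaluation per row.
  def combined(rules: List[Regex]): String =
    rules.map(r => s"(?:${r.regex})").mkString("|")

  // Hypothetical helper: yields the first merchant whose combined pattern is
  // found in `source`, or null when nothing matches. Assumes `rules` is non-empty.
  def merchantColumn(rules: Map[String, List[Regex]], source: Column): Column =
    coalesce(rules.toSeq.map { case (merchant, rs) =>
      when(source.rlike(combined(rs)), lit(merchant))
    }: _*)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("merchant-lookup")
      .getOrCreate()
    import spark.implicits._

    // Placeholder rules and rows mirroring the question.
    val rules = Map(
      "m1" -> List("rule1".r, "rule2".r),
      "m2" -> List("rule3".r),
      "m3" -> List("rule6".r)
    )
    val df = Seq(
      "STRING MATCHES SSS rule1",
      "STRING MATCHES SSS rule2",
      "STRING MATCHES SSS rule3",
      "STRING MATCHES SSS rule6"
    ).toDF("desc")

    df.select(merchantColumn(rules, col("desc")).as("merchant"), col("desc"))
      .show(false)

    spark.stop()
  }
}
```

Because the expression is built from native column functions, Spark evaluates it per row inside each partition, so the matching is already distributed across executors. Rows that match no rule get a null merchant; filter them out afterwards if that is not wanted.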
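[Discussion]:
-
Please provide sample data and the expected output so the problem can be understood clearly.
-
@Nikk, I have updated the question with the data and the expected O/P.
-
Thanks, I will look into it and get back to you with a solution as soon as I can.
-
Did that solve your problem?
Tags: scala apache-spark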