Array of substrings in array of strings: answers

【Question title】: Array of substrings in array of strings
【Posted】: 2019-06-16 09:29:30
【Question】:

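I have two arrays of strings. Strings in one array may be substrings of strings in the other. I need to find every string in one array that is a substring of some string in the other array.

Example: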

arr1 = ["firestorm", "peanut", "earthworm"]
arr2 = ["fire", "tree", "worm", "rest"]

Result:

res = ["fire","worm", "rest"]

My solution is below, but it takes a lot of time, and I have to process thousands of words.

Solution:

res = []
arr1.each do |word1|
  arr2.each do |word2|
    if word1.include? word2
      res << word2
    end
  end
end

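Please suggest a faster way to do this.

【Comments】:

【Solution 1】:

Unfortunately, we don't know your solution.

However, an Array takes up more memory than a String, so you can convert it into one.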

arr1 = ["firestorm", "peanut", "earthworm"]
arr2 = ["fire", "tree", "worm", "rest"]

arr1 = arr1.join(',')

Then

res = arr2.select { |word| arr1.include?(word) } #=> ["fire", "worm", "rest"]

or

res = arr2.select { |word| arr1.match?(word) } #=> ["fire", "worm", "rest"]

or

res = arr2.select { |word| arr1.match(word) } #=> ["fire", "worm", "rest"]
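
For comparison, here is a minimal sketch that skips the `join` step and checks each candidate against the original array with `any?`, which stops at the first string that contains the word. The helper name `substrings_of` is hypothetical, and this answer does not measure whether it is faster than the joined-string version for thousands of words.

```ruby
# Hypothetical helper: returns the words that occur inside at least one string.
def substrings_of(words, strings)
  # any? stops scanning strings as soon as one of them contains the word,
  # so each word from words appears in the result at most once per entry.
  words.select { |word| strings.any? { |s| s.include?(word) } }
end

arr1 = ["firestorm", "peanut", "earthworm"]
arr2 = ["fire", "tree", "worm", "rest"]

res = substrings_of(arr2, arr1)
p res  # => ["fire", "worm", "rest"]
```

Note that `include?` compares the word literally, whereas `match?` and `match` with a String argument convert it to a regular expression, so a word such as `a.c` would also match `abc` there.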

【Comments】:

【Solution 2】:

As far as I can tell, because the terms overlap, you need to brute-force it:

def matched(find, list)
  list.flat_map { |e| find.flat_map { |f| e.scan(f) } }.uniq
end

In practice:

matched(%w[ fire tree worm rest ], %w[ firestorm peanut earthworm ])
# => ["fire", "rest", "worm"]

Here `%w` is used as a shorthand for writing the lists.

Here is an approximation using `scan` and `flat_map`:

def matched(find, list)
  rx = Regexp.union(find)

  list.flat_map { |e| e.scan(rx) }.uniq
end

With `Regexp.union` you can build a single regular expression that runs quite fast compared with testing each term individually.
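
To make concrete what `Regexp.union` builds from the list of terms, a short illustration follows. The variable names are only for this example.

```ruby
find = %w[fire tree worm rest]

# One alternation that matches any of the terms.
rx = Regexp.union(find)
p rx  # => /fire|tree|worm|rest/

# String terms are escaped, so they are matched literally.
p Regexp.union(["a.b", "c"])  # => /a\.b|c/
```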

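Where it is inaccurate (`scan` does not return overlapping matches, so `rest` is missed because it overlaps `fire` inside `firestorm`):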

matched(%w[ fire tree worm rest ], %w[ firestorm peanut earthworm ])
# => ["fire", "worm"]
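
One possible way to keep the single regular expression and still pick up overlapping terms is to wrap the union in a lookahead, so that each match consumes no characters. This is a sketch rather than part of the original answer, and the name `matched_overlapping` is hypothetical.

```ruby
# Hypothetical variant of matched that also reports overlapping terms.
def matched_overlapping(find, list)
  # Wrapping the union in a lookahead makes every match zero-width, so
  # scan moves forward one character at a time instead of skipping past
  # the text it has just matched. The capture group records the term.
  rx = /(?=(#{Regexp.union(find)}))/

  # With a capture group, scan returns an array of one-element arrays,
  # hence the flatten.
  list.flat_map { |e| e.scan(rx).flatten }.uniq
end

p matched_overlapping(%w[fire tree worm rest], %w[firestorm peanut earthworm])
# => ["fire", "rest", "worm"]
```

A remaining limitation: when two terms can start at the same position (for example `fire` and `firestorm`), the alternation reports only the first one that matches there, so the brute-force version is still the safe choice if terms can be prefixes of one another.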

【Comments】: