How to search for an array of strings in another string in PHP? Answers

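【Question title】: How to search array of string in another string in PHP?
【Posted】: 2011-06-03 14:44:35
【Question】:

First of all, what I need is the reverse of the PHP in_array function.

I need to search the string for all the items of an array; if any item is found, the function should return true, otherwise false.

I need the fastest solution to this problem. Of course, this can be done by iterating over the array and using the strpos function.

Any suggestions are welcome.

Example data: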

$string = 'Alice goes to school every day';

$searchWords = array('basket','school','tree');

returns true

$string = 'Alice goes to school every day';

$searchWords = array('basket','cat','tree');

returns false
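
For reference, the baseline mentioned in the question (looping over the array and calling strpos) might look like the sketch below. The function name containsAny is made up for illustration. The check uses !== false because strpos returns 0 for a match at the very start of the string, and empty search words are skipped as an extra assumption.

```php
<?php
// Hypothetical helper: returns true as soon as any search word occurs in $string.
function containsAny($string, array $searchWords)
{
    foreach ($searchWords as $word) {
        // strpos() returns the match offset (possibly 0) or false when not found.
        if ($word !== '' && strpos($string, $word) !== false) {
            return true;
        }
    }
    return false;
}

$string = 'Alice goes to school every day';
var_dump(containsAny($string, array('basket', 'school', 'tree'))); // bool(true)
var_dump(containsAny($string, array('basket', 'cat', 'tree')));    // bool(false)
```

This does substring matching, so 'school' would also be found inside 'preschool'.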

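【Comments】:

Well, I don't think you'll get any faster than strpos().
I disagree with you, @Erisco. A regular expression would do this and be faster. I just don't know them very well.
I hadn't checked malko's answer before posting my initial comment.
@afaolek, I believe it depends heavily on the number of search words. For small numbers, I doubt regex will win, unless the string being searched becomes very large and the number of search words is greater than one.

Tags: php arrays string

【Solution 1】:

You should try preg_match: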

if (preg_match('/' . implode('|', $searchWords) . '/', $string)) return true;

After some comments, a properly escaped solution:

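function contains($string, array $search, $caseInsensitive = false) {
    // Quote every word and escape the "/" delimiter along with the regex metacharacters.
    $quoted = array_map(function ($word) { return preg_quote($word, '/'); }, $search);
    $exp = '/' . implode('|', $quoted) . ($caseInsensitive ? '/i' : '/');
    return preg_match($exp, $string) === 1;
}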

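【Comments】:

$searchWords should be properly escaped.
I won't downvote you because it is a valid answer, but I do think preg_match is pointless when you have functions like strstr and stristr.
@binaryLV: yes, it was just a quick idea pointing in the right direction, and it works with the sample data in the question, but a more robust solution must escape properly. @RobertPitt: strstr and stristr won't test multiple strings at once, or am I missing something? If we end up calling them in a loop, I think preg_match would be more efficient.
Iterating over the values (which may mean iterating over a large number of values) can be very inefficient. Using a regular expression may be faster than that. But, as always, only profiling will reveal which approach is faster.
@RobertPitt So you're saying that using foreach and if is always faster than matching a regular expression? I'd really like to see some benchmarks, because I can't see that happening.

【Solution 2】: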

function searchWords($string,$words)
{
    foreach($words as $word)
    {
        if(stristr(" " . $string . " ", " " . $word . " ") !== false) //spaces either side to force a word; the string is padded so the first and last words match too
        {
            return true;
        }
    }
    return false;
}

Usage:

$string = 'Alice goes to school every day';
$searchWords = array('basket','cat','tree');

if(searchWords($string,$searchWords))
{
     //matches
}

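Also note that the function stristr is used to make the search case-insensitive.

【Comments】:

You could also specify that $words must be an array, i.e. function searchWords($string, array $words) { /* ... */ }
No, that is only supported in very recent PHP versions and, in my opinion, would just confuse people.
Array type hints were introduced in PHP 5.1.0, released on 24 November 2005. I wouldn't call that a "very recent version" in 2011.
Hmm, that's a good point; I must have been confusing it with something else along those lines. Sorry.

【Solution 3】:

Following malko's example, but with the values properly escaped.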

function contains( $string, array $search ) {
    return 0 !== preg_match( 
        '/' . implode( '|', preg_quote( $search, '/' ) ) . '/', 
        $string 
    );
}

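【Comments】:

Isn't this what @malko just posted? :/
@Robert Yes, but this one properly escapes the searched strings and is more reliable. To make a "perfect" solution we could add a third parameter, $caseInsensitive = false by default, which appends an 'i' to the end of the regular expression to allow case-insensitive searching, i.e.: '/' . implode('|', preg_quote($search, '/')) . '/' . ($caseInsensitive ? 'i' : '')
Actually this won't work, because preg_quote doesn't accept an array as its argument. See my post; I have edited your solution slightly.
Yes, it should be implode('|', array_map(function($e){ return preg_quote($e, '/'); }, $search))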
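
Putting the last two comments together, a working variant could look like the sketch below. It assumes PHP 5.3 or later (for the anonymous function). The name containsQuoted is hypothetical, chosen only to avoid clashing with the contains function above, and the early return for an empty array is an added assumption, since an empty alternation would match every string.

```php
<?php
// Sketch: quote each word individually, passing '/' so the delimiter is escaped too.
function containsQuoted($string, array $search, $caseInsensitive = false)
{
    if (count($search) === 0) {
        return false; // an empty pattern would match any string
    }
    $quoted = array_map(function ($word) {
        return preg_quote($word, '/');
    }, $search);
    $pattern = '/' . implode('|', $quoted) . '/' . ($caseInsensitive ? 'i' : '');
    return preg_match($pattern, $string) === 1;
}

var_dump(containsQuoted('Alice goes to school every day', array('basket', 'school', 'tree'))); // bool(true)
var_dump(containsQuoted('price is 3/4 of a.b', array('a/b', '3/4')));                          // bool(true)
var_dump(containsQuoted('axb', array('a.b')));                                                 // bool(false)
```

The last call shows why quoting matters: without it, the dot in 'a.b' would act as a wildcard and match 'axb'.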

【Solution 4】:

If the string can be split on spaces, the following will work:

var_dump(array_intersect(explode(' ', $str), $searchWords) != null);

Output for the two examples you provided:

bool(true)
bool(false)

Update:

If the string cannot be split on the space character, use code like this to split the string at any word boundary:

var_dump(array_intersect(preg_split('~\b~', $str), $searchWords) != null);

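【Comments】:

In my case it is sometimes not possible to split on spaces, but thanks, this is a cool approach.
@WebolizeR In that case you can use var_dump(array_intersect(preg_split('~\b~', $str), $searchWords) != null); to split the original string at any word boundary, not just at spaces.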
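
The word-based check from this answer can be wrapped in a function, as in the sketch below. It keeps the ~\b~ split from the update, adds PREG_SPLIT_NO_EMPTY to drop empty tokens, and tests the intersection with count() instead of != null. The name containsAnyWord is hypothetical. Unlike the strpos and regex answers, this matches whole words only.

```php
<?php
// Hypothetical wrapper around the array_intersect approach.
function containsAnyWord($str, array $searchWords)
{
    // Split at every word boundary; tokens are the words plus the separators between them.
    $tokens = preg_split('~\b~', $str, -1, PREG_SPLIT_NO_EMPTY);
    return count(array_intersect($tokens, $searchWords)) > 0;
}

var_dump(containsAnyWord('Alice goes to school, every day', array('basket', 'school', 'tree'))); // bool(true)
var_dump(containsAnyWord('Alice goes to school, every day', array('basket', 'cat', 'tree')));    // bool(false)
var_dump(containsAnyWord('Alice goes to preschool', array('school')));                           // bool(false)
```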

【Solution 5】:

Try this:

$string = 'Alice goes to school every day';
$words = explode(" ", $string); // split() was removed in PHP 7
$searchWords = array('basket','school','tree');

for($x = 0,$l = count($words); $x < $l;) {
        if(in_array($words[$x++], $searchWords)) {
                //....
        }
}

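【Comments】:

【Solution 6】:

There is always debate about what is faster, so I thought I would run some tests using different methods.

Tests run:

strpos
preg_match with a foreach loop
preg_match with a regex OR
index search with the string still to be exploded
index search as an array (string already exploded)

Two sets of tests were run: one on a large text document (114,350 words) and one on a small text document (120 words). Within each set, every test was run 100 times and the results were averaged. The tests did not ignore case, which makes them faster. The index-search tests used a pre-built index. I wrote the indexing code myself and I'm sure it is less efficient than it could be; indexing the large file took 17.92 seconds and the small file 0.001 seconds.

The search terms were: gazerbeam (not found in the document), legal (found in the document), and target (not found in the document).

Results, in seconds to complete a single test, sorted by speed:

Large file: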

0.0000455808639526 (index without explode)
0.0009979915618897 (preg_match using regex OR)
0.0011657214164734 (strpos)
0.0023632574081421 (preg_match using a foreach loop)
0.0051533532142639 (index with explode)

Small file:

0.000003724098205566 (strpos)
0.000005958080291748 (preg_match using regex OR)
0.000012607574462891 (preg_match using a foreach loop)
0.000021204948425293 (index without explode)
0.000060625076293945 (index with explode)

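Note that for the small file strpos is faster than preg_match (using regex OR), but for the large file it is slower. Other factors, such as the number of search terms, will of course affect this.

Algorithms used: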

//strpos
$str = file_get_contents('text.txt');
$t = microtime(true);
foreach ($search as $word) if (strpos($str, $word) !== false) break; // !== false so a match at offset 0 counts
$strpos += microtime(true) - $t;

//preg_match
$str = file_get_contents('text.txt');
$t = microtime(true);
foreach ($search as $word) if (preg_match('/' . preg_quote($word) . '/', $str)) break;
$pregmatch += microtime(true) - $t;

//preg_match (regex or)
$str = file_get_contents('text.txt');
$orstr = preg_quote(implode('|', $search));
$t = microtime(true);
if (preg_match('/' . $orstr . '/', $str)) {}
$pregmatchor += microtime(true) - $t;

//index with explode
$str = file_get_contents('textindex.txt');
$t = microtime(true);
$ar = explode(" ", $str);
foreach ($search as $word) {
    $start = 0; 
    $end = count($ar);
    do {
        $diff = $end - $start;
        $pos = floor($diff / 2) + $start;
        $temp = $ar[$pos];
        if ($word < $temp) {
            $end = $pos;
        } elseif ($word > $temp) {
            $start = $pos + 1;
        } elseif ($temp == $word) {
            $found = 'true';
            break;
        }
    } while ($diff > 0);
}
$indexwith += microtime(true) - $t;

//index without explode (already in array)
$str = file_get_contents('textindex.txt');
$found = 'false';
$ar = explode(" ", $str);
$t = microtime(true);
foreach ($search as $word) {
    $start = 0; 
    $end = count($ar);
    do {
        $diff = $end - $start;
        $pos = floor($diff / 2) + $start;
        $temp = $ar[$pos];
        if ($word < $temp) {
            $end = $pos;
        } elseif ($word > $temp) {
            $start = $pos + 1;
        } elseif ($temp == $word) {
            $found = 'true';
            break;
        }
    } while ($diff > 0);
}
$indexwithout += microtime(true) - $t;
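
The indexing code itself is not shown in this answer. As a rough illustration only, an index of the kind the two binary-search loops expect (a sorted list of unique words) could be built and queried as sketched below. Everything here is an assumption: the function names buildIndex and indexContains are hypothetical, words are split on non-word characters, and comparison uses strcmp with a matching SORT_STRING sort rather than the < and > operators used above. It assumes PHP 7 or later for intdiv.

```php
<?php
// Hypothetical index builder: sorted list of the unique words in $text.
function buildIndex($text)
{
    $words = preg_split('/\W+/', $text, -1, PREG_SPLIT_NO_EMPTY);
    $words = array_values(array_unique($words));
    sort($words, SORT_STRING); // byte-wise order, consistent with strcmp()
    return $words;
}

// Binary search over the sorted index.
function indexContains(array $index, $word)
{
    $lo = 0;
    $hi = count($index) - 1;
    while ($lo <= $hi) {
        $mid = intdiv($lo + $hi, 2);
        $cmp = strcmp($word, $index[$mid]);
        if ($cmp === 0) {
            return true;
        }
        if ($cmp < 0) {
            $hi = $mid - 1;
        } else {
            $lo = $mid + 1;
        }
    }
    return false;
}

$index = buildIndex('Alice goes to school every day');
var_dump(indexContains($index, 'school')); // bool(true)
var_dump(indexContains($index, 'cat'));    // bool(false)
```

Using one comparison function for both sorting and searching avoids surprises with numeric-looking strings, which the < and > operators compare as numbers.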

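【Comments】:

You forgot to pass $word through preg_quote(). You also did not test preg_match() without the foreach loop (by passing the list of quoted "words" as word1|word2|word3).
Good point. I have taken your suggestion and updated the results above.
$orstr = preg_quote(implode('|', $search)); looks wrong. Each word should be quoted rather than the whole pattern. Replace it with $orstr = implode('|', array_map('preg_quote', $search)); $orstr = str_replace('/', '\\/', $orstr);. The second operation is needed to escape the delimiter. It would also be worth running some tests with different sets of words (e.g. with 2 words and with 5 words).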
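
With the fix suggested in the last comment, the regex-OR test and the strpos test could be timed roughly as in the sketch below. The helper names (buildOrPattern, anyByStrpos, anyByRegex, timeIt), the generated sample text and the run count are all assumptions for illustration. Passing '/' to preg_quote replaces the separate str_replace step. Timings vary by machine, so none are shown.

```php
<?php
// Build word1|word2|... with each word quoted separately, delimiter included.
function buildOrPattern(array $search)
{
    $quoted = array_map(function ($w) { return preg_quote($w, '/'); }, $search);
    return '/' . implode('|', $quoted) . '/';
}

function anyByStrpos($str, array $search)
{
    foreach ($search as $word) {
        if (strpos($str, $word) !== false) {
            return true;
        }
    }
    return false;
}

function anyByRegex($str, $pattern)
{
    return preg_match($pattern, $str) === 1;
}

// Average wall-clock seconds per call over $runs calls.
function timeIt($fn, $runs = 100)
{
    $t = microtime(true);
    for ($i = 0; $i < $runs; $i++) {
        $fn();
    }
    return (microtime(true) - $t) / $runs;
}

$search  = array('gazerbeam', 'legal', 'target');
$str     = str_repeat('lorem ipsum dolor sit amet ', 5000) . 'legal';
$pattern = buildOrPattern($search);

printf("strpos:   %.10f s\n", timeIt(function () use ($str, $search) { anyByStrpos($str, $search); }));
printf("regex OR: %.10f s\n", timeIt(function () use ($str, $pattern) { anyByRegex($str, $pattern); }));
```

Changing the size of $search and of $str is an easy way to try the different word counts the comment asks about.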

【Solution 7】:

The following returns the number of elements from the array that are found in the string:

function inString($str, $arr, $matches=false)
    {
        $str = explode(" ", $str);
        $c = 0;
        for($i = 0; $i<count($str); $i++)
        {
            if(in_array($str[$i], $arr) )
            {$c++;if($matches == false)break;}
        }
        return $c;
    }

【Comments】:

【Solution 8】:

The following link will help you; you just need to customize it as needed.

Check if array element exists in string

Customization:

function result_arrayInString($prdterms, $value){
  if(arrayInString($prdterms, $value)){ // arrayInString() comes from the linked answer and is not defined here
      return true;
  }else{
     return false;
  }
}

This may help you.

【Comments】: