【Question title】: PHP nearest string comparison [duplicate]
【Posted】: 2013-01-03 11:37:33
【Question】:

Possible duplicate:
String similarity in PHP: levenshtein like function for long strings

I have my subject string:

$subj = "Director, My Company";

and a list of several strings to compare it against:

$str1 = "Foo bar";
$str2 = "Lorem Ipsum";
$str3 = "Director";

What I want to achieve here is to find the string that is nearest to $subj. Is that possible?

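【Question comments】:

  • Please define "nearest".
  • @Kagat-Kagat: What do you mean by "nearest"?
  • Hi, in my example the nearest string is $str3; it is the one closest to $subj.
  • Got it working. I didn't know about levenshtein(). Thanks!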

Tags: php string-comparison similarity


【Solution 1】:

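The levenshtein() function does what you are looking for. The Levenshtein algorithm counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. The result is called the edit distance, and you can use it to rank candidate strings by how close they are to your subject.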
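To make the idea of edit distance concrete, here is a minimal sketch that calls the built-in levenshtein() on a few pairs. The "kitten"/"sitting" pair is an illustrative choice that does not come from the question; the variable names are assumptions for this sketch only.

```php
<?php
// Worked examples of edit distance using PHP's built-in levenshtein().

// "kitten" -> "sitting": substitute k->s, substitute e->i, insert g (3 edits).
$kitten = levenshtein('kitten', 'sitting');

// "Director" is a prefix of the subject, so the only edits needed are the
// 12 insertions that append ", My Company".
$director = levenshtein('Director', 'Director, My Company');

// Identical strings need no edits at all.
$same = levenshtein('Director', 'Director');

echo "kitten -> sitting: $kitten\n";      // 3
echo "Director -> subject: $director\n";  // 12
echo "identical strings: $same\n";        // 0
```

A smaller number means the two strings are closer, so the nearest candidate is simply the one with the lowest distance.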

The following example is adapted from the PHP documentation for the levenshtein() function.

<?php

$input = 'Director, My Company';

// array of words to check against
$words  = array('Foo bar','Lorem Ipsum','Director');

// no shortest distance found, yet
$shortest = -1;

// loop through words to find the closest
foreach ($words as $word) {

    // calculate the distance between the input word,
    // and the current word
    $lev = levenshtein($input, $word);

    // check for an exact match
    if ($lev == 0) {

        // closest word is this one (exact match)
        $closest = $word;
        $shortest = 0;

        // break out of the loop; we've found an exact match
        break;
    }

    // if this distance is no greater than the shortest found so far,
    // OR if no shortest distance has been recorded yet
    if ($lev <= $shortest || $shortest < 0) {
        // set the closest match, and shortest distance
        $closest  = $word;
        $shortest = $lev;
    }
}

echo "Input word: $input\n";
if ($shortest == 0) {
    echo "Exact match found: $closest\n";
} else {
    echo "Did you mean: $closest?\n";
}

The script outputs:

Input word: Director, My Company
Did you mean: Director?
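If it helps to see why "Director" wins, the sketch below prints the distance for every candidate, sorted from closest to farthest. Note that rankByDistance() is a hypothetical helper written for this illustration, not a PHP built-in, and array_key_first() assumes PHP 7.3 or later.

```php
<?php
// Hypothetical helper (not part of PHP): maps each candidate string to its
// edit distance from $subject, sorted from closest to farthest.
// Caveat: duplicate candidates collapse into one key, and purely numeric
// strings would be converted to integer keys by PHP.
function rankByDistance(string $subject, array $candidates): array
{
    $distances = [];
    foreach ($candidates as $candidate) {
        $distances[$candidate] = levenshtein($subject, $candidate);
    }
    asort($distances); // sort by distance while keeping the string keys
    return $distances;
}

$ranked = rankByDistance(
    'Director, My Company',
    ['Foo bar', 'Lorem Ipsum', 'Director']
);

// Print every candidate with its distance, closest first.
foreach ($ranked as $candidate => $distance) {
    echo "$candidate: $distance\n";
}

// The first key of the sorted array is the nearest string.
echo 'Closest: ' . array_key_first($ranked) . "\n";
```

Keeping the full ranking rather than only the minimum also makes it easy to apply a cutoff, for example rejecting every candidate whose distance is above some threshold you choose.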

Good luck!

【Comments】:

    【Solution 2】:

    You can use http://php.net/manual/en/function.levenshtein.php to determine the distance between two strings.

    $subj = "Director, My Company";
    $str = array();
    $str[] = "Foo bar";
    $str[] = "Lorem Ipsum";
    $str[] = "Director";
    
    $minStr = "";
    $minDis = PHP_INT_MAX;
    foreach ($str as $curStr) {
      $dis = levenshtein($subj, $curStr);
      if ($dis < $minDis) {
        $minDis = $dis;
        $minStr = $curStr;
      }
    }
    echo($minStr);
    

    【Comments】:

    • This is more of a comment than an answer. Please also check for duplicate or similar questions first.