String.IndexOf() returns unexpected index of string

【Question title】: String.IndexOf() returns unexpected index of string
【Posted】: 2019-05-31 04:58:17
【Question】:

The String.IndexOf() method is not behaving as I expected.

I expected it not to find a match, because the exact word you is not in str.

string str = "I am your Friend";
int index = str.IndexOf("you",0,StringComparison.OrdinalIgnoreCase);
Console.WriteLine(index);

Output: 5

My expected result was -1, because the string does not contain you.

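【Comments】:

You are asking for the position of the character sequence "you" in the string, not the position of the word "you" in the string. Since "your" starts with "you", we can conclude that the character sequence "you" is in the string. The documentation states: "Reports the zero-based index of the first occurrence of the specified Unicode character or string within this instance. The method returns -1 if the character or string is not found in this instance."
@bolkay Contains() would also conclude that "you" is in the string "I am your Friend", though.
I am your Friend... looks to me like it is in there. If you need word boundaries, use a regular expression, or hack it by padding the search string with a space on the left and right (but that may cause more problems).
I suspect what you want is string.Split to split the string into words, and then string.Compare rather than string.IndexOf.
If you want to keep most of your code, you can search for " you " instead of "you" (just add a space before and after the "you" string).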
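
The comments above make two points: IndexOf looks for a character sequence, and padding the search term with spaces only approximates a word match. A minimal sketch of both behaviors follows. The class name SubstringDemo and its two methods are hypothetical helpers for illustration, not part of the original question.

```csharp
using System;

public static class SubstringDemo
{
    // Plain substring search: "you" is found inside "your".
    public static int SubstringIndex(string text, string term)
    {
        return text.IndexOf(term, StringComparison.OrdinalIgnoreCase);
    }

    // Space-padding hack from the comments: search for " term " instead.
    // Returns the index of the term itself (one past the leading space), or -1.
    // Limitation: it misses a word at the very start or end of the text,
    // and a word that is followed by punctuation instead of a space.
    public static int PaddedIndex(string text, string term)
    {
        int i = text.IndexOf(" " + term + " ", StringComparison.OrdinalIgnoreCase);
        return i < 0 ? -1 : i + 1;
    }
}
```

With the string from the question, SubstringDemo.SubstringIndex("I am your Friend", "you") returns 5, while PaddedIndex returns -1 for "you" and 5 for "your". PaddedIndex also returns -1 for "Friend", even though that word is present, because nothing follows it. That is the kind of extra problem the comment warns about.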

Tags: c# .net indexof

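【Solution 1】:

The problem you are facing is that IndexOf matches a single character, or a sequence of characters (the search string), within a larger string. So "I am your Friend" contains the sequence "you". To match only words, you have to think about things at the word level.

For example, you can use a regular expression to match word boundaries: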

private static int IndexOfWord(string val, int startAt, string search)
{
    // escape the match expression in case it contains any characters meaningful
    // to regular expressions, and then create an expression with the \b boundary
    // characters
    var escapedMatch = string.Format(@"\b{0}\b", Regex.Escape(search));

    // create a case-insensitive regular expression object using the pattern
    var exp = new Regex(escapedMatch, RegexOptions.IgnoreCase);

    // perform the match from the start position
    var match = exp.Match(val, startAt);

    // if it's successful, return the match index
    if (match.Success)
    {
        return match.Index;
    }

    // if it's unsuccessful, return -1
    return -1;
}

// overload without startAt, for when you just want to start from the beginning
private static int IndexOfWord(string val, string search)
{
    return IndexOfWord(val, 0, search);
}

In your example, you would be trying to match \byou\b, which will not match your because of the boundary requirement.


See more about word boundaries in regular expressions here.
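
To make the boundary behavior concrete, here is a condensed, self-contained variant of the helper from this answer. It uses the same \b pattern and the same RegexOptions.IgnoreCase option. The wrapper class name WordSearch is hypothetical and only there so the sketch compiles on its own.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordSearch
{
    // Returns the index of the first whole-word, case-insensitive match of
    // `search` in `val` at or after `startAt`, or -1 if there is none.
    public static int IndexOfWord(string val, int startAt, string search)
    {
        // Escape the term so regex metacharacters in it are treated literally.
        string pattern = @"\b" + Regex.Escape(search) + @"\b";
        Match match = new Regex(pattern, RegexOptions.IgnoreCase).Match(val, startAt);
        return match.Success ? match.Index : -1;
    }
}
```

For "I am your Friend", WordSearch.IndexOfWord(text, 0, "you") returns -1, "your" returns 5, and "FRIEND" returns 10. One caveat: \b only matches between a word character and a non-word character, so a search term that starts or ends with a symbol (for example "C#") may not match the way you expect.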

【Comments】:

【Solution 2】:

you is a valid substring of I am your Friend. If you want to find out whether a word is in a string, you can parse the string with the Split method.

char[] delimiterChars = { ' ', ',', '.', ':', '\t' };
string[] words = text.Split(delimiterChars);

Then look inside the array, or convert it to a data structure that is easier to search.
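
As a rough sketch of "looking inside the array", the helper below returns the position of a word among the split words, comparing case-insensitively. The class WordLookup and the method WordPosition are hypothetical names. StringSplitOptions.RemoveEmptyEntries is an addition of mine, so that adjacent delimiters do not produce empty words.

```csharp
using System;

public static class WordLookup
{
    private static readonly char[] Delimiters = { ' ', ',', '.', ':', '\t' };

    // Returns the zero-based position of `word` among the words of `text`
    // (a word position, not a character index), or -1 if no word matches.
    public static int WordPosition(string text, string word)
    {
        string[] words = text.Split(Delimiters, StringSplitOptions.RemoveEmptyEntries);
        return Array.FindIndex(
            words,
            w => string.Equals(w, word, StringComparison.OrdinalIgnoreCase));
    }
}
```

WordLookup.WordPosition("I am your Friend", "you") returns -1, "your" returns 2, and "friend" returns 3. Note that the result counts words, so it is not interchangeable with the character index that IndexOf returns.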

If you want a case-insensitive search, you can use the following code:

char[] delimiterChars = { ' ', ',', '.', ':', '\t' };
string text = "I am your Friend";
// HashSet allows faster lookups for large strings (ToHashSet requires using System.Linq)
var words = text.Split(delimiterChars).ToHashSet(StringComparer.OrdinalIgnoreCase);
Console.WriteLine(words.Contains("you"));
Console.WriteLine(words.Contains("friend"));

Output:
False
True

Building a dictionary as in the following code snippet lets you quickly look up all positions of every word.

char[] delimiterChars = { ' ', ',', '.', ':', '\t' };
string text = "i am your friend. I Am Your Friend.";
var words = text.Split(delimiterChars);
var dict = new Dictionary<string, List<int>>(StringComparer.InvariantCultureIgnoreCase);
for (int i = 0; i < words.Length; ++i)
{
    if (dict.ContainsKey(words[i])) dict[words[i]].Add(i);
    else dict[words[i]] = new List<int>() { i };
}

Console.WriteLine("youR: ");
dict["youR"].ForEach(i => Console.WriteLine("\t{0}", i));
Console.WriteLine("friend");
dict["friend"].ForEach(i => Console.WriteLine("\t{0}", i));
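
One detail in the snippet above: Split without StringSplitOptions.RemoveEmptyEntries yields an empty string wherever two delimiters are adjacent (here the ". " between the sentences, and the final "."), and those empty strings occupy positions too. So it reports your at 2 and 7, and friend at 3 and 8. Below is a sketch of a variant that skips the empty entries and uses TryGetValue to avoid the double lookup. The class name WordIndex and the method Build are hypothetical, and the switch to StringComparer.OrdinalIgnoreCase is my assumption, made to match the comparer used earlier in this answer.

```csharp
using System;
using System.Collections.Generic;

public static class WordIndex
{
    private static readonly char[] Delimiters = { ' ', ',', '.', ':', '\t' };

    // Maps each word (case-insensitively) to the list of its word positions.
    public static Dictionary<string, List<int>> Build(string text)
    {
        string[] words = text.Split(Delimiters, StringSplitOptions.RemoveEmptyEntries);
        var dict = new Dictionary<string, List<int>>(StringComparer.OrdinalIgnoreCase);
        for (int i = 0; i < words.Length; ++i)
        {
            // Create the list on first sight of a word, then append the position.
            if (!dict.TryGetValue(words[i], out List<int> positions))
            {
                positions = new List<int>();
                dict[words[i]] = positions;
            }
            positions.Add(i);
        }
        return dict;
    }
}
```

For "i am your friend. I Am Your Friend." this variant maps "youR" to positions 2 and 6, and "friend" to 3 and 7, because the empty entries no longer take up slots.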

【Comments】: