【Question title】: Java indexOf(String str) method complexity [duplicate]
【Posted】: 2012-09-26 23:10:05
【Question】:

Possible duplicate:
What is the cost / complexity of a String.indexof() function call

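What is the complexity of Java's indexOf(String str) method? There are string-matching algorithms such as KMP that run in linear time. I am implementing a system that needs to search for large substrings in a very large string, so can I use Java's indexOf(String str) method, or should I implement KMP?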

【Discussion】:

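  • Just use the convenient approach unless you have hard data showing that it won't cut it. Asking vaguely defined hypothetical questions on the internet (how long is "really large"? 15,000 characters? 5 GB?) is not useful.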

Tags: java algorithm


【Solution 1】:

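Java's implementation of indexOf has a worst-case complexity of O(m*n), where n and m are the lengths of the string being searched and of the pattern, respectively.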

To improve on that, you can use an algorithm such as Boyer-Moore, which intelligently skips over parts of the text that cannot match the pattern.
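To illustrate the skipping idea, here is a minimal sketch of the Boyer-Moore-Horspool variant, which is a simplification of full Boyer-Moore that keeps only a bad-character shift table. The class name `HorspoolSearch` and its `indexOf` method are hypothetical names chosen for this example, not JDK APIs. Note that Horspool's worst case is still O(m*n); the gain is in typical inputs, where whole stretches of the text are skipped.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class for illustration; not part of the JDK.
final class HorspoolSearch {
    private HorspoolSearch() {}

    /** Returns the index of the first occurrence of pattern in text, or -1. */
    static int indexOf(String text, String pattern) {
        int n = text.length();
        int m = pattern.length();
        if (m == 0) return 0;
        if (m > n) return -1;

        // Bad-character table: for every character in pattern[0..m-2],
        // the distance from its last occurrence to the end of the pattern.
        Map<Character, Integer> shift = new HashMap<>();
        for (int k = 0; k < m - 1; k++) {
            shift.put(pattern.charAt(k), m - 1 - k);
        }

        int i = 0; // current alignment of the pattern against the text
        while (i <= n - m) {
            // Compare right to left within the current window.
            int j = m - 1;
            while (j >= 0 && text.charAt(i + j) == pattern.charAt(j)) {
                j--;
            }
            if (j < 0) {
                return i; // whole pattern matched
            }
            // Shift by the table entry for the text character under the
            // pattern's last position; characters absent from the table
            // allow a full shift of m.
            char last = text.charAt(i + m - 1);
            i += shift.getOrDefault(last, m);
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(indexOf("here is a simple example", "example")); // 17
        System.out.println(indexOf("here is a simple example", "missing")); // -1
    }
}
```

The `HashMap` keeps the sketch short for arbitrary `char` values; an `int[]` indexed by character code would likely be faster for small alphabets.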

【Discussion】:

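  • -1, your answer is wrong/misleading. There are two lengths to worry about, the string you are searching in and the string you are searching for, and the algorithm is O(mn).
  • @LouisWasserman, right, thanks for pointing that out.
  • O(m*n) is the worst case, not the typical one. The typical cost is O(n). By saying O(m*n) you completely ignore the short-circuiting of the pattern match at the current position, which stops as soon as the pattern fails to match. That rarely gets past the third character of the pattern. indexOf() even has a specific optimization that looks for the first character before entering the pattern-matching logic. Amortized, it is probably something like O(1.1 * n), so O(n).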
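The gap between the worst case and an easy case can be made concrete by counting character comparisons in a plain naive scan. The sketch below is an illustration only: `NaiveSearchCost`, `naiveIndexOf`, and `repeat` are hypothetical names, and the loop is a simplified stand-in for the JDK code rather than a copy of it.

```java
import java.util.Arrays;

// Hypothetical demo class: counts character comparisons made by a naive scan.
final class NaiveSearchCost {
    static long comparisons; // reset at the start of each search

    static int naiveIndexOf(String text, String pattern) {
        comparisons = 0;
        int n = text.length();
        int m = pattern.length();
        for (int i = 0; i + m <= n; i++) {
            int j = 0;
            while (j < m) {
                comparisons++;
                if (text.charAt(i + j) != pattern.charAt(j)) {
                    break; // short-circuit on the first mismatch
                }
                j++;
            }
            if (j == m) {
                return i;
            }
        }
        return -1;
    }

    static String repeat(char c, int count) {
        char[] buf = new char[count];
        Arrays.fill(buf, c);
        return new String(buf);
    }

    public static void main(String[] args) {
        String text = repeat('a', 10_000);

        // Adversarial: every alignment matches 99 characters, then fails.
        naiveIndexOf(text, repeat('a', 99) + "b");
        System.out.println("adversarial: " + comparisons); // 990100

        // Easy: the first character never matches, so one comparison per alignment.
        naiveIndexOf(text, "b" + repeat('a', 99));
        System.out.println("easy: " + comparisons); // 9901
    }
}
```

With n = 10,000 and m = 100 there are 9,901 alignments; the adversarial pattern costs 100 comparisons at each one, while the easy pattern costs one.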
【Solution 2】:

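The complexity of Java's indexOf function is O(n*m), where n is the length of the text and m is the length of the pattern.
Here is the original source code of indexOf: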

    /**
     * Returns the index within this string of the first occurrence of the
     * specified substring. The integer returned is the smallest value
     * <i>k</i> such that:
     * <blockquote><pre>
     * this.startsWith(str, <i>k</i>)
     * </pre></blockquote>
     * is <code>true</code>.
     *
     * @param   str   any string.
     * @return  if the string argument occurs as a substring within this
     *          object, then the index of the first character of the first
     *          such substring is returned; if it does not occur as a
     *          substring, <code>-1</code> is returned.
     */
    public int indexOf(String str) {
        return indexOf(str, 0);
    }

    /**
     * Returns the index within this string of the first occurrence of the
     * specified substring, starting at the specified index.  The integer
     * returned is the smallest value <tt>k</tt> for which:
     * <blockquote><pre>
     *     k &gt;= Math.min(fromIndex, this.length()) && this.startsWith(str, k)
     * </pre></blockquote>
     * If no such value of <i>k</i> exists, then -1 is returned.
     *
     * @param   str         the substring for which to search.
     * @param   fromIndex   the index from which to start the search.
     * @return  the index within this string of the first occurrence of the
     *          specified substring, starting at the specified index.
     */
    public int indexOf(String str, int fromIndex) {
        return indexOf(value, offset, count,
                       str.value, str.offset, str.count, fromIndex);
    }

    /**
     * Code shared by String and StringBuffer to do searches. The
     * source is the character array being searched, and the target
     * is the string being searched for.
     *
     * @param   source       the characters being searched.
     * @param   sourceOffset offset of the source string.
     * @param   sourceCount  count of the source string.
     * @param   target       the characters being searched for.
     * @param   targetOffset offset of the target string.
     * @param   targetCount  count of the target string.
     * @param   fromIndex    the index to begin searching from.
     */
    static int indexOf(char[] source, int sourceOffset, int sourceCount,
                       char[] target, int targetOffset, int targetCount,
                       int fromIndex) {
        if (fromIndex >= sourceCount) {
            return (targetCount == 0 ? sourceCount : -1);
        }
        if (fromIndex < 0) {
            fromIndex = 0;
        }
        if (targetCount == 0) {
            return fromIndex;
        }

        char first  = target[targetOffset];
        int max = sourceOffset + (sourceCount - targetCount);

        for (int i = sourceOffset + fromIndex; i <= max; i++) {
            /* Look for first character. */
            if (source[i] != first) {
                while (++i <= max && source[i] != first);
            }

            /* Found first character, now look at the rest of v2 */
            if (i <= max) {
                int j = i + 1;
                int end = j + targetCount - 1;
                for (int k = targetOffset + 1; j < end && source[j] ==
                         target[k]; j++, k++);

                if (j == end) {
                    /* Found whole string. */
                    return i - sourceOffset;
                }
            }
        }
        return -1;
    }

You can simply implement the KMP algorithm instead of using indexOf, like this:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PrintWriter;


public class Main{
    int failure[];
    int i,j;
    BufferedReader in=new BufferedReader(new InputStreamReader(System.in));
    PrintWriter out=new PrintWriter(System.out);
    String pat="",str="";
    public Main(){
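        try{
            // Input: the pattern length on the first line, then the pattern, then the text.
            int patLength=Integer.parseInt(in.readLine());
            pat=in.readLine();
            str=in.readLine();
            fillFailure(pat,patLength);
            match(str,pat,str.length(),patLength);
            out.println();
            failure=null;
        }catch(Exception e){
            // Report bad input instead of silently swallowing the error.
            e.printStackTrace();
        }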

        out.flush();
    }
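    // failure[i] = index of the last character of the longest proper prefix of
    // pat[0..i] that is also a suffix of pat[0..i], or -1 if there is none.
    public void fillFailure(String pat,int patLen){
        failure=new int[patLen];
        failure[0]=-1;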
        for(i=1;i<patLen;i++){
            j=failure[i-1];
            while(j>=0&&pat.charAt(j+1)!=pat.charAt(i))
                j=failure[j];
            if(pat.charAt(j+1)==pat.charAt(i))
                failure[i]=j+1;
            else
                failure[i]=-1;
        }
    }
    // Prints the start index of every occurrence of pat in str.
    public void match(String str,String pat,int strLen,int patLen){
        i=0;
        j=0;
        while(i<strLen){
            if(str.charAt(i)==pat.charAt(j)){
                i++;
                j++;
                if(j==patLen){
                    out.println(i-j);
                    j=failure[j-1]+1;
                }
            } else if (j==0){
                    i++;
            }else{
                j=failure[j-1]+1;
            }

        }
    }
    public static void main(String[] args) {
        new Main();
    }
}
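
For comparison, here is a minimal sketch of the same KMP idea packaged as a drop-in function that returns the first match index, the way indexOf does, instead of reading from standard input. The class name `KmpSearch` and its method names are hypothetical. It also uses a different table convention from the program above: each entry stores the length of the longest proper prefix that is also a suffix (0 when there is none), rather than an index with -1 for none.

```java
// Hypothetical helper class for illustration; not part of the JDK.
final class KmpSearch {
    private KmpSearch() {}

    /**
     * failure[k] = length of the longest proper prefix of pattern[0..k]
     * that is also a suffix of pattern[0..k].
     */
    static int[] buildFailure(String pattern) {
        int m = pattern.length();
        int[] failure = new int[m];
        int len = 0; // length of the current matched prefix
        for (int k = 1; k < m; k++) {
            while (len > 0 && pattern.charAt(k) != pattern.charAt(len)) {
                len = failure[len - 1]; // fall back to a shorter border
            }
            if (pattern.charAt(k) == pattern.charAt(len)) {
                len++;
            }
            failure[k] = len;
        }
        return failure;
    }

    /** Returns the index of the first occurrence of pattern in text, or -1. */
    static int indexOf(String text, String pattern) {
        int m = pattern.length();
        if (m == 0) return 0;
        int[] failure = buildFailure(pattern);
        int matched = 0; // number of pattern characters currently matched
        for (int i = 0; i < text.length(); i++) {
            while (matched > 0 && text.charAt(i) != pattern.charAt(matched)) {
                matched = failure[matched - 1];
            }
            if (text.charAt(i) == pattern.charAt(matched)) {
                matched++;
            }
            if (matched == m) {
                return i - m + 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(indexOf("xxabababcayy", "abababca")); // 2
        System.out.println(indexOf("aaaaab", "aab"));            // 3
    }
}
```

The text index `i` never moves backwards, which is what gives the O(n + m) bound regardless of the input.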

【Discussion】:
