【Question title】: Count the number of Occurrences of a Word in a String
【Posted】: 2014-04-29 06:39:33
【Question】:

I am new to Java strings. I want to count the occurrences of a particular word in a string. Suppose my string is:

i have a male cat. the color of male cat is Black

I don't want to split it, so I want to search for the phrase "male cat". It occurs twice in my string!

What I am trying is:

int c = 0;
for (int j = 0; j < text.length(); j++) {
    if (text.contains("male cat")) {
        c += 1;
    }
}

System.out.println("counter=" + c);

It gives me a counter value of 46! So what is the solution?

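【Question discussion】:

  • Can you describe how you think this code works (or how you want it to work)? That will help us help you better.
  • If you search for aa in aaaa, what result should you get? 2 or 3?
  • I gave an example string, so the output should be 2, because "male cat" occurs 2 times in the string.
  • I don't know Java, but depending on what you want to do, if it has a non-regex find-first-string utility that lets you specify the starting position on each pass through a loop (the C++ 'string' class has this), that should be faster.
  • OK, I checked. You just need while((newndx=str.indexOf("male cat",oldndx))>-1){found++;oldndx=newndx+8;}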
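
The aa-in-aaaa question above comes down to whether matches may overlap. A minimal sketch of both readings, assuming an empty search key should count as zero; the class and method names are hypothetical and not from the thread:

```java
final class OverlapDemo {
    // Counts occurrences without overlap: after a hit, resume past the whole match.
    static int countNonOverlapping(String text, String key) {
        if (key.isEmpty()) return 0; // assumption: an empty key counts as zero
        int count = 0;
        int from = 0;
        while ((from = text.indexOf(key, from)) != -1) {
            count++;
            from += key.length();
        }
        return count;
    }

    // Counts occurrences with overlap: after a hit, resume one character later.
    static int countOverlapping(String text, String key) {
        if (key.isEmpty()) return 0; // assumption: an empty key counts as zero
        int count = 0;
        int from = 0;
        while ((from = text.indexOf(key, from)) != -1) {
            count++;
            from += 1;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countNonOverlapping("aaaa", "aa")); // 2
        System.out.println(countOverlapping("aaaa", "aa"));    // 3
    }
}
```

For the example sentence in the question both variants give 2, because "male cat" cannot overlap with itself.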

Tags: java regex string


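【Solution 1】:

In Scala this is a one-liner:

// quote() makes word match literally; the -1 limit keeps trailing empty strings so a match at the very end is counted
def numTimesOccurrenced(text: String, word: String) = text.split(java.util.regex.Pattern.quote(word), -1).length - 1

【Discussion】: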

    【Solution 2】:
    public int occurrencesOf(String word) {
        int length = text.length();
        int lenghtofWord = word.length();
        int lengthWithoutWord = text.replaceAll(word, "").length();
        return (length - lengthWithoutWord) / lenghtofWord ;
    }
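
One caveat with the replaceAll variant: replaceAll treats its first argument as a regular expression, so a search word containing metacharacters such as "." is not matched literally. The sketch below illustrates the difference; the class and method names are hypothetical, and the text is passed as a parameter here rather than read from a field as in the answer:

```java
import java.util.regex.Pattern;

final class LiteralCount {
    // Length-difference count with a raw regex: metacharacters in word are interpreted.
    static int countWithReplaceAll(String text, String word) {
        return (text.length() - text.replaceAll(word, "").length()) / word.length();
    }

    // Same idea, but Pattern.quote makes the word match literally.
    static int countQuoted(String text, String word) {
        return (text.length() - text.replaceAll(Pattern.quote(word), "").length()) / word.length();
    }

    public static void main(String[] args) {
        System.out.println(countWithReplaceAll("a.b.c", ".")); // 5, because "." matches every character
        System.out.println(countQuoted("a.b.c", "."));         // 2
    }
}
```

String.replace(CharSequence, CharSequence) also matches literally, so it is a simpler alternative when no regex features are needed.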
    

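    【Discussion】:

      【Solution 3】:

      A simple solution:

      The following code uses a HashMap because it maintains key-value pairs. Here the key is a word and the value is its count (the number of times the word occurs in the given string).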

      import java.util.HashMap;
      import java.util.Map.Entry;
      import java.util.Set;

      public class WordOccurance 
      {
      
       public static void main(String[] args) 
       {
          HashMap<String, Integer> hm = new HashMap<>();
          String str = "avinash pande avinash pande avinash";
      
          //split the word with white space       
          String words[] = str.split(" ");
          for (String word : words) 
          {   
              //If already added/present in hashmap then increment the count by 1
              if(hm.containsKey(word))    
              {           
                  hm.put(word, hm.get(word)+1);
              }
              else //if not added earlier then add with count 1
              {
                  hm.put(word, 1);
              }
      
          }
          //Iterate over the hashmap
          Set<Entry<String, Integer>> entry =  hm.entrySet();
          for (Entry<String, Integer> entry2 : entry) 
          {
              System.out.println(entry2.getKey() + "      "+entry2.getValue());
          }
      }
      

      }
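
The containsKey/put branching above can probably be condensed with Map.merge (available since Java 8). This is only a sketch of the same counting idea; the class name is hypothetical, and LinkedHashMap is an assumption made here to keep the words in first-seen order:

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class WordFrequencies {
    // Builds a word -> count map; merge() stores 1 for a new word or adds 1 to its existing count.
    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new LinkedHashMap<>(); // keeps first-seen order
        for (String word : text.split(" ")) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count("avinash pande avinash pande avinash"));
        // {avinash=3, pande=2}
    }
}
```

Note that this counts single space-separated words, so it does not directly answer the question about a two-word phrase such as "male cat".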

      【Discussion】:

        【Solution 4】:

        public class WordCount {

        public static void main(String[] args) {
            // TODO Auto-generated method stub
            String scentence = "This is a treeis isis is is is";
            String word = "is";
            int wordCount = 0;
            for(int i =0;i<scentence.length();i++){
                if(word.charAt(0) == scentence.charAt(i)){
                    if(i>0){
                        if(scentence.charAt(i-1) == ' '){
                            if(i+word.length()<scentence.length()){
                                if(scentence.charAt(i+word.length()) != ' '){
                                    continue;}
                                }
                            }
                        else{
                            continue;
                        }
                    }
                    int count = 1;
                    for(int j=1 ; j<word.length();j++){
                        i++;
                        if(word.charAt(j) != scentence.charAt(i)){
                            break;
                        }
                        else{
                            count++;
                        }
                    }
                    if(count == word.length()){
                        wordCount++;
                    }
        
                }
            }
            System.out.println("The word "+ word + " was repeated :" + wordCount);
        }
        

        }
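
The character-by-character scan above counts only whole words. A similar effect can be sketched with a regex and \b word boundaries. Note the assumption: \b treats any non-word character (including punctuation) as a delimiter, whereas the code above only looks for spaces. The class name is hypothetical:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class WholeWordCount {
    // Counts occurrences of word that are delimited by regex word boundaries (\b).
    static int count(String sentence, String word) {
        Matcher m = Pattern.compile("\\b" + Pattern.quote(word) + "\\b").matcher(sentence);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // "This", "treeis" and "isis" contain "is" but are not the whole word "is".
        System.out.println(count("This is a treeis isis is is is", "is")); // 4
    }
}
```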

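        【Discussion】:

        • Code alone is not always self-explanatory. Consider adding comments to the code and explaining how it works; that helps others understand it and gives them working code.
        【Solution 5】:

        Java 8 version.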

        System.out.println(Pattern.compile("\\bmale cat")
                    .splitAsStream("i have a male cat. the color of male cat is Black")
                    .count()-1);
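
As far as I know, splitAsStream discards trailing empty strings, so count()-1 undercounts when the text ends with a match. A hedged alternative that counts the matches directly is sketched below. It relies on Matcher.results(), which requires Java 9 or later rather than Java 8; the class and method names are hypothetical:

```java
import java.util.regex.Pattern;

final class StreamCount {
    // Counts matches of phrase that start at a word boundary, as in the pattern above.
    static long count(String text, String phrase) {
        return Pattern.compile("\\b" + Pattern.quote(phrase))
                .matcher(text)
                .results()   // Stream<MatchResult>, available since Java 9
                .count();
    }

    public static void main(String[] args) {
        System.out.println(count("i have a male cat. the color of male cat is Black", "male cat")); // 2
        System.out.println(count("the color of male cat", "male cat")); // 1, even though the text ends with the match
    }
}
```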
        

        【Discussion】:

          【Solution 6】:

          Here is a complete example:

          package com.test;
          
          import java.util.HashMap;
          import java.util.Iterator;
          import java.util.Map;
          
          public class WordsOccurances {
          
                public static void main(String[] args) {
          
                      String sentence = "Java can run on many different operating "
                          + "systems. This makes Java platform independent.";
          
                      String[] words = sentence.split(" ");
                      Map<String, Integer> wordsMap = new HashMap<String, Integer>();
          
                      for (int i = 0; i<words.length; i++ ) {
                          if (wordsMap.containsKey(words[i])) {
                              Integer value = wordsMap.get(words[i]);
                              wordsMap.put(words[i], value + 1);
                          } else {
                              wordsMap.put(words[i], 1);
                          }
                      }
          
                      /* Now iterate over the HashMap to display each word with its
                     number of occurrences */
          
                     Iterator it = wordsMap.entrySet().iterator();
                     while (it.hasNext()) {
                          Map.Entry<String, Integer> entryKeyValue = (Map.Entry<String, Integer>) it.next();
                          System.out.println("Word : "+entryKeyValue.getKey()+", Occurance : "
                                          +entryKeyValue.getValue()+" times");
                     }
               }
          }
          

          【Discussion】:

            【Solution 7】:

            Here is another approach:

            String description = "hello india hello india hello hello india hello";
            String textToBeCounted = "hello";
            
            // Split description on "hello", which returns an array of
            // the pieces between the occurrences of "hello"
            String[] words = description.split(textToBeCounted);
            
            // Total number of characters in the pieces other than "hello"
            int lengthOfNonMatchingWords = 0;
            for (String word : words) {
                lengthOfNonMatchingWords += word.length();
            }
            
            // Following code gets length of `description` - length of all non-matching
            // words and divide it by length of word to be counted
            System.out.println("Number of matching words are " + 
            (description.length() - lengthOfNonMatchingWords) / textToBeCounted.length());
            

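            【Discussion】:

              【Solution 8】:

              Replace the string to be counted with an empty string, then use the length of the result to work out the number of occurrences.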

              public int occurrencesOf(String word)
                  {
                  int length = text.length();
                  int lenghtofWord = word.length();
                  int lengthWithoutWord = text.replace(word, "").length();
                  return (length - lengthWithoutWord) / lenghtofWord ;
                  }
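
To make the arithmetic concrete: the example sentence has 49 characters, removing both copies of "male cat" leaves 33, and (49 - 33) / 8 = 2. The sketch below walks through that calculation. It takes the text as a parameter and guards against an empty word; both are assumptions added here, and the class name is hypothetical:

```java
final class LengthDifference {
    // Number of removed characters divided by the word length gives the occurrence count.
    static int occurrencesOf(String text, String word) {
        if (word.isEmpty()) {
            return 0; // assumption: avoids division by zero for an empty word
        }
        int removed = text.length() - text.replace(word, "").length();
        return removed / word.length();
    }

    public static void main(String[] args) {
        String text = "i have a male cat. the color of male cat is Black";
        System.out.println(occurrencesOf(text, "male cat")); // 2
    }
}
```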
              

              【Discussion】:

              • Better to use replaceAll()
              【Solution 9】:

              public class TestWordCount {

              public static void main(String[] args) {
              
                  int count = numberOfOccurences("Alice", "Alice in wonderland. Alice & chinki are classmates. Chinki is better than Alice.occ");
                  System.out.println("count : "+count);
              
              }
              
              public static int numberOfOccurences(String findWord, String sentence) {
              
                  int length = sentence.length();
                  int lengthWithoutFindWord = sentence.replace(findWord, "").length();
                  return (length - lengthWithoutFindWord)/findWord.length();
              
              }
              

              }

              【Discussion】:

              【Solution 10】:

              We can count the occurrences of a substring in many ways:

              public class Test1 {
              public static void main(String args[]) {
                  String st = "abcdsfgh yfhf hghj gjgjhbn hgkhmn abc hadslfahsd abcioh abc  a ";
                  count(st, 0, "a".length());
              
              }
              
              public static void count(String trim, int i, int length) {
                  if (trim.contains("a")) {
                      trim = trim.substring(trim.indexOf("a") + length);
                      count(trim, i + 1, length);
                  } else {
                      System.out.println(i);
                  }
              }
              
              public static void countMethod2() {
                  int index = 0, count = 0;
                  String inputString = "mynameiskhanMYlaptopnameishclMYsirnameisjasaiwalmyfrontnameisvishal".toLowerCase();
                  String subString = "my".toLowerCase();
              
                  while (index != -1) {
                      index = inputString.indexOf(subString, index);
                      if (index != -1) {
                          count++;
                          index += subString.length();
                      }
                  }
                  System.out.print(count);
              }}
              

              【Discussion】:


                  【Solution 12】:

                  This will work:

                  int word_count(String text,String key){
                     int count=0;
                     while(text.contains(key)){
                        count++;
                        text=text.substring(text.indexOf(key)+key.length());
                     }
                     return count;
                  }
                  

                  【Discussion】:

                    【Solution 13】:

                    StringUtils in Apache commons-lang has a countMatches method that counts the number of times one string occurs in another.

                       String input = "i have a male cat. the color of male cat is Black";
                       int occurance = StringUtils.countMatches(input, "male cat");
                       System.out.println(occurance);
                    

                    【Discussion】:

                      【Solution 14】:

                      Why not use recursion?

                      public class CatchTheMaleCat  {
                          private static final String MALE_CAT = "male cat";
                          static int count = 0;
                          public static void main(String[] arg){
                              wordCount("i have a male cat. the color of male cat is Black");
                              System.out.println(count);
                          }
                      
                          private static boolean wordCount(String str){
                              if(str.contains(MALE_CAT)){
                                  count++;
                                  return wordCount(str.substring(str.indexOf(MALE_CAT)+MALE_CAT.length()));
                              }
                              else{
                                  return false;
                              }
                          }
                      }
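
The recursive idea can also be written without the mutable static counter, by returning the count from each call. This is a sketch of that variation rather than the answer's own code; the class name and the empty-key guard are assumptions:

```java
final class RecursiveCount {
    // Returns 0 when key no longer occurs; otherwise 1 plus the count in the rest of the text.
    static int count(String text, String key) {
        if (key.isEmpty()) {
            return 0; // assumption: an empty key would otherwise recurse forever
        }
        int at = text.indexOf(key);
        if (at == -1) {
            return 0; // base case: no further match
        }
        return 1 + count(text.substring(at + key.length()), key);
    }

    public static void main(String[] args) {
        System.out.println(count("i have a male cat. the color of male cat is Black", "male cat")); // 2
    }
}
```

Because nothing is stored in a static field, the method can be called repeatedly without resetting any state.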
                      

                      【Discussion】:

                        【Solution 15】:

                        Java 8 version:

                        // requires: import java.util.Arrays;
                        public static long countNumberOfOccurrencesOfWordInString(String msg, String target) {
                            return Arrays.stream(msg.split("[ ,\\.]")).filter(s -> s.equals(target)).count();
                        }
                        

                        【Discussion】:

                          【Solution 16】:

                          You can use the following code:

                          String in = "i have a male cat. the color of male cat is Black";
                          int i = 0;
                          Pattern p = Pattern.compile("male cat");
                          Matcher m = p.matcher( in );
                          while (m.find()) {
                              i++;
                          }
                          System.out.println(i); // Prints 2
                          

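                          What does it do?

                          It matches "male cat".

                          while(m.find())
                          

                          means: run whatever is inside the loop for as long as m finds a match. I increment the value of i with i++, so this gives the number of times male cat appears in the string.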
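
The find() loop described above can be wrapped in a reusable method. The sketch below is one possible shape, not the answer's own code: the class name, method name and the ignoreCase parameter are hypothetical. Pattern.LITERAL is used so that the search phrase is not interpreted as a regex:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexCount {
    // Counts literal occurrences of phrase in text; each successful find() is one match.
    static int count(String text, String phrase, boolean ignoreCase) {
        int flags = Pattern.LITERAL | (ignoreCase ? Pattern.CASE_INSENSITIVE : 0);
        Matcher m = Pattern.compile(phrase, flags).matcher(text);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String in = "i have a male cat. the color of male cat is Black";
        System.out.println(count(in, "male cat", false)); // 2
    }
}
```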

                          【Discussion】:

                          • Nice job. Now the OP has code he doesn't understand, but at least it works.
                          【Solution 17】:

                          This should be a faster non-regex solution.
                          (Note: I'm not a Java programmer.)

                           String str = "i have a male cat. the color of male cat is Black";
                           int found  = 0;
                           int oldndx = 0;
                           int newndx = 0;
                          
                           while ( (newndx=str.indexOf("male cat", oldndx)) > -1 )
                           {
                               found++;
                               oldndx = newndx+8;
                           }
                          

                          【Discussion】:

                            【Solution 18】:

                            This static method returns the number of times one string occurs in another string.

                            /**
                             * Returns the number of times a string appears in another string.
                             * 
                             * @param source    the string to search in
                             * @param sentence  the string to look for in source
                             * @return the number of occurrences of sentence in source 
                             */
                            public static int numberOfOccurrences(String source, String sentence) {
                                int occurrences = 0;
                            
                                if (source.contains(sentence)) {
                                    int withSentenceLength    = source.length();
                                    int withoutSentenceLength = source.replace(sentence, "").length();
                                    occurrences = (withSentenceLength - withoutSentenceLength) / sentence.length();
                                }
                            
                                return occurrences;
                            }
                            

                            Tests:

                            String source = "Hello World!";
                            numberOfOccurrences(source, "Hello World!");   // 1
                            numberOfOccurrences(source, "ello W");         // 1
                            numberOfOccurrences(source, "l");              // 3
                            numberOfOccurrences(source, "fun");            // 0
                            numberOfOccurrences(source, "Hello");          // 1
                            

                            By the way, the method can be written in one line. It's ugly, but it works too :)

                            public static int numberOfOccurrences(String source, String sentence) {
                                return (source.contains(sentence)) ? (source.length() - source.replace(sentence, "").length()) / sentence.length() : 0;
                            }
                            

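                            【Discussion】:

                              【Solution 19】:

                              Once you find the string you are searching for, you can skip ahead by the length of that string (so if you search for aa in aaaa, you count 2 occurrences).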

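                              int c = 0;
                              String found = "male cat";
                              for (int j = 0; j < text.length(); j++) {
                                  // Test for a match at position j; contains() would be true on every pass
                                  if (text.startsWith(found, j)) {
                                      c += 1;
                                      // Skip past the match so overlapping matches are not counted
                                      j += found.length() - 1;
                                  }
                              }
                              System.out.println("counter=" + c);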
                              

                              【Discussion】:

                                【Solution 20】:

                                Using indexOf...

                                public static int count(String string, String substr) {
                                    int i;
                                    int last = 0;
                                    int count = 0;
                                    do {
                                        i = string.indexOf(substr, last);
                                        if (i != -1) count++;
                                        last = i+substr.length();
                                    } while(i != -1);
                                    return count;
                                }
                                
                                public static void main (String[] args ){
                                    System.out.println(count("i have a male cat. the color of male cat is Black", "male cat"));
                                }
                                

                                This prints: 2

                                Another implementation of count(), in just one line of code:

                                public static int count(String string, String substr) {
                                    return (string.length() - string.replaceAll(substr, "").length()) / substr.length() ;
                                }
                                

                                【Discussion】:

                                  【Solution 21】:

                                  If you only want the count of "male cat", I would do it like this:

                                  String str = "i have a male cat. the color of male cat is Black";
                                  int c = str.split("male cat").length - 1;
                                  System.out.println(c);
                                  

                                  If you want to make sure "female cat" is not matched, use \\b word boundaries in the split regex:

                                  int c = str.split("\\bmale cat\\b").length - 1;
                                  

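                                  【Discussion】:

                                  • This is really simple. In one of my interviews they asked me to write this in a line or two, and this is a perfect match. Thanks again.
                                  • I guess this doesn't work when the string ends with the search term. For example: "i have a male cat. the color of male cat"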
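
The last comment appears to be right: String.split with the default limit drops trailing empty strings, so a match at the very end of the text is lost. Passing a negative limit keeps them. A small sketch comparing the two; the class and method names are hypothetical, and Pattern.quote is added here so the phrase is matched literally:

```java
import java.util.regex.Pattern;

final class SplitCount {
    // Default split: trailing empty strings are removed, so a final match goes uncounted.
    static int countDefault(String text, String phrase) {
        return text.split(Pattern.quote(phrase)).length - 1;
    }

    // Limit -1 keeps trailing empty strings, so a final match is counted.
    static int countKeepTrailing(String text, String phrase) {
        return text.split(Pattern.quote(phrase), -1).length - 1;
    }

    public static void main(String[] args) {
        String s = "i have a male cat. the color of male cat";
        System.out.println(countDefault(s, "male cat"));      // 1
        System.out.println(countKeepTrailing(s, "male cat")); // 2
    }
}
```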
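                                  【Solution 22】:

                                  The string always contains the search string on every pass through the loop. You don't want the ++ there, because all it does right now is count up to the length of the string whenever it contains "male cat".

                                  You need indexOf() / substring().

                                  Do you see what I'm getting at?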
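
To illustrate the point: contains() does not depend on the loop index, so the loop from the question adds 1 on every iteration and ends up equal to the length of the text. The sketch below puts that loop next to a position-aware version based on indexOf(); the class and method names are hypothetical:

```java
final class LoopDemo {
    // The loop from the question: contains() ignores j, so it is true on every pass.
    static int brokenCount(String text, String key) {
        int c = 0;
        for (int j = 0; j < text.length(); j++) {
            if (text.contains(key)) {
                c += 1;
            }
        }
        return c;
    }

    // Position-aware version: each search resumes just past the previous match.
    static int fixedCount(String text, String key) {
        if (key.isEmpty()) {
            return 0; // assumption: an empty key counts as zero
        }
        int c = 0;
        int from = 0;
        while ((from = text.indexOf(key, from)) != -1) {
            c++;
            from += key.length();
        }
        return c;
    }

    public static void main(String[] args) {
        String text = "i have a male cat. the color of male cat is Black";
        System.out.println(brokenCount(text, "male cat") == text.length()); // true
        System.out.println(fixedCount(text, "male cat"));                   // 2
    }
}
```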

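                                  【Discussion】:

                                    【Solution 23】:

                                    Once you find the term, you need to remove it from the string you are processing so that the same occurrence is not matched again. Use indexOf() and substring(); you don't need to run the contains check length times.

                                    【Discussion】:

                                    • I didn't get you!
                                    • It might not work as intended: if he searches for "male male" and the string is "male male male", is that 1 match or 2? He didn't specify the behavior.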