How do I count capitals as abbreviations from a text file

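【Question title】: How do I count capitals as abbreviations from a text file
【Posted】: 2013-11-13 21:41:46
【Question】:

My program is supposed to read a text file containing tweets (one tweet per line). It should output the number of hashtags (any word starting with #) and name tags (any word starting with @). The hard part is that it should also check for abbreviations (all-caps words that do not start with @ or #), then print the abbreviations along with their count. For example, given this input: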

OMG roommate @bob drank all the beer...#FML #ihatemondays
lost TV remote before superbowl #FML
Think @bieber is soo hawt...#marryme
seeing @linkinpark & @tswift in 2 weeks...OMG

the output should look like this:

Analyzing post:
OMG roommate @bob drank all the beer...#FML #ihatemondays
Hash tag count: 2
Name tag count: 1
Acronyms: OMG 
For a total of 1 acronym(s).

Here is my code:

import java.io.*; //defines FileNotFoundException
import java.util.Scanner; // import Scanner class

    public class TweetAnalyzer {
    public static void main (String [] args) throws FileNotFoundException{
    //variables
        String tweet;
        Scanner inputFile = new Scanner(new File("A3Q1-input.txt"));

        while (inputFile.hasNextLine())
        {
          tweet = inputFile.nextLine();
          System.out.println("Analyzing post: ");
          System.out.println("\t" + tweet);
          analyzeTweet(tweet);
        }


      }//close main 

      public static void analyzeTweet(String tweet){
        int hashtags = countCharacters(tweet, '#');
        int nametags = countCharacters(tweet, '@');
        System.out.println("Hash tag: " + hashtags);
        System.out.println("Name tag: " + nametags);
        Acronyms(tweet);

      }//close analyzeTweet

      public static int countCharacters(String tweet, char c)//char c represents both @ and # symbols
      {
        int characters = 0;
        char current;
        for(int i=0;i<tweet.length();i++)
        {
          current = tweet.charAt(i);
          if(current == c)
          {
            characters++;
          }
        }
        return characters;
      }

      public static boolean symbol(String tweet, int i) {
        boolean result = true;
        char c;
        if(i-1 >=0)
        {
          c = tweet.charAt(i - 1);
          if (c == '@' || c == '#') {
            result = false;
        }
        }//close if
        else
        {
         result = false;
        }
        return result;
      }

      public static void Acronyms (String tweet){
        char current;
        int capital = 0;
        int j = 0;
        String initials = "";


        for(int i = 0; i < tweet.length(); i++) {
          current = tweet.charAt(i);
          if(symbol(tweet, i) && current >= 'A' && current <= 'Z') {       
            initials += current;
            j = i + 1; 
            current = tweet.charAt(j);
            while(j < tweet.length() && current >= 'A' && current <= 'Z') {
              current = tweet.charAt(j);
              initials += current;
              j++;

            }
            capital++;
            i = j;
            initials += " ";
            }
          else {

            j = i + 1; 
            current = tweet.charAt(j);
            while(j < tweet.length() && current >= 'A' && current <= 'Z') {
              current = tweet.charAt(j);

              j++;

            }

            i = j;

        }
        }
         System.out.println(initials);
         System.out.println("For a total of " + capital + " acronym(s)");
    }//close Acronyms


      }//TweetAnalyzer

Everything works except the abbreviation part. Here is my output:

Analyzing post: 
    OMG roommate @bob drank all the beer...#FML #ihatemondays
Hash tag: 2
Name tag: 1

For a total of 0 acronym(s)
Analyzing post: 
    lost TV remote before superbowl #FML
Hash tag: 1
Name tag: 0

For a total of 0 acronym(s)
Analyzing post: 
    Think @bieber is soo hawt...#marryme
Hash tag: 1
Name tag: 1

For a total of 0 acronym(s)
Analyzing post: 
    seeing @linkinpark & @tswift in 2 weeks...OMG
Hash tag: 0
Name tag: 2
OMG 
For a total of 1 acronym(s)

Please help me fix the abbreviation part. Thanks.

【Comments】:

Tags: java drjava

【Solution 1】:

It seems more natural to process the tweet word by word:

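// Counters for each category; initialize them to 0 before the loop.
int names = 0, hashtags = 0, abbrevs = 0;

for (String word : tweet.trim().split("\\s+")) {
    if (word.isEmpty()) {
        continue; // blank tweet: nothing to classify
    }
    if (word.charAt(0) == '@') {
        names++;
    } else if (word.charAt(0) == '#') {
        hashtags++;
    } else if (word.toUpperCase().equals(word)) {
        // Note: tokens with no letters (e.g. "2" or "&") also pass this check.
        abbrevs++;
    }
}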

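【Comments】:

Can there be whitespace other than a plain space?
@tieTYT True.. edited to allow for regex whitespace matching
What type of variables are names, hashtags, and abbrevs? char, String, or int?
@user2932716 They are ints. They are just a way of counting how many of each there are. You should initialize them to 0.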

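【Solution 2】:

Here is what I would do: split the tweet on whitespace so that you have a list of words. Then throw away the words that contain symbols. You can use StringUtils.isAlpha (from Apache Commons Lang) for that. Now just check word.toUpperCase().equals(word). If that is true, you have an all-uppercase word with no symbols, which is what you are calling an acronym.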
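The steps above can be sketched roughly as follows. This is only an illustration under some assumptions: the class name `UppercaseWordFilter` and both method names are hypothetical, and a small plain-JDK helper stands in for `StringUtils.isAlpha` so the example runs without Apache Commons Lang.

```java
class UppercaseWordFilter {
    // Plain-JDK stand-in for StringUtils.isAlpha (assumption: non-empty, letters only).
    static boolean isAlpha(String word) {
        if (word.isEmpty()) {
            return false;
        }
        for (int i = 0; i < word.length(); i++) {
            if (!Character.isLetter(word.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    // Counts words that contain only letters and are entirely uppercase.
    static int countAcronyms(String tweet) {
        int count = 0;
        for (String word : tweet.split("\\s+")) {
            // "#FML", "@bob", "2" and "&" are discarded by isAlpha.
            if (isAlpha(word) && word.toUpperCase().equals(word)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // "TV" is the only letters-only, all-uppercase word here.
        System.out.println(countAcronyms("lost TV remote before superbowl #FML"));
    }
}
```

One consequence of splitting on whitespace only: a token such as "weeks...OMG" contains dots, so it is thrown away as a whole and the trailing "OMG" is not counted.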

【Comments】:

【Solution 3】:

Try this method to count the acronyms:

private static int countAcronyms(String tweet) {
    int acronyms = 0;
    String[] words = tweet.split(" ");

    for (String word : words) {
        if(word.matches("[A-Z]+"))
            acronyms++;
    }

    return acronyms;
}

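【Comments】:

This works for counting the acronyms, thanks. However, it does not count the one on the last line of the input, because there is a period right before "OMG". Secondly, how do I print the abbreviations counted by the countAcronyms method, with a space between each one?
Try changing the regex in the split method so that the string is split into words correctly.
To print the list of acronyms, declare an array or list variable and store the matching words in it. Make the method return this list. Then, besides counting them, you can print each element :-)
private static List<String> getAcronyms(String tweet) { int acronyms = 0; String[] words = tweet.split(" "); List<String> result = new ArrayList<>(); for (String word : words) { if(word.matches("[A-Z]+")) result.add(word); } return result; }
Sorry for the messy comment. I wrote it on my phone :-)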
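Putting the suggestions from these comments together, a minimal sketch might look like this. The class name `AcronymLister` and the split regex are assumptions made for illustration: the regex treats every character that is not a letter, a digit, '@' or '#' as a separator, so "weeks...OMG" becomes "weeks" and "OMG" while "#FML" stays one token and is rejected by the `[A-Z]+` check.

```java
import java.util.ArrayList;
import java.util.List;

class AcronymLister {
    // Returns the all-uppercase words of a tweet, skipping @names and #hashtags.
    static List<String> getAcronyms(String tweet) {
        List<String> result = new ArrayList<>();
        // Assumed separator rule: anything other than letters, digits, '@' and '#'.
        for (String word : tweet.split("[^A-Za-z0-9@#]+")) {
            if (word.matches("[A-Z]+")) {
                result.add(word);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> acronyms = getAcronyms("seeing @linkinpark & @tswift in 2 weeks...OMG");
        // String.join puts a single space between the collected acronyms.
        System.out.println("Acronyms: " + String.join(" ", acronyms));
        System.out.println("For a total of " + acronyms.size() + " acronym(s).");
    }
}
```

Returning the list rather than a count lets the caller get both pieces of information from one pass: `size()` for the total and `String.join` for the printed line. Note that `[A-Z]+` also accepts one-letter words such as "I"; `[A-Z]{2,}` would be a stricter choice if those should not count.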

【Solution 4】:

Use StringTokenizer to split on whitespace, something like this:

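// Assumes java.util.StringTokenizer is imported and int abbrvCount = 0 is declared earlier.
StringTokenizer st = new StringTokenizer(yourString);
while (st.hasMoreTokens()) {
   String str = st.nextToken(); // nextElement() returns Object, so use nextToken()
   if (str.toUpperCase().equals(str)) {
      abbrvCount++;
   }
}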

Hope this helps.
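As written, the uppercase comparison above also accepts tokens such as "#FML", "2" and "&", since upper-casing leaves them unchanged. A possible variant, sketched here under assumptions, passes an explicit delimiter set to StringTokenizer and requires letters only. The class name `TokenizerAcronymCounter` and the chosen delimiters are hypothetical and would need adjusting to the real input.

```java
import java.util.StringTokenizer;

class TokenizerAcronymCounter {
    // Assumed delimiters: whitespace plus common punctuation. '@' and '#' are
    // deliberately left out so that tags stay attached to their words.
    private static final String DELIMITERS = " \t\n\r\f.,!?;:&";

    static int countAbbreviations(String tweet) {
        int abbrvCount = 0;
        StringTokenizer st = new StringTokenizer(tweet, DELIMITERS);
        while (st.hasMoreTokens()) {
            String str = st.nextToken();
            // Only uppercase letters: rejects "#FML", "@bob" and "2".
            if (str.matches("[A-Z]+")) {
                abbrvCount++;
            }
        }
        return abbrvCount;
    }

    public static void main(String[] args) {
        System.out.println(countAbbreviations("OMG roommate @bob drank all the beer...#FML #ihatemondays"));
    }
}
```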

【Comments】: