【Question title】: How to insert special characters taken from a string into another string?
【Posted】: 2012-08-14 12:48:17
【Question】:

I have a string:

    string1 = "Sri Lanka National Chess Championship this year and represented Sri Lanka at represented Sri Lanka Universities at the World University Chess Championships."

I also have another string called "string2", which contains only the substrings surrounded by "<NOUN>" and "</NOUN>" tags, separated by spaces.

    string2 = "<NOUN>Sri Lanka National Chess Championship</NOUN> <NOUN>Sri Lanka</NOUN> <NOUN>Sri Lanka</NOUN> <NOUN>World University Chess</NOUN>"

Note that the second string can contain any number of noun-tagged words (based on 'string1'; for example, if string1 has 3 nouns, string2 will contain the same 3 nouns, each surrounded by NOUN tags).
I want to add the tags to 'string1' so that string1 becomes:

    string1 = "<NOUN>Sri Lanka National Chess Championship</NOUN> this year and represented <NOUN>Sri Lanka</NOUN> at represented <NOUN>Sri Lanka</NOUN> Universities at the <NOUN>World University Chess</NOUN> Championships."

I used the following code to do this:

    Pattern p = Pattern.compile("<NOUN>(.*?)</NOUN>");
    Matcher m = p.matcher(string2);
    while(m.find()) {
        string1 = string1.replaceAll(m.group(1), m.group(0));
    }

But it gives me the following output:

    <NOUN><NOUN><NOUN>Sri Lanka</NOUN></NOUN> National Chess Championship</NOUN> this year and represented <NOUN><NOUN>Sri Lanka</NOUN></NOUN> at represented <NOUN><NOUN>Sri Lanka</NOUN></NOUN> Universities at the <NOUN>World University Chess</NOUN> Championships.

Can anyone tell me how to do this correctly?
Alternatively, how can I get the desired output from the output shown above?

【Comments】:

    Tags: java regex string pattern-matching


    【Solution 1】:

    Instead of:

    string1= string1.replaceAll(m.group(1),m.group(0));
    

    Use:

    string1= string1.replaceAll("(?<!<NOUN>)("+m.group(1)+")(?!</NOUN>)",m.group(0));
    

    Read more about "lookahead and lookbehind constructs" here
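
The one-line fix above builds a regex directly from the noun text, so a noun containing regex metacharacters (such as `+` or `$`) could misbehave. A minimal sketch of the same lookbehind/lookahead idea with the noun and the replacement escaped is shown here; the class name `TagNouns` and the method `tag` are hypothetical names used only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagNouns {

    // Wraps each noun listed in 'tagged' with <NOUN> tags inside 'text',
    // skipping occurrences that already sit directly inside a tag pair.
    static String tag(String text, String tagged) {
        Matcher m = Pattern.compile("<NOUN>(.*?)</NOUN>").matcher(tagged);
        while (m.find()) {
            // Pattern.quote treats the noun as literal text, not as a regex.
            String regex = "(?<!<NOUN>)" + Pattern.quote(m.group(1)) + "(?!</NOUN>)";
            // quoteReplacement keeps '$' and '\' in the noun from being
            // interpreted as group references in the replacement string.
            text = text.replaceAll(regex, Matcher.quoteReplacement(m.group(0)));
        }
        return text;
    }

    public static void main(String[] args) {
        String string1 = "Sri Lanka National Chess Championship this year and represented Sri Lanka at represented Sri Lanka Universities at the World University Chess Championships.";
        String string2 = "<NOUN>Sri Lanka National Chess Championship</NOUN> <NOUN>Sri Lanka</NOUN> <NOUN>Sri Lanka</NOUN> <NOUN>World University Chess</NOUN>";
        System.out.println(tag(string1, string2));
    }
}
```

Note that the lookarounds only protect a noun that sits at the very start or very end of an already tagged phrase. A shorter noun found in the middle of a longer tagged noun would still be wrapped, so this is a sketch for inputs shaped like the one in the question rather than a general solution.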

    【Comments】:

      【Solution 2】:

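      The problem with your example is that Sri Lanka National Chess Championship is a noun, and Sri Lanka, which is part of that string, is also a noun. As a result, your matcher replaces the same text more than once.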
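
To make the repeated replacement visible, here is a small sketch that runs the loop from the question and records string1 after each noun is processed. The class name `NestedTagDemo` and the method `trace` are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NestedTagDemo {

    // Runs the original replaceAll loop and returns the value of 'text'
    // after each noun from 'tagged' has been applied.
    static List<String> trace(String text, String tagged) {
        List<String> steps = new ArrayList<>();
        Matcher m = Pattern.compile("<NOUN>(.*?)</NOUN>").matcher(tagged);
        while (m.find()) {
            // Every occurrence of the noun is wrapped, including occurrences
            // that an earlier iteration has already wrapped.
            text = text.replaceAll(m.group(1), m.group(0));
            steps.add(text);
        }
        return steps;
    }

    public static void main(String[] args) {
        String string1 = "Sri Lanka National Chess Championship this year and represented Sri Lanka at represented Sri Lanka Universities at the World University Chess Championships.";
        String string2 = "<NOUN>Sri Lanka National Chess Championship</NOUN> <NOUN>Sri Lanka</NOUN> <NOUN>Sri Lanka</NOUN> <NOUN>World University Chess</NOUN>";
        int step = 1;
        for (String s : trace(string1, string2)) {
            System.out.println("after noun " + step++ + ": " + s);
        }
    }
}
```

The second and third iterations both search for "Sri Lanka", and each one wraps every occurrence again, which is where the doubled and tripled tags come from.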

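      You can solve this by not replacing fragments of the string that have already been replaced. I split the string around each match into three parts: before, the matched string, and after, and keep the pieces in their original order. A Vector is a convenient data structure for this.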

      import java.util.Vector;
      import java.util.regex.Matcher;
      import java.util.regex.Pattern;
      
      
      public class Check {
      
      static String print(Vector<String> parts) {
          String str = parts.elementAt(0);
      
          for(int i=1; i<parts.size(); i++) {
              str += parts.elementAt(i); 
              //System.out.print(i + " : " + parts.elementAt(i) + "\n");
          }
      
          return str;
      }
      
      public static void main(String args[]) {
          String string1;
          String string2;
          String expected;
      
          string1 = "Sri Lanka National Chess Championship this year and represented Sri Lanka at represented Sri Lanka Universities at the World University Chess Championships.";
          string2 = "<NOUN>Sri Lanka National Chess Championship</NOUN> <NOUN>Sri Lanka</NOUN> <NOUN>Sri Lanka</NOUN> <NOUN>World University Chess</NOUN>";
          expected = "<NOUN>Sri Lanka National Chess Championship</NOUN> this year and represented <NOUN>Sri Lanka</NOUN> at represented <NOUN>Sri Lanka</NOUN> Universities at the <NOUN>World University Chess</NOUN> Championships.";
      
      
          Pattern p = Pattern.compile("<NOUN>(.*?)</NOUN>");
          Matcher m = p.matcher(string2);
          Vector<String> parts = new Vector<String>();
          parts.add(string1);
      
          while(m.find()) {
              for(int i=0; i<parts.size(); i++) {
      
                  // skip parts that have already been tagged
                  if(parts.elementAt(i).indexOf("<NOUN>")!=-1) {
                      continue;
                  }
      
                  // search for pattern
                  String cur = parts.elementAt(i);
                  int disp = cur.indexOf(m.group(1));
                  if(disp==-1) {
                      continue;
                  } else {
                      parts.remove(i);
                      Vector<String> newParts = new Vector<String>();
      
                      if(disp!=0) {
                          newParts.add(cur.substring(0, disp));
                      }
      
                      newParts.add(m.group(0));
      
                      if((disp+m.group(1).length())!=cur.length()) {
                          newParts.add(cur.substring(disp+m.group(1).length()));
                      }
      
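                      // Insert the pieces back at index i so the original order is kept.
                      // addAll(0, ...) is valid, so i == 0 needs no special case; appending
                      // to the end instead would reorder the parts when others follow.
                      parts.addAll(i, newParts);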
      
                      //System.out.print(print(parts) + "\n");
                  }           
              }
          }
      
          string1 = print(parts);
          if(!string1.equals(expected)) {
              System.out.println("Unexpected output !!");
          } else {
              System.out.println("Correct !!");
          }
      }
      

      }

      For convenience, you can rename the print method to stringify.
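
As an alternative to tracking parts by hand, the same goal (never re-tagging text that is already tagged) can be sketched as a single regex pass. This is a different technique from the Vector code above, not a rewrite of it: the nouns are collected, sorted longest first, joined into one alternation, and replaced in one scan, so no tagged text is ever scanned again. The class name `SinglePassTagger` and the method `tag` are hypothetical, and the sketch assumes every occurrence of a listed noun should be tagged.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SinglePassTagger {

    static String tag(String text, String tagged) {
        // Collect the distinct, non-empty nouns listed in 'tagged'.
        Set<String> unique = new LinkedHashSet<>();
        Matcher m = Pattern.compile("<NOUN>(.*?)</NOUN>").matcher(tagged);
        while (m.find()) {
            if (!m.group(1).isEmpty()) {
                unique.add(m.group(1));
            }
        }
        if (unique.isEmpty()) {
            return text;
        }

        // Longest nouns first: alternatives are tried left to right, so a
        // long noun wins over a shorter noun that is a prefix of it.
        List<String> nouns = new ArrayList<>(unique);
        nouns.sort((a, b) -> Integer.compare(b.length(), a.length()));

        StringBuilder alternation = new StringBuilder();
        for (String noun : nouns) {
            if (alternation.length() > 0) {
                alternation.append('|');
            }
            alternation.append(Pattern.quote(noun));
        }

        // $0 is the whole match; one pass over the original text means
        // inserted tags are never matched again.
        return Pattern.compile(alternation.toString())
                      .matcher(text)
                      .replaceAll("<NOUN>$0</NOUN>");
    }

    public static void main(String[] args) {
        String string1 = "Sri Lanka National Chess Championship this year and represented Sri Lanka at represented Sri Lanka Universities at the World University Chess Championships.";
        String string2 = "<NOUN>Sri Lanka National Chess Championship</NOUN> <NOUN>Sri Lanka</NOUN> <NOUN>Sri Lanka</NOUN> <NOUN>World University Chess</NOUN>";
        System.out.println(tag(string1, string2));
    }
}
```

Because matches found in one scan cannot overlap, a shorter noun inside a longer one is left alone once the longer noun has matched.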

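      【Comments】:

      • Yes, I can understand why this happens. 'Solve this by not replacing fragments of the string that have already been replaced': can you tell me how to do that?
      • I have checked that the code above is correct. Hope it solves the problem.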