【Question title】: String contents are the same but equals method returns false
【Posted】: 2013-04-25 18:10:44
【Question】:

I am using StringEscapeUtils to escape and unescape HTML. I have the following code:

import org.apache.commons.lang.StringEscapeUtils;

public class EscapeUtils {

    public static void main(String args[]) {

        String string = "    4-Spaces    ,\"Double Quote\", 'Single Quote', \\Back-Slash\\, /Forward Slash/ ";

        String escaped = StringEscapeUtils.escapeHtml(string);
        String myEscaped = escapeHtml(string);

        String unescaped = StringEscapeUtils.unescapeHtml(escaped);
        String myUnescaped = StringEscapeUtils.unescapeHtml(myEscaped);

        System.out.println("Real String: " + string);
        System.out.println();
        System.out.println("Escaped String: " + escaped);
        System.out.println("My Escaped String: " + myEscaped);
        System.out.println();
        System.out.println("Unescaped String: " + unescaped);
        System.out.println("My Unescaped String: " + myUnescaped);
        System.out.println();
        System.out.println("Comparison:");
        System.out.println("Real String == Unescaped String: " + string.equals(unescaped));
        System.out.println("Real String == My Unescaped String: " + string.equals(myUnescaped));
        System.out.println("Unescaped String == My Unescaped String: " + unescaped.equals(myUnescaped));

    }

    public static String escapeHtml(String s) {
        String escaped = "";
        if(null != s) {
            escaped = StringEscapeUtils.escapeHtml(s);
            escaped = escaped.replaceAll(" ","&nbsp;");
            escaped = escaped.replaceAll("'","&#39;");
            escaped = escaped.replaceAll("\\\\","&#92;");
            escaped = escaped.replaceAll("/","&#47;");
        }
        return escaped;
    }

}

Output:

Real String:     4-Spaces    ,"Double Quote", 'Single Quote', \Back-Slash\, /Forward Slash/ 

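Escaped String:     4-Spaces    ,&quot;Double Quote&quot;, 'Single Quote', \Back-Slash\, /Forward Slash/ 
My Escaped String: &nbsp;&nbsp;&nbsp;&nbsp;4-Spaces&nbsp;&nbsp;&nbsp;&nbsp;,&quot;Double&nbsp;Quote&quot;,&nbsp;&#39;Single&nbsp;Quote&#39;,&nbsp;&#92;Back-Slash&#92;,&nbsp;&#47;Forward&nbsp;Slash&#47;&nbsp;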

Unescaped String:     4-Spaces    ,"Double Quote", 'Single Quote', \Back-Slash\, /Forward Slash/ 
My Unescaped String:     4-Spaces    ,"Double Quote", 'Single Quote', \Back-Slash\, /Forward Slash/ 

Comparison:
Real String == Unescaped String: true
Real String == My Unescaped String: false
Unescaped String == My Unescaped String: false

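escaped is the real string after escaping, and unescaped is that result unescaped again. myEscaped, however, is first escaped by the same process and then has a few more characters replaced with their HTML codes. myUnescaped is myEscaped unescaped, and its content looks the same as the real string's.

The output shows that the real string, unescaped, and myUnescaped have the same content. In the comparison section, however, myUnescaped is equal to neither string nor unescaped.

I still don't understand what exactly is happening here. Can anyone explain?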

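【Comments】:

  • Oh, my head is spinning
  • Could you debug and check the strings' character arrays to verify, and share the result?
  • I don't see the line Unescaped String == My Unescaped String: in your code. Can you add that comparison to your program?
  • @Patashu Thanks for pointing that out. I have added the line.

Tags: java stringescapeutils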


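【Solution 1】:

This happens because, when escaping the HTML, you replace ' ' with &nbsp;, which unescapes to a non-breaking space (U+00A0) rather than an ordinary space (U+0020).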

public static String escapeHtml(String s) {
        String escaped = "";
        if(null != s) {
            escaped = StringEscapeUtils.escapeHtml(s);
            escaped = escaped.replaceAll(" ","&nbsp;"); // HERE
            escaped = escaped.replaceAll("'","&#39;");
            escaped = escaped.replaceAll("\\\\","&#92;");
            escaped = escaped.replaceAll("/","&#47;");
        }
        return escaped;
    }

StringEscapeUtils.escapeHtml, on the other hand, does not escape ' '. Here is the example from their site:

"bread" & "butter" 

becomes

&quot;bread&quot; &amp; &quot;butter&quot;

This means StringEscapeUtils.escapeHtml leaves spaces as they are.

If you remove escaped = escaped.replaceAll(" ","&nbsp;"); from escapeHtml, unescaped and myUnescaped match!
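When two strings print identically but equals returns false, walking them character by character usually exposes the culprit. The following is a minimal sketch using only the JDK. The class and method names (StringDiff, firstDifference, describe) are made up for illustration and are not part of Commons Lang, and the two sample strings stand in for string and myUnescaped.

```java
final class StringDiff {

    /** Returns the index of the first differing char, or -1 if the strings are equal. */
    static int firstDifference(String a, String b) {
        int common = Math.min(a.length(), b.length());
        for (int i = 0; i < common; i++) {
            if (a.charAt(i) != b.charAt(i)) {
                return i;
            }
        }
        // Either both strings are equal, or one is a prefix of the other.
        return a.length() == b.length() ? -1 : common;
    }

    /** Formats the char at the given index as a code unit, e.g. U+00A0. */
    static String describe(String s, int index) {
        return String.format("U+%04X", (int) s.charAt(index));
    }

    public static void main(String[] args) {
        String original = " a";          // ordinary space, U+0020
        String roundTripped = "\u00A0a"; // non-breaking space, U+00A0
        int i = firstDifference(original, roundTripped);
        // Prints: first difference at index 0: U+0020 vs U+00A0
        System.out.println("first difference at index " + i + ": "
                + describe(original, i) + " vs " + describe(roundTripped, i));
    }
}
```

Note that describe assumes the index is valid in the string it is given, so check the lengths first when one string is a prefix of the other.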

【Comments】:

    【Solution 2】:

    Following Apurv's answer, I analyzed the byte arrays of the strings.

    String:        32,  32,  32,  32,  52,  45,  83, 112,  97,  99, 101, 115,  32,  32,  32,  32,  44,  34,  68, 111, 117,  98, 108, 101,  32,  81, 117, 111, 116, 101,  34,  44,  32,  39,  83, 105, 110, 103, 108, 101,  32,  81, 117, 111, 116, 101,  39,  44,  32,  92,  66,  97,  99, 107,  45,  83, 108,  97, 115, 104,  92,  44,  32,  47,  70, 111, 114, 119,  97, 114, 100,  32,  83, 108,  97, 115, 104,  47,  32
    unescaped :    32,  32,  32,  32,  52,  45,  83, 112,  97,  99, 101, 115,  32,  32,  32,  32,  44,  34,  68, 111, 117,  98, 108, 101,  32,  81, 117, 111, 116, 101,  34,  44,  32,  39,  83, 105, 110, 103, 108, 101,  32,  81, 117, 111, 116, 101,  39,  44,  32,  92,  66,  97,  99, 107,  45,  83, 108,  97, 115, 104,  92,  44,  32,  47,  70, 111, 114, 119,  97, 114, 100,  32,  83, 108,  97, 115, 104,  47,  32
    myUnescaped:  -96, -96, -96, -96,  52,  45,  83, 112,  97,  99, 101, 115, -96, -96, -96, -96,  44,  34,  68, 111, 117,  98, 108, 101, -96,  81, 117, 111, 116, 101,  34,  44, -96,  39,  83, 105, 110, 103, 108, 101, -96,  81, 117, 111, 116, 101,  39,  44, -96,  92,  66,  97,  99, 107,  45,  83, 108,  97, 115, 104,  92,  44, -96,  47,  70, 111, 114, 119,  97, 114, 100, -96,  83, 108,  97, 115, 104,  47, -96
    

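    It seems that in myUnescaped the spaces have become byte -96 instead of 32. -96 is the signed-byte form of 0xA0, the non-breaking space (U+00A0), so it is not an ASCII character at all.

    So I wrote an unescapeHtml method as follows. It first replaces &nbsp; with a space and then uses StringEscapeUtils to unescape the HTML.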
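The -96 value can be reproduced with the JDK alone. This is a small sketch, assuming the byte dump above came from a single-byte charset such as ISO-8859-1 (the answer does not say which encoding was used; in UTF-8 a non-breaking space would be two bytes). NbspBytes and firstLatin1Byte are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

final class NbspBytes {

    /** Encodes the string as ISO-8859-1 (an assumed charset) and returns its first byte. */
    static byte firstLatin1Byte(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1)[0];
    }

    public static void main(String[] args) {
        byte space = firstLatin1Byte(" ");
        byte nbsp = firstLatin1Byte("\u00A0");
        System.out.println(space);        // 32
        System.out.println(nbsp);         // -96, because Java bytes are signed
        System.out.println(nbsp & 0xFF);  // 160, i.e. 0xA0
        // Java does not treat U+00A0 as whitespace, although it is a space character.
        System.out.println(Character.isWhitespace('\u00A0')); // false
        System.out.println(Character.isSpaceChar('\u00A0'));  // true
    }
}
```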

    public static String unescapeHtml(String s) {
        String unescaped = "";
        if(null != s) {
            unescaped = s.replaceAll("&nbsp;", " ");
            unescaped = StringEscapeUtils.unescapeHtml(unescaped);
        }
        return unescaped;
    }
    

    Then I obtained myUnescaped with the following code:

    String myUnescaped = unescapeHtml(myEscaped);
    

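    This gives me a myUnescaped string that is equal to both string and unescaped.

    Alternatively, I replaced ' ' with &#32; instead of &nbsp;. This way I don't need to write the unescapeHtml method. The updated escapeHtml method is as follows.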

    public static String escapeHtml(String s) {
        String escaped = "";
        if(null != s) {
            escaped = StringEscapeUtils.escapeHtml(s);
            escaped = escaped.replaceAll(" ","&#32;");    //updated line 
            escaped = escaped.replaceAll("'","&#39;");
            escaped = escaped.replaceAll("\\\\","&#92;");
            escaped = escaped.replaceAll("/","&#47;");
        }
        return escaped;
    }
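A side note on the replacement calls: replaceAll treats its first argument as a regular expression, which is why a single backslash has to be written as "\\\\". String.replace does a literal replacement and avoids that. Below is a sketch of just the extra replacement step, using only the JDK. The numeric entities (&#32;, &#39;, &#92;, &#47;) are an assumption about the intended output, and ExtraEscapes and extraEscapes are hypothetical names.

```java
final class ExtraEscapes {

    /** Replaces space, single quote, backslash and slash with numeric entities (assumed choices). */
    static String extraEscapes(String s) {
        if (s == null) {
            return "";
        }
        return s.replace(" ", "&#32;")
                .replace("'", "&#39;")
                .replace("\\", "&#92;")  // "\\" is one literal backslash; no regex escaping needed
                .replace("/", "&#47;");
    }

    public static void main(String[] args) {
        // Prints: a&#32;&#39;b&#39;&#32;&#92;c&#92;&#32;&#47;d&#47;
        System.out.println(extraEscapes("a 'b' \\c\\ /d/"));
    }
}
```

In the real method this step would run after StringEscapeUtils.escapeHtml, as in the code above. None of the replacement strings contains a space, quote, backslash or slash, so the order of the four calls does not matter.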
    

    【Comments】:
