【Question title】: UnknownHostException Android utf-8 encoding URL
【Posted】: 2016-06-16 15:08:24
【Question description】:
String url = String.format("http://%s.jpg.to", URLEncoder.encode("свинья", "utf-8"));
new URL(url).openStream();
Document doc = Jsoup.connect(url).get();

I want to read a web page whose URL contains Russian characters, but I get an exception (Android 4.1.1):

W/System.err: java.net.UnknownHostException: http://%D1%81%D0%B2%D0%B8%D0%BD%D1%8C%D1%8F.jpg.to
W/System.err:     at libcore.net.http.HttpConnection$Address.<init>(HttpConnection.java:283)
W/System.err:     at libcore.net.http.HttpConnection.connect(HttpConnection.java:128)
W/System.err:     at libcore.net.http.HttpEngine.openSocketConnection(HttpEngine.java:315)
W/System.err:     at libcore.net.http.HttpEngine.connect(HttpEngine.java:310)
W/System.err:     at libcore.net.http.HttpEngine.sendSocketRequest(HttpEngine.java:289)
W/System.err:     at libcore.net.http.HttpEngine.sendRequest(HttpEngine.java:239)
W/System.err:     at libcore.net.http.HttpURLConnectionImpl.connect(HttpURLConnectionImpl.java:80)
W/System.err:     at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:563)
W/System.err:     at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:540)
W/System.err:     at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:227)
W/System.err:     at org.jsoup.helper.HttpConnection.get(HttpConnection.java:216)
W/System.err:     at test.jpgto.MainActivity$RetrieveImageTask.doInBackground(MainActivity.java:63)
W/System.err:     at test.jpgto.MainActivity$RetrieveImageTask.doInBackground(MainActivity.java:49)
W/System.err:     at android.os.AsyncTask$2.call(AsyncTask.java:287)
W/System.err:     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:305)
W/System.err:     at java.util.concurrent.FutureTask.run(FutureTask.java:137)
W/System.err:     at android.os.AsyncTask$SerialExecutor$1.run(AsyncTask.java:230)
W/System.err:     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1076)
W/System.err:     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:569)
W/System.err:     at java.lang.Thread.run(Thread.java:856)

But a link such as http://2.jpg.to/ works fine. What am I doing wrong?

【Discussion】:

  • What does URLEncoder.encode("свинья", "utf-8") print on your side?
  • %D1%81%D0%B2%D0%B8%D0%BD%D1%8C%D1%8F
  • And the link works fine in a browser: http://%D1%81%D0%B2%D0%B8%D0%BD%D1%8C%D1%8F.jpg.to

Tags: java android url unicode


【Solution 1】:
            String url = String.format("http://%s.jpg.to", IDN.toASCII("свинья"));

IDN.toASCII converts the host label to its Punycode form, so the link becomes http://xn--b1ampn2ds.jpg.to
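To make the difference concrete, here is a minimal sketch comparing the two encodings of the same host label. The class and method names (HostEncodingDemo, percentEncode, toAsciiLabel) are made up for illustration. Only URLEncoder.encode and IDN.toASCII come from the thread.

```java
import java.io.UnsupportedEncodingException;
import java.net.IDN;
import java.net.URLEncoder;

class HostEncodingDemo {
    // Percent-encoding, as used in the question; suited to query strings, not host names.
    static String percentEncode(String label) {
        try {
            return URLEncoder.encode(label, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // ASCII-compatible (Punycode) form of an internationalized host label.
    static String toAsciiLabel(String label) {
        return IDN.toASCII(label);
    }

    public static void main(String[] args) {
        String label = "свинья";
        // Host that DNS cannot resolve: the label is percent-encoded.
        System.out.println(String.format("http://%s.jpg.to", percentEncode(label)));
        // Host that DNS can resolve: the label starts with the "xn--" Punycode prefix.
        System.out.println(String.format("http://%s.jpg.to", toAsciiLabel(label)));
    }
}
```

A browser performs the Punycode conversion itself, which is probably why the percent-encoded link appeared to work there but not through HttpURLConnection or Jsoup.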

【Discussion】:

    【Solution 2】:

    What happens when you simply put the characters in the URL as they are? For example, try something like this:

    String host = "свинья";    
    //here we now do string-formatting and then call the convertUrlToPunycodeIfNeeded which uses IDN
    String url= convertUrlToPunycodeIfNeeded(String.format("http://%s.jpg.to", host));
    //then simply use the URL
    new URL(url).openStream();
    Document doc = Jsoup.connect(url).get();
    

    Here is code showing how to use java.net.IDN in your case:

        // Converts a non-ASCII URL to its ASCII-compatible (Punycode) form.
        // Requires java.net.IDN and java.nio.charset.Charset.
        public static String convertUrlToPunycodeIfNeeded(String url) {
            if (!Charset.forName("US-ASCII").newEncoder().canEncode(url)) {
                if (url.toLowerCase().startsWith("http://")) {
                    url = "http://" + IDN.toASCII(url.substring(7));
                } else if (url.toLowerCase().startsWith("https://")) {
                    url = "https://" + IDN.toASCII(url.substring(8));
                } else {
                    url = IDN.toASCII(url);
                }
            }
            return url;
        }
    

    I found this good example here - Example 1
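The helper above passes everything after the scheme to IDN.toASCII, path and query included. As a hedged alternative, the sketch below parses the URL first and converts only the host. PunycodeUrl and withAsciiHost are hypothetical names, and the sketch assumes java.net.URL accepts a non-ASCII host at parse time. It drops any user info and #fragment.

```java
import java.net.IDN;
import java.net.MalformedURLException;
import java.net.URL;

class PunycodeUrl {
    // Hypothetical helper: returns the URL with only its host converted to Punycode.
    // URL.getFile() covers the path and query only, so a #fragment is not carried over.
    static String withAsciiHost(String spec) {
        try {
            URL parsed = new URL(spec);
            String asciiHost = IDN.toASCII(parsed.getHost());
            // Port -1 means "use the protocol default", so no port is printed.
            return new URL(parsed.getProtocol(), asciiHost, parsed.getPort(), parsed.getFile()).toString();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a valid URL: " + spec, e);
        }
    }
}
```

The result can then be passed to Jsoup.connect(...) or new URL(...).openStream() as in the question. A URL whose host is already ASCII, such as http://2.jpg.to/, comes back unchanged.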

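    【Discussion】:

    • I'll try this on a real device... please wait 2 minutes
    • The same exception occurs on a real device too
    • I made some changes to the code - please try it and let me know whether it helps or whether the error persists.
    • You can simply write "utf-8". By the way, is this an accurate transliteration of the characters: svin’ja
    • The :// characters get encoded =) java.net.MalformedURLException: Protocol not found: http%3A%2F%2F%D1%80.jpg.to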
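The last comment can be reproduced with a small sketch, assuming the whole URL string was passed to URLEncoder.encode. WholeUrlEncodingDemo and its methods are illustrative names only.

```java
import java.io.UnsupportedEncodingException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLEncoder;

class WholeUrlEncodingDemo {
    // Percent-encodes the entire string, including the "://" after the scheme.
    static String encodeWhole(String url) {
        try {
            return URLEncoder.encode(url, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // True if java.net.URL can parse the string, false if it is rejected as malformed.
    static boolean isParseable(String spec) {
        try {
            new URL(spec);
            return true;
        } catch (MalformedURLException e) {
            return false;
        }
    }
}
```

Once the colon and slashes are encoded, the parser can no longer find a scheme, so URLEncoder is best kept for individual query parameter values rather than whole URLs.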