【Posted】:2021-06-26 17:32:34
【Question】:
I have a system that receives French text from a third party, but I am having trouble making it readable.
String frenchReceipt = "RETIR�E"; // The original Text should be "RETIRÉE"
I tried every combination I could think of to convert the string using the UTF-8 and ISO-8859-1 encodings:
String frenchReceipt = "RETIR�E"; // The original Text should be "RETIRÉE"
byte[] b1 = new String(frenchReceipt.getBytes()).getBytes("UTF-8");
System.out.println(new String(b1)); // RETIR�E
byte[] b2 = new String(frenchReceipt.getBytes()).getBytes("ISO-8859-1");
System.out.println(new String(b2)); // RETIR�E
byte[] b3 = new String(frenchReceipt.getBytes(), "UTF-8").getBytes();
System.out.println(new String(b3)); // RETIR?E
byte[] b4 = new String(frenchReceipt.getBytes(), "UTF-8").getBytes();
System.out.println(new String(b4)); //RETIR?E
byte[] b5 = new String(frenchReceipt.getBytes(), "ISO-8859-1").getBytes("UTF-8");
System.out.println(new String(b5)); //RETIR�E
byte[] b6 = new String(frenchReceipt.getBytes(), "UTF-8").getBytes("ISO-8859-1");
System.out.println(new String(b6)); //RETIR?E
byte[] b7 = new String(frenchReceipt.getBytes(), "UTF-8").getBytes("UTF-8");
System.out.println(new String(b7)); //RETIR�E
byte[] b8 = new String(frenchReceipt.getBytes(), "ISO-8859-1").getBytes("ISO-8859-1");
System.out.println(new String(b8)); //RETIR�E
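As you can see, nothing fixes the problem.
Please advise.
Update: the third-party partner confirmed that the data sent to my application is encoded as "ISO-8859-1".
【Comments】: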
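Given that the partner sends ISO-8859-1, a minimal sketch of decoding at the point of receipt might look like the following. It assumes the application can get at the raw bytes before any String is built from them. The class name, the `decode` helper, and the hard-coded byte array are hypothetical stand-ins for the real payload.

```java
import java.nio.charset.StandardCharsets;

public class DecodeLatin1Sketch {
    // Decode raw bytes with the charset the sender actually used.
    static String decode(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Hypothetical payload: "RETIRÉE" in ISO-8859-1, where É is the single byte 0xC9.
        byte[] raw = {0x52, 0x45, 0x54, 0x49, 0x52, (byte) 0xC9, 0x45};
        String text = decode(raw);
        // Compare against the escaped form so the check does not depend on the source file encoding.
        System.out.println(text.equals("RETIR\u00C9E"));
    }
}
```

The charset is named explicitly at the decoding step, so the result does not depend on the platform default, which is what the no-argument `getBytes()` and `new String(byte[])` calls rely on.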
-
What encoding does the console behind System.out use?
-
See stackoverflow.com/questions/6543548/…. The character � is encoded as EF BF BD, as mentioned in the answers there.
-
@mayamar The default text file encoding is "Cp1252". I also tried changing it to "UTF-8" and "ISO-8859-1", but that did not solve the problem.
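To illustrate the point about EF BF BD, here is a small sketch of how the character � (U+FFFD, the Unicode replacement character) behaves when it is re-encoded. The class name and the `hex` helper are made up for the example. The idea is that once � is in the String, the original byte is no longer there to recover, which would be consistent with none of the conversions above producing É.

```java
import java.nio.charset.StandardCharsets;

public class ReplacementCharSketch {
    // Render bytes as space-separated uppercase hex, e.g. "EF BF BD".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String replacement = "\uFFFD"; // the � character

        // In UTF-8 the replacement character takes three bytes.
        System.out.println(hex(replacement.getBytes(StandardCharsets.UTF_8))); // EF BF BD

        // ISO-8859-1 cannot represent U+FFFD, so getBytes substitutes '?' (0x3F).
        System.out.println(hex(replacement.getBytes(StandardCharsets.ISO_8859_1))); // 3F
    }
}
```

The second result matches the `RETIR?E` outputs in the question: encoding � to a charset that has no mapping for it yields a question mark, not the original É.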
Tags: java encoding utf-8 iso-8859-1