【Posted】: 2019-02-11 15:35:20
【Question】:
I have a text file with the following content:
....
{"emojiCharts":{"emoji_icon":"\u2697","repost": 3, "doc": 3, "engagement": 1184, "reach": 6734, "impression": 44898}}
{"emojiCharts":{"emoji_icon":"\U0001f924","repost": 11, "doc": 11, "engagement": 83, "reach": 1047, "impression": 6981}}
....
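Some of the emojis are written in \uhhhh format and others in \Uhhhhhhhh format.
Is there a way to encode/decode the text so that the emojis are displayed? If the file contains only \Uhhhhhhhh escapes, everything works fine.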
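For illustration, here is a minimal sketch of how both escape forms can be decoded in one pass. It assumes Python 3 and that the file content is available as bytes; the helper name decode_escapes and the sample line are made up for this example and are not part of the original data.

```python
def decode_escapes(raw_bytes):
    """Turn literal \\uhhhh and \\Uhhhhhhhh escapes into real characters.

    Hypothetical helper: the raw_unicode_escape codec handles both the
    4-digit and the 8-digit escape form.
    """
    return raw_bytes.decode("raw_unicode_escape")


# Made-up sample that mixes both escape styles, as in the question.
sample = b'{"a":"\\u2697","b":"\\U0001f924"}'
decoded = decode_escapes(sample)
```

Any bytes that are not part of an escape are interpreted as Latin-1 by this codec, so the sketch is only safe for input that is otherwise plain ASCII.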
So far I have transformed the file contents like this:
insightData.decode("raw_unicode_escape").encode('utf-16', 'surrogatepass').decode('utf-16').encode("raw_unicode_escape").decode("latin_1")
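The UTF-16 round trip in that expression is a known way to merge surrogate halves into a single code point. A minimal sketch of just that step, assuming Python 3 and assuming the text contains an emoji written as two \uhhhh surrogate escapes (the sample lines in the question do not show this case); join_surrogates is a hypothetical helper name:

```python
def join_surrogates(text):
    """Merge UTF-16 surrogate pairs in a str into single code points."""
    # 'surrogatepass' lets the lone surrogates through the encoder;
    # the UTF-16 decoder then reads each pair as one character.
    return text.encode("utf-16", "surrogatepass").decode("utf-16")


pair = "\ud83e\udd24"          # two surrogate halves of U+1F924
joined = join_surrogates(pair)  # a single code point
```

Text that already holds full code points passes through this step unchanged.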
To display the emojis, I need to use this:
insightData.decode("raw_unicode_escape").encode('utf-16', 'surrogatepass').decode('utf-16')
But it raises an error:
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2600' in position 30: ordinal not in range(128)
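An error of this kind typically means the decoded text was encoded with the ascii codec somewhere along the way; under Python 2.7 this happens implicitly when a unicode object is written to a file opened with the built-in open(). The sketch below reproduces the same class of failure with an explicit encode and shows that encoding as UTF-8 succeeds. The sample string is an assumption chosen for illustration.

```python
text = u"\u2600 \U0001f924"  # already-decoded text containing non-ASCII characters

try:
    text.encode("ascii")      # what an implicit ASCII encode attempts
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True       # same exception type as in the traceback

# An explicit UTF-8 encode can represent every character in the text.
utf8_bytes = text.encode("utf-8")
```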
Solution:
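with open(OUTPUT, "r") as infileInsight:
    # Python 2.7: read() returns a byte string; decoding turns both \uhhhh
    # and \Uhhhhhhhh escapes into real characters.
    insightData = infileInsight.read().decode('raw_unicode_escape')
# Encode explicitly as UTF-8 so no implicit ASCII encoding happens on write.
with open(OUTPUT, "w+") as outfileInsight:
    outfileInsight.write(insightData.encode('utf-8'))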
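The same idea can be sketched as a file-to-file conversion that lets the file object do the UTF-8 encoding. This is an illustration rather than the asker's code: the function name convert_escapes_to_utf8 and its parameters are hypothetical, and it assumes the source file is plain ASCII apart from the escapes.

```python
import io


def convert_escapes_to_utf8(src_path, dst_path):
    """Read a file with literal \\u/\\U escapes and write it out as UTF-8 text."""
    # Binary mode: the codec needs the raw bytes, not pre-decoded text.
    with io.open(src_path, "rb") as infile:
        text = infile.read().decode("raw_unicode_escape")
    # Passing the encoding explicitly avoids any platform default codec.
    with io.open(dst_path, "w", encoding="utf-8") as outfile:
        outfile.write(text)
    return text
```

Writing to a separate destination path keeps the original file intact if the decode step fails partway through.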
【Comments】:
- When does the UnicodeEncodeError occur? When executing print in the Python console? Which Python version? Which operating system?
- @tzot When I try to write to the file. Python 2.7, Windows 10.
Tags: python utf-8 decode encode emoji