Decode unicode string in python

【Question title】: Decode unicode string in python
【Posted】: 2014-03-15 01:10:20
【Question】:

I want to decode the following string:

t\u028c\u02c8m\u0251\u0279o\u028a\u032f

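It is supposed to be the IPA for "tomorrow", as given in the JSON string returned by http://rhymebrain.com/talk?function=getWordInfo&word=tomorrow

My understanding is that it should work something like this: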

x = 't\u028c\u02c8m\u0251\u0279o\u028a\u032f'
print x.decode()

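I have tried the solutions from here, here, here and here (plus several others that seemed more or less applicable), as well as several permutations of their parts, but I can't get any of them to work.

Thanks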

【Comments】:

Tags: python unicode

【Answer 1】:

You need to put a u before your string (in Python 2.x, which you appear to be using) to indicate that it is a unicode string:

>>> x = u't\u028c\u02c8m\u0251\u0279o\u028a\u032f'  # note the u
>>> print x
tʌˈmɑɹoʊ̯

If you already have the string stored in a variable, you can convert it to unicode with the unicode constructor:

>>> s = 't\u028c\u02c8m\u0251\u0279o\u028a\u032f'  # your string has a unicode-escape encoding but is not unicode
>>> x = unicode(s, encoding='unicode-escape')
>>> print x
tʌˈmɑɹoʊ̯
>>> x
u't\u028c\u02c8m\u0251\u0279o\u028a\u032f'  # a unicode string
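
For readers on Python 3, where `unicode()` no longer exists and every `str` is already Unicode, the same conversion can be sketched roughly as below. This is a minimal sketch, not part of the original answer: it assumes Python 3, the names `raw`, `from_json` and `from_escape` are made up for illustration, and the hard-coded `raw` value stands in for the escaped text taken from the service's response.

```python
import json

# Stand-in for the text as it arrives: literal backslash-u escapes,
# not yet real Unicode characters (the raw string keeps the backslashes).
raw = r't\u028c\u02c8m\u0251\u0279o\u028a\u032f'

# Option 1: the escapes are ordinary JSON string escapes, so wrapping the
# text in quotes and handing it to a JSON parser decodes them.
from_json = json.loads('"' + raw + '"')

# Option 2: rough Python 3 counterpart of unicode(s, encoding='unicode-escape').
# The text is pure ASCII, so encode to bytes first, then decode the escapes.
from_escape = raw.encode('ascii').decode('unicode_escape')

print(from_json)
print(from_escape)
```

If the whole response is parsed with `json.loads` in the first place, the escapes are most likely decoded already and no extra step is needed. Option 2 is only appropriate for ASCII-only input like this one.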

【Discussion】:

@aspasia Glad I could help!
Your edit is exactly what I was just trying to figure out; thank you for being so thoughtful.