[Posted]: 2017-02-25 13:47:13
[Question]:
How do I convert a pandas DataFrame to unicode?
`messages=pandas.read_csv('data/SMSSpamCollection',sep='\t',quoting=csv.QUOTE_NONE,names=["label", "message"])
def split_into_tokens(message):
    message = unicode(message, 'utf8') # convert bytes into proper unicode
    return TextBlob(message).words
messages.head().apply(split_into_tokens(messages))`
This raises the following error:
Traceback (most recent call last):
  File "minor.py", line 46, in <module>
    messages.head().apply(split_into_tokens(messages))
  File "minor.py", line 42, in split_into_tokens
    message = unicode(message, 'utf8') # convert bytes into proper unicode
TypeError: coercing to Unicode: need string or buffer, DataFrame found
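[Comments]:
-
Try messages.head().apply(split_into_tokens), passing the function itself rather than calling it. Also make sure apply is not run on the whole DataFrame: you need df['column_name'].apply(some_function).
-
Then I'll add that as an answer.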
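A minimal sketch of what the comment suggests, under a few assumptions: it targets Python 3 (as the tag indicates, although the `unicode()` call and the error message in the traceback come from Python 2), it uses a small made-up DataFrame instead of the `SMSSpamCollection` file, and it replaces `TextBlob(message).words` with a plain `str.split()` so that it runs without extra dependencies.

```python
import pandas as pd


def split_into_tokens(message):
    # On Python 3, str is already unicode; only bytes need decoding.
    if isinstance(message, bytes):
        message = message.decode("utf8")
    # Stand-in for TextBlob(message).words (an assumption made to keep
    # this sketch dependency-free).
    return message.split()


# Made-up sample data using the same column names as the question.
messages = pd.DataFrame({
    "label": ["ham", "spam"],
    "message": ["Ok lar, joking wif u", b"Free entry in a weekly comp"],
})

# Pass the function object (no parentheses) and apply it to a single column,
# so it is called once per message instead of once with the whole DataFrame.
tokens = messages["message"].head().apply(split_into_tokens)
print(tokens.tolist())
```

The original call, `apply(split_into_tokens(messages))`, runs `split_into_tokens` immediately with the entire DataFrame as its argument, which is why the traceback reports `DataFrame found`. Selecting the `message` column first gives a Series, and `Series.apply` then hands each cell to the function one at a time.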
Tags: python-3.x pandas