'float' Type Error in Python, pandas

[Question title]: 'float' Type Error Python, pandas
[Posted]: 2018-05-07 23:26:37
[Question]:

While iterating over a DataFrame column that holds unicode string data (dtype object), I get the following error:

in text_pre_processing(text)  
2 # removing punctuation  
3 #text = text1(r'\n',' ', regex=True)  
----> 4 text1 = [char for char in text if char not in string.punctuation]  
5 text1 = ''.join(text1)  


**TypeError: 'float' object is not iterable**

Function used:

import string
from nltk.corpus import stopwords  # assumed: stopwords.words('english') matches NLTK's API

def text_pre_processing(text):
    # removing punctuation
    #text1 = text1(r'\n',' ', regex=True)
    text1 = [char for char in str(text) if char not in string.punctuation]
    text1 = ''.join(text1)

    # removing all the stop words from corpus

    #return text.split()
    return [word for word in text1.split() if word not in stopwords.words('english')]

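I tried to check whether the column passed to the function contains any float values (rows that hold only a float instead of a sentence), but I could not, because pandas reports both alphanumeric and alphabetic values as dtype "object", and explicit type casting did not work.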
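One way to look past the column-level "object" dtype is to inspect the Python type of each cell. The sketch below uses made-up sample data (the missing value in the second row is an assumption added to reproduce the situation); in pandas a missing value in an object column is typically NaN, which is itself a float.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the NaN row is an assumption, not taken from the question
col = pd.Series(
    ["this is a good movie...#", np.nan, "this is a bad movie $...."],
    dtype=object,
    name="Column2",
)

# Count cells by the name of their Python type (e.g. 'str', 'float')
type_counts = col.map(lambda v: type(v).__name__).value_counts()
print(type_counts)

# Keep only the cells that are not str; these are the ones that break iteration
is_str = col.map(lambda v: isinstance(v, str))
non_str = col[~is_str]
print(non_str)
```

If `non_str` is not empty, its index shows which rows would raise the TypeError when iterated character by character.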

Does anyone know what is going wrong?

I am using this function as part of the analyzer for a Naive Bayes algorithm.

Data: column 1 is the index

Column2

this is a good movie...#    

this is a bad movie $....     

this #movie was good ;) but some scenes were exaggerating

Expected output:

[this, good, movie]    
[this, bad, movie ]    
[this, movie, good, some, scenes, were, exaggerating]

[Question comments]:

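You can wrap text in str: [char for char in str(text) if char not in string.punctuation]
Why are you iterating over the column? I smell an XY problem. Please show your data and your expected output. Performance-wise, iterating is the worst thing you can do with a DataFrame. I'm 99% sure pd.Series.str.replace is a better fit for your problem.
@hoefling I tried this but it still didn't work... I also tried explicitly converting the column to string: D1['column'] = D1['column'].astype(str)
@cᴏʟᴅsᴘᴇᴇᴅ I made some changes to the question; I hope it is clear now.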
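The pd.Series.str.replace suggestion in the comments could look roughly like the sketch below. The sample rows are modeled on the question's data, the None row and the variable names are assumptions, and stop-word removal is left out so the example runs without extra downloads.

```python
import re
import string

import pandas as pd

# Hypothetical sample data modeled on the question's Column2; the None row is an assumption
col = pd.Series(
    ["this is a good movie...#", "this is a bad movie $....", None],
    dtype=object,
)

# Character class that matches any single ASCII punctuation character
punct_pattern = "[" + re.escape(string.punctuation) + "]"

cleaned = (
    col.fillna("")                                # missing values become empty strings
       .str.replace(punct_pattern, "", regex=True)  # drop punctuation, no Python-level loop
       .str.split()                               # split on whitespace into word lists
)
print(cleaned.tolist())
```

Filling missing values before using the `.str` accessor keeps every cell a string, so no float ever reaches the text-processing step.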

Tags: python string machine-learning scikit-learn typeerror

[Solution 1]:

You need to convert the float to a string:

>>> str(3.14159)
'3.14159'
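Calling str() does avoid the TypeError, but note that a missing value (NaN) would then turn into the literal text 'nan'. A minimal sketch of a guard, assuming missing cells should yield an empty string; the helper name strip_punctuation is hypothetical and not part of the question's code:

```python
import math
import string


def strip_punctuation(text):
    """Return text without ASCII punctuation; NaN becomes an empty string."""
    # NaN is a float, so it has to be caught before str() turns it into 'nan'
    if isinstance(text, float) and math.isnan(text):
        return ""
    return "".join(ch for ch in str(text) if ch not in string.punctuation)


print(str(float("nan")))                       # prints nan
print(repr(strip_punctuation(float("nan"))))   # prints ''
print(strip_punctuation("this is a good movie...#"))
```

Such a helper could be applied per cell, for example with Series.map, before the stop-word filtering step.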

[Comments]: