Getting "AttributeError: 'float' object has no attribute 'lower'"

【Question title】: Getting "AttributeError: 'float' object has no attribute 'lower'"
【Posted】: 2021-03-17 00:55:15
【Question】:

I have a dataset of user comments and ratings. I am preprocessing this dataset, but I get the error shown below. How can I fix it?

    def DataCleaning(metin):
        numbers = "0123456789"
        lower_case = metin.lower()
        punct_removed = [char for char in lower_case if char not in string.punctuation]
        punct_removed = [char for char in punct_removed if char not in numbers]
        punct_removed_join = ''.join(punct_removed)
        punct_removed_join_clean = [word for word in punct_removed_join.split()
                                    if word not in stopwords.words('english')]
        return punct_removed_join_clean


    otel_verileri["reviews.text"] = otel_verileri["reviews.text"].apply(DataCleaning)
    otel_verileri["reviews.text"].tolist()


OUTPUT:
AttributeError                            Traceback (most recent call last)
<ipython-input-56-a80b269d8bbe> in <module>()
      1 
----> 2 otel_verileri["reviews.text"] = otel_verileri["reviews.text"].apply(DataCleaning)
      3 otel_verileri["reviews.text"].tolist()

1 frames
pandas/_libs/lib.pyx in pandas._libs.lib.map_infer()

<ipython-input-48-748ef67e84ac> in DataCleaning(metin)
      1 def DataCleaning(metin):
      2  numbers = "0123456789"
----> 3  lower_case=metin.lower()
      4  punct_removed = [char for char in lower_case if char not in string.punctuation]
      5  punct_removed=[char for char in punct_removed if char not in numbers]

AttributeError: 'float' object has no attribute 'lower'

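【Comments】:

Please read Under what circumstances may I add "urgent" or other similar phrases to my question, in order to obtain faster answers? - the summary is that this is not an ideal way to address volunteers, and it is probably counterproductive. Please refrain from adding this to your questions.
Put assert isinstance(metin, str), repr(metin) above the line where the error occurs. Run it. See which value violates your expectations. For some reason, your reviews.text column does not contain only text. Is there some automatic conversion going on?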
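
Following the comment above, one quick way to see which values are not strings is to filter the column by type. This is only a minimal sketch on a made-up frame: the sample reviews are invented for illustration and stand in for the real data. Note that pandas represents missing cells as NaN, which is a float, so empty reviews are a likely (though unconfirmed) cause of this error.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the real otel_verileri frame.
otel_verileri = pd.DataFrame(
    {"reviews.text": ["Great hotel!", np.nan, "Room 12 was noisy.", 4.5]}
)

# Boolean mask: True where the cell is NOT a Python str.
not_str = ~otel_verileri["reviews.text"].apply(lambda v: isinstance(v, str))

# These are the rows that would make metin.lower() fail.
offenders = otel_verileri.loc[not_str, "reviews.text"]
print(offenders)

# Count the offending values by type name (for example 'float').
print(offenders.map(lambda v: type(v).__name__).value_counts())
```

If every offending value turns out to be NaN, the column simply has missing reviews rather than a type-inference problem.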

Tags: nlp tokenize data-cleaning sentiment-analysis preprocessor

【Solution 1】:

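I guess you are using the pandas library. I don't know whether you are reading an Excel file, but I will assume so.

Pandas likes to infer column types on its own. You can suppress that and require a specific column to contain only str like this: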

otel_verileri = pd.read_excel(file_name, converters={'reviews.text' : str})

(source: another answer on SO)
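
If the non-string values turn out to be missing cells (NaN), an alternative that does not depend on how the file was read is to guard inside the cleaning function itself. The following is a sketch under stated assumptions, not the asker's actual code: the name clean_text and the tiny STOPWORDS set are placeholders (the question uses stopwords.words('english') from NLTK), and returning an empty list for non-strings is just one possible policy.

```python
import string

# Placeholder stopword list (assumption); the question uses NLTK's
# stopwords.words('english') instead.
STOPWORDS = {"the", "a", "an", "was", "is", "and"}


def clean_text(metin):
    """Lowercase, drop punctuation and digits, then remove stopwords."""
    # NaN (a float) and other non-strings have no .lower(); skip them.
    if not isinstance(metin, str):
        return []
    drop = set(string.punctuation) | set(string.digits)
    kept = "".join(ch for ch in metin.lower() if ch not in drop)
    return [word for word in kept.split() if word not in STOPWORDS]


print(clean_text("The room was GREAT, 10/10!"))
print(clean_text(float("nan")))
```

It can be applied in the same way as in the question, for example otel_verileri["reviews.text"].apply(clean_text).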

【Comments】: