【Question title】:Getting "AttributeError: 'float' object has no attribute 'lower'"
【Posted】:2021-03-17 00:55:15
【Question】:

I have a dataset of user comments and ratings. I am preprocessing this dataset, but I get the error shown below. How can I fix it?

    import string
    from nltk.corpus import stopwords

    def DataCleaning(metin):
        numbers = "0123456789"
        lower_case = metin.lower()
        # Drop punctuation characters, then digits
        punct_removed = [char for char in lower_case if char not in string.punctuation]
        punct_removed = [char for char in punct_removed if char not in numbers]
        punct_removed_join = ''.join(punct_removed)
        # Split into words and drop English stop words
        punct_removed_join_clean = [word for word in punct_removed_join.split()
                                    if word not in stopwords.words('english')]
        return punct_removed_join_clean


    otel_verileri["reviews.text"] = otel_verileri["reviews.text"].apply(DataCleaning)
    otel_verileri["reviews.text"].tolist()


OUTPUT:
AttributeError                            Traceback (most recent call last)
<ipython-input-56-a80b269d8bbe> in <module>()
      1 
----> 2 otel_verileri["reviews.text"] = otel_verileri["reviews.text"].apply(DataCleaning)
      3 otel_verileri["reviews.text"].tolist()

1 frames
pandas/_libs/lib.pyx in pandas._libs.lib.map_infer()

<ipython-input-48-748ef67e84ac> in DataCleaning(metin)
      1 def DataCleaning(metin):
      2  numbers = "0123456789"
----> 3  lower_case=metin.lower()
      4  punct_removed = [char for char in lower_case if char not in string.punctuation]
      5  punct_removed=[char for char in punct_removed if char not in numbers]

AttributeError: 'float' object has no attribute 'lower'

【Comments】:

Tags: nlp tokenize data-cleaning sentiment-analysis preprocessor


【Solution 1】:

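I'm guessing you are using the pandas library. I don't know whether you are reading an Excel file, but I will assume so.

The error means that at least one value in the column is a float rather than a str; in pandas that is typically a missing value (NaN) or a cell that was parsed as a number. Pandas infers column types on its own. You can suppress that and require a specific column to be read as str like this: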

    otel_verileri = pd.read_excel(file_name, converters={'reviews.text' : str})

(source: another answer on SO)
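
As a complement to converting the column on load, the cleaning function itself can guard against non-string values. The following is only a minimal sketch, assuming the offending floats are missing values (NaN) or numeric cells. The names `clean_text` and `STOP_WORDS` are hypothetical, and the small stop-word set is a stand-in for `stopwords.words('english')` so that the example runs without NLTK data.

```python
import string

# Hypothetical stand-in for stopwords.words('english'); swap in the NLTK list if available.
STOP_WORDS = {"the", "a", "an", "and", "is", "was", "it"}

# Translation table that deletes punctuation and digits in a single pass.
_STRIP = str.maketrans("", "", string.punctuation + string.digits)


def clean_text(value):
    """Return the cleaned tokens of value, or an empty list if value is not a string."""
    # Missing cells arrive as float('nan'), which has no .lower() method.
    if not isinstance(value, str):
        return []
    cleaned = value.lower().translate(_STRIP)
    return [word for word in cleaned.split() if word not in STOP_WORDS]


# A string review, a missing value, and a numeric cell.
samples = ["The room was GREAT, 10/10!", float("nan"), 4.5]
results = [clean_text(s) for s in samples]
print(results)
```

With the DataFrame from the question, this would presumably be applied as `otel_verileri["reviews.text"].apply(clean_text)`. Returning an empty list keeps the row count unchanged; dropping the rows first with `dropna(subset=["reviews.text"])` is another option if empty reviews are not needed.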

【Comments】:
