【Posted】: 2021-04-16 05:21:01
【Question】:
I am trying to run this XGBoost model for text classification, but I am getting a name-definition error. Here is my code:
def preprocess(text_column):
"""
Function: This function aims to remove links, special
characters, symbols, stop words and thereafter
lemmatise each word in the sentence to transform
the dataset into something more usable for a
machine learning model.
Input: A text column
Returns: A text column (but transformed)
"""
new_review = []
for review in text_column:
text = re.sub("@\S+|https?:\S|[^A-Za-z0-9]+",'',str(review).lower()).strip()
text = [wnl.lemmatize(i) for i in text.split ('') if i not in stop_words]
new_review.append(''.join(text))
return new_review
# actually transforming the datasets
train['review'] = preprocess(train['review'])
test['review'] = preprocess(test['review'])
Error:
NameError Traceback (most recent call last)
<ipython-input-43-c0c3b2a57d42> in <module>()
1 new_review = []
----> 2 for review in text_column:
3 text = re.sub("@\S+|https?:\S|[^A-Za-z0-9]+",'',str(review).lower()).strip()
4 text = [wnl.lemmatize(i) for i in text.split ('') if i not in stop_words]
5 new_review.append(''.join(text))
NameError: name 'text_column' is not defined
Please let me know if there is anything I can do to fix this. Thanks.
【Comments】:
-
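The code should be indented so that it is part of the function; at the moment it is not. Because the loop runs at module level, `text_column` (the function's parameter) does not exist there, which is what raises the NameError.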
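To illustrate the comment above, here is a minimal sketch of the function with its body indented. It rests on a few assumptions that are not in the question: `STOP_WORDS` and `lemmatize` are hypothetical stand-ins for the asker's `stop_words` and `wnl.lemmatize` (so the snippet runs without NLTK data), the regex uses `https?:\S+` so the whole URL is dropped, and matches are replaced with a space and split on whitespace, since `text.split('')` as posted would raise `ValueError: empty separator`.

```python
import re

# Hypothetical stand-ins so the sketch runs on its own. In the question these
# would presumably be NLTK's stop word list and WordNetLemmatizer().lemmatize.
STOP_WORDS = {"the", "a", "is", "and"}


def lemmatize(word):
    # Placeholder that returns the word unchanged; swap in wnl.lemmatize.
    return word


def preprocess(text_column):
    """Remove links and symbols, drop stop words, and lemmatise each word."""
    new_review = []
    # Everything from here to `return` is indented one level, so it belongs
    # to the function and can see the `text_column` parameter.
    for review in text_column:
        # Assumption: replace matches with a space so words stay separated.
        text = re.sub(r"@\S+|https?:\S+|[^A-Za-z0-9]+", " ", str(review).lower()).strip()
        words = [lemmatize(w) for w in text.split() if w not in STOP_WORDS]
        new_review.append(" ".join(words))
    return new_review


# Example call with made-up reviews (not data from the question).
cleaned = preprocess(["The movie is GREAT!! https://example.com", "A dull, slow plot"])
print(cleaned)
```

With the body indented, the calls `train['review'] = preprocess(train['review'])` and `test['review'] = preprocess(test['review'])` from the question should work unchanged, provided `re`, `wnl`, and `stop_words` are defined before the function is called.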
Tags: python syntax xgboost nameerror