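【Posted】: 2021-05-20 02:55:57
【Question】:
I am doing some NLP work and am trying to use groupby to make a POST request inside a lambda function and get a JSON object back as the response. Unfortunately, the result comes out as NaN. I need this so that I can add the response fields as new columns after "exploding" it.
Custom function: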
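import time
import requests

def posTagger(text):
    # NOTE: `title` below is not the `text` parameter; it resolves to the
    # module-level `title` list defined further down.
    post = {"text": title}
    endpoint = 'http://localhost:8001/api/postagger'
    r = requests.post(endpoint, json=post)  # send the payload as a JSON body
    r = r.json()  # decode the JSON response
    time.sleep(1)  # pause one second between requests
    return {"title": title, "result": r}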
Return value of posTagger:
[
  {
    "text": "Contemporary Modern Soft Area Rugs Nonslip",
    "terms": [
      {
        "text": "Contemporary",
        "penn": "JJ",
        "tags": ["Adjective"]
      },
      {
        "text": "Modern",
        "penn": "NNP",
        "tags": ["ProperNoun", "Noun", "Singular"]
      },
      {
        "text": "Soft",
        "penn": "NNP",
        "tags": ["ProperNoun", "Noun", "Singular"]
      },
      {
        "text": "Area",
        "penn": "NN",
        "tags": ["Singular", "Noun", "ProperNoun"]
      },
      {
        "text": "Rugs",
        "penn": "NNP",
        "tags": ["ProperNoun", "Noun", "Plural"]
      },
      {
        "text": "Nonslip",
        "penn": "NNP",
        "tags": ["ProperNoun", "Noun", "Singular"]
      }
    ]
  }
]
DataFrame:
import pandas as pd

title = [
    'Contemporary Modern Soft Area Rugs Nonslip Velvet Home Room Carpet Floor Mat Rug',
    'Traditional Distressed Area Rug 8x10 Large Rugs for Living Room 5x8 Gray Ivory',
    'Shaggy Area Rugs Fluffy Tie-Dye Floor Soft Carpet Living Room Bedroom Large Rug'
]
df = pd.DataFrame(title, columns=['title'])
df
# Initial dataframe:
# title
# 0 Contemporary Modern Soft Area Rugs Nonslip...
# 1 Traditional Distressed Area Rug 8x10 Large...
# 2 Shaggy Area Rugs Fluffy Tie-Dye Floor Soft...
So, this is my groupby using .apply:
df['result'] = pd.DataFrame(df.groupby(['title']).apply(lambda x: posTagger(x)))
df
# Resulting DataFrame after **.apply**:
# title result
# 0 Contemporary Modern Soft Area Rugs Nonslip Vel... NaN
# 1 Traditional Distressed Area Rug 8x10 Large Rug... NaN
# 2 Shaggy Area Rugs Fluffy Tie-Dye Floor Soft Car... NaN
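A likely explanation for the NaN column is index alignment: a groupby result is labeled by the group key (here the title string), while df is labeled 0, 1, 2, so no labels match on assignment. The sketch below illustrates this under stated assumptions: the three short titles and the per_group Series are made-up stand-ins for the groupby output, and no HTTP request is made.

```python
import pandas as pd

df = pd.DataFrame({"title": ["rug a", "rug b", "rug c"]})

# Stand-in for a groupby(...).apply(...) result: one value per group,
# indexed by the group key (the title string), not by 0, 1, 2.
per_group = pd.Series({"rug a": "A", "rug b": "B", "rug c": "C"})

# Direct assignment aligns on index labels; none match, so every row is NaN.
misaligned = df.copy()
misaligned["result"] = per_group
print(misaligned["result"].isna().all())

# Looking the value up by title instead pairs each row with its result.
aligned = df.copy()
aligned["result"] = aligned["title"].map(per_group)
print(aligned["result"].tolist())
```

Mapping on the title column is only one way to re-align; resetting the index of the grouped result and merging on title would be another.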
So, this is my groupby using .transform:
df['result'] = pd.DataFrame(df.groupby(['title']).transform(lambda x: posTagger(x)))
df
# Resulting DataFrame after **.transform**:
# title result
# 0 Contemporary Modern Soft Area Rugs Nonslip Vel... {'title': ['Contemporary Modern Soft Area Rugs...
# 1 Traditional Distressed Area Rug 8x10 Large Rug... {'title': ['Contemporary Modern Soft Area Rugs...
# 2 Shaggy Area Rugs Fluffy Tie-Dye Floor Soft Car... {'title': ['Contemporary Modern Soft Area Rugs...
Note that the .transform result contains the same value repeated in every row. Why?
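Judging only from the code as posted, one probable contributor is that posTagger reads the name title rather than its text parameter, so every call sees the module-level list of all titles. The following minimal sketch isolates that name-resolution behavior; both function names and the two sample titles are hypothetical, and no request is sent.

```python
title = ["first title", "second title"]  # module-level list, as in the post

def tagger_as_posted(text):
    # Mirrors the posted function body: `title` resolves to the global
    # list above because no local variable named `title` exists.
    return {"title": title}

def tagger_using_argument(text):
    # Variant that uses the value actually passed in.
    return {"title": text}

print(tagger_as_posted("first title"))
print(tagger_using_argument("first title"))
```

The first call returns the whole list regardless of its argument, which would match the repeated `{'title': ['Contemporary Modern Soft Area Rugs...` value shown in the output.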
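- How can I take the return value of the custom function (an object with nested arrays) and add it, in exploded form, to the same DataFrame as new columns?
- Is it better to use .apply or .transform to achieve this?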
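For the two questions above, one possible approach (a sketch, not necessarily the best fit for the real service) is to skip groupby entirely, since each title is processed independently, and call the function once per row with Series.apply, then explode the nested terms. The fake_pos_tagger function below is a hypothetical offline stand-in that only imitates the response shape shown in the post; its dummy tags and the two sample titles are assumptions.

```python
import pandas as pd

def fake_pos_tagger(text):
    # Hypothetical offline stand-in for the HTTP call; returns the same
    # shape as the response shown in the post, with a dummy tag per word.
    terms = [{"text": w, "penn": "NN", "tags": ["Noun"]} for w in text.split()]
    return [{"text": text, "terms": terms}]

df = pd.DataFrame({"title": ["Soft Area Rug", "Floor Mat"]})

# Call the tagger once per title string and keep only the list of terms.
df["terms"] = df["title"].apply(lambda t: fake_pos_tagger(t)[0]["terms"])

# One row per term; the title is repeated for each of its terms.
exploded = df.explode("terms").reset_index(drop=True)

# Expand each term dict into its own columns and attach them to the rows.
term_cols = pd.json_normalize(exploded["terms"].tolist()).add_prefix("term_")
result = pd.concat([exploded.drop(columns="terms"), term_cols], axis=1)
print(result)
```

With the real endpoint, the lambda would call posTagger instead, assuming it is changed to use its argument and that the response keeps the list-of-one-object shape shown earlier.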
【Discussion】:
Tags: python pandas dataframe pandas-groupby