How to apply the list function to a textacy generator object in a pandas DataFrame

【Question title】: How to apply list function to textacy generator obj in pandas df
【Posted】: 2025-12-23 23:20:43
【Question description】:

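I am applying the `list` function to a pandas column that contains generator objects, in an attempt to display everything each generator yields. When I apply it, the column returns empty lists. `subject_verb_object_triples` is a textacy function (https://chartbeat-labs.github.io/textacy/_modules/textacy/extract.html)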

print(sp500news3)

date_publish    title
79944   2007-01-29 19:08:35 <generator object subject_verb_object_triples at 0x1a42713550>
181781  2007-12-14 19:39:06 <generator object subject_verb_object_triples at 0x1a42713410>
213175  2008-01-22 11:17:19 <generator object subject_verb_object_triples at 0x1a427135f0>
93554   2008-01-22 18:52:56 <generator object subject_verb_object_triples at 0x1a427135a0>

In []: sp500news3["title"].apply(list)
Out []: 79944     []
        181781    []
        213175    [] ...

The expected output is lists of tuples, like this:

[(Sky proposal, is, matter), (Sky proposal, is, Mays spokesman)], 
[(Women, lag, Intel report)], 
[(Amazon, expected, to unveil)], 
[(Goldman Sachs, raising, billion)], 
[(MHP, opens, books)], 
[(Disney, hurls, magic), (Disney, hurls, moolah)], 
[(Amazon, offering, loans), (Amazon, offering, to)], ....

How can I display the expected output in my DataFrame?

【Discussion】:

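What is the expected output? What exactly is the problem, and how can we help?
@BenoîtPilatte - I have updated the question
Could you use a lambda here? lambda x: [a for a in x]
You probably want sp500news3["title"].apply(lambda x: list(x))
@JoshFriedlander That still returns empty lists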
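
One possible explanation for the empty lists, which the thread does not confirm, is that the generators had already been consumed by an earlier call. A Python generator can be iterated only once; printing its repr does not consume it, but `list` does. The sketch below uses a hypothetical `triples` generator as a stand-in for `subject_verb_object_triples`:

```python
def triples():
    # Hypothetical stand-in for a generator such as subject_verb_object_triples
    yield ("Disney", "hurls", "magic")
    yield ("Disney", "hurls", "moolah")


gen = triples()

# Taking the repr (what a DataFrame printout shows) does not consume the generator
shown = repr(gen)

first_pass = list(gen)   # consumes every item
second_pass = list(gen)  # the generator is now exhausted

print(first_pass)
print(second_pass)
```

If this is what happened, a second `apply(list)` on the same column would return empty lists no matter how the call is written, which would match the comment above.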

Tags: python pandas nlp spacy textacy

【Solution 1】:

I have tested the code below and it works fine

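import textacy
import pandas as pd
from textacy import preprocessing
# Show full cell contents; None replaces the deprecated -1 setting
pd.options.display.max_colwidth = None
# Convert each cell to a string, strip punctuation, then normalize whitespace
df['<New Column name>'] = df['<Your column name that needs to be processed>'].apply(
    lambda x: preprocessing.normalize_whitespace(preprocessing.remove_punctuation(str(x)))
)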
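
Separately from the answer above, a common way to avoid storing generators in a DataFrame at all is to turn each one into a list at the moment the column is created. This is a minimal sketch under stated assumptions: `fake_triples`, `raw_title`, and `triples` are hypothetical names, and `fake_triples` only imitates the shape of what `subject_verb_object_triples` yields.

```python
import pandas as pd


def fake_triples(text):
    """Hypothetical stand-in for textacy's subject_verb_object_triples.

    Yields one (first word, second word, rest) tuple for titles
    that contain at least three words.
    """
    words = text.split()
    if len(words) >= 3:
        yield (words[0], words[1], " ".join(words[2:]))


df = pd.DataFrame({"raw_title": ["MHP opens books", "Women lag Intel report"]})

# Materialize each generator immediately, so the column holds lists, not generators
df["triples"] = df["raw_title"].apply(lambda t: list(fake_triples(t)))

# Lists can be read repeatedly without being consumed
first_read = df["triples"].tolist()
second_read = df["triples"].tolist()
print(first_read)
```

Because the column now holds plain lists, printing or re-reading it does not change its contents.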

【Discussion】: