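【Posted】: 2019-03-08 08:31:28
【Question】:
I have a file in the following format, and I want to put each column into a list of lists, because the file contains more than one sentence. The output should look like this: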
[['Learning centre of The University of Lahore is established for professional development.'],
['These events, destroyed the bond between them.']]
The same goes for the Verb column. This is what I tried, but it puts everything into a single list instead of a list of lists:
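import numpy as np
import pandas

train_fn = "/content/data/wiki/wiki1.train.oie"
# Read the tab-separated file; keep empty strings instead of converting them to NaN
dfE = pandas.read_csv(train_fn, sep="\t",
                      header=0,
                      keep_default_na=False)
# One entry per row of the 'word' column, i.e. one entry per token
train_textEI = dfE['word'].tolist()
# Normalize whitespace inside each entry
train_textEI = [' '.join(t.split()) for t in train_textEI]
# Reshape to a column vector: each token ends up in its own one-element row
train_textEI = np.array(train_textEI, dtype=object)[:, np.newaxis]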
It outputs every word in its own list:
[['Learning'],['Center'],['of'],['The'],['University'],['of'],
['Lahore'],['is'],['established'],['for'],['the'],
['professional'],['development'],['.'],['These'],['events'],[','],
['destroyed'],['the'],['bond'],['between'],['them'],['.']]
【Comments】:
-
Do you need
df.groupby('Verb')['word'].apply(lambda x: [' '.join(x)]).tolist()?
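@jazrael But what if two consecutive sentences have the same verb? I think it would merge the two sentences. I tried splitting on wordId=0 instead, but I couldn't get it to work.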
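The concern about merged sentences can be reproduced with a small sketch. The data below is hypothetical; it assumes one token per row and that the Verb column repeats the sentence's verb on every row, which the question does not confirm.

```python
import pandas as pd

# Hypothetical sample: two sentences that share the same verb.
# Assumed layout: one token per row, Verb repeated on every row.
df = pd.DataFrame({
    "word_id": [0, 1, 2, 0, 1, 2],
    "word": ["He", "ran", ".", "She", "ran", "."],
    "Verb": ["ran"] * 6,
})

# Grouping by Verb puts all rows with the same verb into one group,
# so both sentences are joined into a single string.
merged = df.groupby("Verb")["word"].apply(lambda x: [" ".join(x)]).tolist()
print(merged)
```

Under these assumptions the result has one entry containing both sentences, so the Verb value alone cannot serve as a sentence boundary.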
-
So
df.groupby(df['word_id'].eq(0).cumsum())['word'].apply(lambda x: [' '.join(x)]).tolist()?
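A minimal sketch of how this suggestion behaves, using hypothetical data. It assumes the column is named word_id (as in the comment above) and that it restarts at 0 on the first token of each sentence.

```python
import pandas as pd

# Hypothetical sample: two sentences, one token per row.
# Assumption: word_id restarts at 0 at the start of every sentence.
df = pd.DataFrame({
    "word_id": [0, 1, 2, 0, 1, 2, 3],
    "word": ["These", "events", ".", "They", "destroyed", "it", "."],
})

# eq(0) marks the first token of each sentence; the cumulative sum of that
# boolean mask gives each row a sentence number: 1, 1, 1, 2, 2, 2, 2.
sentence_id = df["word_id"].eq(0).cumsum()

# Join the tokens of each sentence and wrap each result in its own list.
sentences = df.groupby(sentence_id)["word"].apply(lambda x: [" ".join(x)]).tolist()
print(sentences)
```

Because the grouping key is derived from token positions rather than from the Verb value, consecutive sentences stay separate even when they share a verb.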