【Posted】: 2019-11-23 03:36:08
【Question】:
Goal
Apply the deid_notes function to df
Background
I have a df similar to this example df
import pandas as pd
df = pd.DataFrame({'Text' : ['there are many different types of crayons',
                             'i like a lot of sports cars',
                             'the middle east has many camels '],
                   'P_ID': [1,2,3],
                   'Word' : ['crayons', 'cars', 'camels'],
                   'P_Name' : ['John', 'Mary', 'Jacob'],
                   'N_ID' : ['A1', 'A2', 'A3']
                   })
# rearrange columns
df = df[['Text','N_ID', 'P_ID', 'P_Name', 'Word']]
df
                    Text  N_ID  P_ID  P_Name     Word
0  many types of crayons    A1     1    John  crayons
1     i like sports cars    A2     2    Mary     cars
2        has many camels    A3     3   Jacob   camels
I use the following function to de-identify certain words in the Text column using NeuroNER (http://neuroner.com/)
def deid_notes(text):
    # use the predict function from NeuroNER to tag words to be de-identified
    ner_list = n1.predict(text)
    # n1.predict won't work in this toy example because the NeuroNER package needs to be installed (and installation is difficult)
    # but the output resembles this: [{'start': 1, 'end': 11, 'id': 1, 'tagged word': 'crayon'}]
    # use the start and end positions of tagged words to de-identify and replace with **BLOCK**
    if len(ner_list) > 0:
        parts_to_take = [(0, ner_list[0]['start'])] + [(first["end"]+1, second["start"]) for first, second in zip(ner_list, ner_list[1:])] + [(ner_list[-1]['end'], len(text)-1)]
        parts = [text[start:end] for start, end in parts_to_take]
        deid = '**BLOCK**'.join(parts)
    # if n1.predict does not identify any words to be de-identified, place NaN
    else:
        deid = 'NaN'
    return pd.Series(deid, index=['Deid'])
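The splice described in the comments (keep the text between tagged spans and put **BLOCK** where each span was) can be tried on its own, without NeuroNER. This is a minimal sketch, not the function above: `block_spans` is a hypothetical helper, and it assumes each span's 'end' is an exclusive character offset, as in Python slicing. Whether n1.predict follows that convention is not confirmed here.

```python
def block_spans(text, spans, mask='**BLOCK**'):
    """Replace each tagged span of text with mask (assumes exclusive 'end')."""
    pieces, cursor = [], 0
    # Walk the spans left to right, copying the untagged text between them
    for span in sorted(spans, key=lambda s: s['start']):
        pieces.append(text[cursor:span['start']])
        pieces.append(mask)
        cursor = span['end']
    # Keep whatever follows the last tagged span
    pieces.append(text[cursor:])
    return ''.join(pieces)

# Hypothetical tagger output for the two names in this sentence
example = block_spans('John met Mary',
                      [{'start': 0, 'end': 4}, {'start': 9, 'end': 13}])
print(example)  # **BLOCK** met **BLOCK**
```

Unlike deid_notes, this sketch returns the text unchanged when no spans are tagged rather than the string 'NaN'.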
Problem
I use the following code to apply the deid_notes function to my df
fx = lambda x: deid_notes(x.Text,axis=1)
df.join(df.apply(fx))
But I get the following error
AttributeError: ("'Series' object has no attribute 'Text'", 'occurred at index Text')
Question
How can I get the deid_notes function to work on my df?
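The AttributeError is consistent with how `DataFrame.apply` behaves. With the default `axis=0` the function receives each column as a Series (the first being `Text`, hence "occurred at index Text"), and a column Series has no `Text` attribute. With `axis=1` it receives each row, where `x.Text` is valid. A minimal sketch on a cut-down version of the example df; the variable names are only for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'Text': ['there are many different types of crayons',
             'the middle east has many camels '],
    'P_ID': [1, 3],
})

# axis=0 (the default): the function is called once per column,
# so x.name is the column label
column_labels = df.apply(lambda x: x.name).tolist()

# axis=1: the function is called once per row,
# so x.name is the row label
row_labels = df.apply(lambda x: x.name, axis=1).tolist()

# With axis=1 each row exposes its columns as attributes
texts = df.apply(lambda x: x.Text, axis=1).tolist()

print(column_labels)  # ['Text', 'P_ID']
print(row_labels)     # [0, 1]
```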
【Comments】:
- What is n1 in this case?
- n1=neuromodel.NeuroNER(train_model=False, use_pretrained_model=True, dataset_text_folder="./data/example_unannotated_texts", pretrained_model_folder="./trained_models/mimic_glove_stanford_bioes")
- Try df.join(df.apply(fx, axis=1))
- I get an error: TypeError: ("deid_notes() got an unexpected keyword argument 'axis'", 'occurred at index 0')
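The TypeError in the last comment arises because `axis=1` is still being passed to `deid_notes` inside the lambda, while it is an argument of `df.apply`. Below is a minimal sketch of the corrected wiring, assuming NeuroNER is not installed: `deid_notes_stub` is a hypothetical stand-in that masks a fixed word list with a regular expression, so only the `apply`/`join` pattern should be taken from it.

```python
import re
import pandas as pd

df = pd.DataFrame({
    'Text': ['there are many different types of crayons',
             'the middle east has many camels '],
    'N_ID': ['A1', 'A3'],
})

def deid_notes_stub(text):
    # Hypothetical replacement for the NeuroNER-based logic:
    # mask a fixed word list instead of calling n1.predict
    deid = re.sub(r'crayons|camels', '**BLOCK**', text)
    # Returning a labelled Series makes apply(..., axis=1) build a 'Deid' column
    return pd.Series({'Deid': deid})

# axis=1 belongs to df.apply, not to the function being applied
fx = lambda x: deid_notes_stub(x.Text)
result = df.join(df.apply(fx, axis=1))
print(result)
```

If only the text column is needed, `df['Text'].apply(...)` with a function that returns a plain string would be a simpler alternative that avoids `axis` altogether.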
Tags: python-3.x pandas join lambda apply