【Posted】: 2020-03-11 16:22:04
【Problem description】:
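I have some data to clean up. Some keys have six leading zeros that need to be removed, and if such a key ends with neither "ABC" nor "DEFG", I also need to remove the currency code in its last 3 characters. If a key does not start with a leading zero, it should be returned as is.
To do this, I wrote a function that processes the string, as follows: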
def cleanAttainKey(dirtyAttainKey):
    # Keys without a leading zero are returned unchanged.
    if not dirtyAttainKey.startswith("0"):
        return dirtyAttainKey
    # Remove only the leading zeros (strip("0") would also remove trailing ones).
    cleanedKey = dirtyAttainKey.lstrip("0")
    # Drop the 3-character currency code unless the key ends with "ABC" or "DEFG".
    if not cleanedKey.endswith(("ABC", "DEFG")):
        cleanedKey = cleanedKey[:-3]
    return cleanedKey
I then built a dummy DataFrame to test it, but it reports an error:
- The DataFrame
import pandas as pd

df = pd.DataFrame({'dirtyKey': ["00000012345ABC", "0000012345DEFG", "0000023456DEFGUSD"], 'amount': [100, 101, 102]},
                  columns=["dirtyKey", "amount"])
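- I need a new column in df named "cleanAttainKey". Each value in "dirtyKey" should be passed through the cleanAttainKey function, and the cleaned key assigned to the new column. However, it looks like pandas does not support this kind of modification.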
# add a new column in df called cleanAttainKey
df['cleanAttainKey'] = ""
# clean each key and store it in the new cleanAttainKey column
dirtyAttainKeyList = df['dirtyKey'].tolist()
for i in range(len(df['cleanAttainKey'])):
    # this chained assignment is the line that triggers the warning
    df['cleanAttainKey'][i] = cleanAttainKey(dirtyAttainKeyList[i])
I get the following warning message:
SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
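The warning is raised for chained indexing: `df['cleanAttainKey'][i] = ...` first selects the column and then assigns into that intermediate result, and pandas cannot guarantee that the result is the original data rather than a copy. A minimal sketch of the same kind of loop written with a single `.loc` call is shown here. The helper `strip_leading_zeros` is a hypothetical placeholder that only strips leading zeros, not the full cleaning function, and the sketch assumes iterating over `df.index` is acceptable.

```python
import pandas as pd

df = pd.DataFrame(
    {"dirtyKey": ["00000012345ABC", "0000012345DEFG", "0000023456DEFGUSD"],
     "amount": [100, 101, 102]},
    columns=["dirtyKey", "amount"],
)
df["cleanAttainKey"] = ""


# Hypothetical placeholder transformation: only removes leading zeros.
def strip_leading_zeros(key):
    return key.lstrip("0")


# A single .loc call addresses row and column together, so the value is
# written into df itself instead of into an intermediate selection.
for i in df.index:
    df.loc[i, "cleanAttainKey"] = strip_leading_zeros(df.loc[i, "dirtyKey"])
```

Writing through `.loc` avoids the warning, although a row-by-row loop is usually slower than a column-wise operation.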
The result should look like df2 below:
df2 = pd.DataFrame({'dirtyKey': ["00000012345ABC", "0000012345DEFG", "0000023456DEFGUSD"], 'amount': [100, 101, 102],
                    'cleanAttainKey': ["12345ABC", "12345DEFG", "23456DEFG"]},
                   columns=["dirtyKey", "cleanAttainKey", "amount"])
df2
Is there a better way to transform the dirty keys and get a new column with the clean keys in pandas? Thanks.
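One possible approach, sketched here under stated assumptions rather than as a definitive answer, is to apply the cleaning function to the whole column with `Series.apply` instead of looping over rows. The helper name `clean_key` is hypothetical; it restates the rules from the problem description (strip leading zeros, then drop a 3-character currency code unless the key ends with "ABC" or "DEFG").

```python
import pandas as pd


def clean_key(key):
    """Hypothetical helper restating the cleaning rules from the question."""
    # Keys without a leading zero are returned unchanged.
    if not key.startswith("0"):
        return key
    key = key.lstrip("0")
    # Drop the trailing 3-character currency code unless the key
    # already ends with one of the two accepted suffixes.
    if not key.endswith(("ABC", "DEFG")):
        key = key[:-3]
    return key


df = pd.DataFrame(
    {"dirtyKey": ["00000012345ABC", "0000012345DEFG", "0000023456DEFGUSD"],
     "amount": [100, 101, 102]},
    columns=["dirtyKey", "amount"],
)

# Apply the function to every value of the column and assign the result
# as a new column in one step; no chained indexing is involved.
df["cleanAttainKey"] = df["dirtyKey"].apply(clean_key)

# Reorder the columns to match the expected layout of df2.
df = df[["dirtyKey", "cleanAttainKey", "amount"]]
```

Assigning a whole column at once does not go through chained indexing, so it should not trigger `SettingWithCopyWarning` here.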
【Discussion】: