【Posted】:2021-07-16 17:38:28
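【Question】:
I have two Pandas DataFrames: one contains the data I want to update, and the other provides a lookup, keyed on a MultiIndex, for the values to set.
For example, I have two CSV files: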
fruit.csv
Fruit,Color,State,more,data
Apple,Red,Good,etc.,etc.
Apple,Green,Mouldy,etc.,etc.
Apple,Green,Excellent,etc.,etc.
Pear,Red,Excellent,etc.,etc.
Pear,Green,Good,etc.,etc.
Lime,Green,Bad,etc.,etc.
rating.csv
Fruit,State,Rating
Apple,Excellent,11
Apple,Good,8
Apple,Bad,4
Apple,Mouldy,0
Pear,Excellent,9
Pear,Good,5
Pear,Bad,2
Pear,Mouldy,1
Lime,Excellent,10
Lime,Good,7
Lime,Bad,5
Lime,Mouldy,2
I have read them into DataFrames:
from pathlib import Path
import pandas as pd

static_data_dir = Path(__file__).resolve().parent
fruit = pd.read_csv(static_data_dir.joinpath("fruit.csv"), index_col=["Fruit", "Color"])
rating = pd.read_csv(static_data_dir.joinpath("rating.csv"), index_col=["Fruit", "State"])
                 State  more  data
Fruit Color
Apple Red         Good  etc.  etc.
      Green     Mouldy  etc.  etc.
      Green  Excellent  etc.  etc.
Pear  Red    Excellent  etc.  etc.
      Green       Good  etc.  etc.
Lime  Green        Bad  etc.  etc.

                 Rating
Fruit State
Apple Excellent      11
      Good            8
      Bad             4
      Mouldy          0
Pear  Excellent       9
      Good            5
      Bad             2
      Mouldy          1
Lime  Excellent      10
      Good            7
      Bad             5
      Mouldy          2
I now want to replace the State values in the fruit DataFrame with the Rating values from the rating DataFrame, giving the following result:
             State  more  data
Fruit Color
Apple Red        8  etc.  etc.
      Green      0  etc.  etc.
      Green     11  etc.  etc.
Pear  Red        9  etc.  etc.
      Green      5  etc.  etc.
Lime  Green      5  etc.  etc.
In effect, I want to use pandas.Series.replace, passing in a dictionary with tuple keys, but this does not seem to be supported.
{'Rating': {('Apple', 'Bad'): 4,
            ('Apple', 'Excellent'): 11,
            ('Apple', 'Good'): 8,
            ('Apple', 'Mouldy'): 0,
            ('Lime', 'Bad'): 5,
            ('Lime', 'Excellent'): 10,
            ('Lime', 'Good'): 7,
            ('Lime', 'Mouldy'): 2,
            ('Pear', 'Bad'): 2,
            ('Pear', 'Excellent'): 9,
            ('Pear', 'Good'): 5,
            ('Pear', 'Mouldy'): 1}}
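A nested dictionary of this shape is what `rating.to_dict()` returns, and its inner tuple-keyed dictionary can still drive the lookup outside of `replace`. The following is only a minimal sketch, not necessarily the best approach: it assumes the CSV contents shown earlier are inlined through `io.StringIO` instead of being read from disk, and that every (Fruit, State) pair in `fruit` has an entry in `rating` (a missing pair would raise `KeyError`). The names `FRUIT_CSV`, `RATING_CSV`, `lookup`, and `fruit_names` are made up for illustration.

```python
import io

import pandas as pd

# Inlined copies of fruit.csv and rating.csv (assumption: same contents as the files).
FRUIT_CSV = """Fruit,Color,State,more,data
Apple,Red,Good,etc.,etc.
Apple,Green,Mouldy,etc.,etc.
Apple,Green,Excellent,etc.,etc.
Pear,Red,Excellent,etc.,etc.
Pear,Green,Good,etc.,etc.
Lime,Green,Bad,etc.,etc.
"""

RATING_CSV = """Fruit,State,Rating
Apple,Excellent,11
Apple,Good,8
Apple,Bad,4
Apple,Mouldy,0
Pear,Excellent,9
Pear,Good,5
Pear,Bad,2
Pear,Mouldy,1
Lime,Excellent,10
Lime,Good,7
Lime,Bad,5
Lime,Mouldy,2
"""

fruit = pd.read_csv(io.StringIO(FRUIT_CSV), index_col=["Fruit", "Color"])
rating = pd.read_csv(io.StringIO(RATING_CSV), index_col=["Fruit", "State"])

# Inner dict keyed by (Fruit, State) tuples, e.g. ("Apple", "Good") -> 8.
lookup = rating.to_dict()["Rating"]

# Pair each row's Fruit index level with its State value and look the pair up.
fruit_names = fruit.index.get_level_values("Fruit")
fruit["State"] = [
    lookup[(name, state)] for name, state in zip(fruit_names, fruit["State"])
]

print(fruit["State"].tolist())  # [8, 0, 11, 9, 5, 5]
```

This runs a Python-level loop over the rows, so it is probably only reasonable for small frames.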
How would I best achieve this?
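One possible direction, sketched here under assumptions rather than offered as a definitive answer: build a (Fruit, State) key for each row of `fruit` and align the ratings to those keys with `Series.reindex`. The sketch assumes the index of `rating` is unique and inlines the CSV contents via `io.StringIO` so that it runs on its own; `FRUIT_CSV`, `RATING_CSV`, and `keys` are illustrative names.

```python
import io

import pandas as pd

# Inlined copies of fruit.csv and rating.csv (assumption: same contents as the files).
FRUIT_CSV = """Fruit,Color,State,more,data
Apple,Red,Good,etc.,etc.
Apple,Green,Mouldy,etc.,etc.
Apple,Green,Excellent,etc.,etc.
Pear,Red,Excellent,etc.,etc.
Pear,Green,Good,etc.,etc.
Lime,Green,Bad,etc.,etc.
"""

RATING_CSV = """Fruit,State,Rating
Apple,Excellent,11
Apple,Good,8
Apple,Bad,4
Apple,Mouldy,0
Pear,Excellent,9
Pear,Good,5
Pear,Bad,2
Pear,Mouldy,1
Lime,Excellent,10
Lime,Good,7
Lime,Bad,5
Lime,Mouldy,2
"""

fruit = pd.read_csv(io.StringIO(FRUIT_CSV), index_col=["Fruit", "Color"])
rating = pd.read_csv(io.StringIO(RATING_CSV), index_col=["Fruit", "State"])

# One (Fruit, State) key per row of `fruit`, in row order.
keys = pd.MultiIndex.from_arrays(
    [fruit.index.get_level_values("Fruit"), fruit["State"].to_numpy()],
    names=["Fruit", "State"],
)

# Align the ratings to those keys; a pair missing from `rating` would become NaN.
# to_numpy() drops the (Fruit, State) index so the values are assigned by position.
fruit["State"] = rating["Rating"].reindex(keys).to_numpy()

print(fruit["State"].tolist())  # [8, 0, 11, 9, 5, 5]
```

The (Fruit, Color) index of `fruit` is left untouched, and the lookup itself is vectorized.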
【Comments】:
Tags: python pandas dataframe mapping multi-index