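Here is a pipeline that does the job. In short, the trick is to melt df1 into long format, merge the two dataframes on "W", and use the absolute difference between the values to find the closest match. Most of the rest is formatting.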
(df2.merge((df1.rename_axis('df1_Pos')           # reshape df1 to long format
               .reset_index()
               .melt(id_vars=['df1_Pos', 'Sr'],
                     var_name='W',
                     value_name='T')
            ),
           on=['W'],                             # merge with df2 on "W"
           suffixes=['', '1'])
    .assign(diff=lambda d: abs(d['T']-d['T1']))  # compute the diff of "T"
    .rename(columns={'T1': 'df1Closest_Val',
                     'Sr': 'df1_Sr'})
    .sort_values(by='diff')                      # sort diff to have min diff first
    .drop('diff', axis=1)
    .groupby('W').first()                        # keep first row per group (= min diff)
    .reset_index()
)
Output:
    W       T  df1_Pos  df1_Sr  df1Closest_Val
0  W1  20.200        0    1000          20.100
1  W2  19.119        2    1004          19.115
2  W3  18.100        3    1009          18.000
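For anyone who wants to run the pipeline end to end, here is a minimal sketch. The two input frames are rebuilt from the tables printed in this answer, so they are an assumption about the question's data rather than its original code, and the `closest_match` wrapper name is hypothetical.

```python
import pandas as pd

# Inputs reconstructed from the printed tables (assumed to match the question's data).
df1 = pd.DataFrame({
    "Sr": [1000, 1002, 1004, 1009],
    "W1": [20.1, 20.3, 19.1, 18.5],
    "W2": [45.155, 45.18, 19.115, 19.126],
    "W3": [20.0, 22.0, 19.0, 18.0],
})
df2 = pd.DataFrame({"W": ["W1", "W2", "W3"], "T": [20.2, 19.119, 18.1]})


def closest_match(df1, df2):
    """Hypothetical wrapper: same steps as the pipeline, split in two for readability."""
    # Long format: one row per (df1 position, W column) pair.
    long_df1 = (
        df1.rename_axis("df1_Pos")
        .reset_index()
        .melt(id_vars=["df1_Pos", "Sr"], var_name="W", value_name="T")
    )
    return (
        df2.merge(long_df1, on="W", suffixes=("", "1"))   # df2's T stays "T", df1's becomes "T1"
        .assign(diff=lambda d: (d["T"] - d["T1"]).abs())
        .rename(columns={"T1": "df1Closest_Val", "Sr": "df1_Sr"})
        .sort_values(by="diff")                            # smallest diff first
        .drop(columns="diff")
        .groupby("W")
        .first()                                           # first row per W = closest value
        .reset_index()
    )


result = closest_match(df1, df2)
print(result)
```

Splitting the melt into its own variable does not change the result; it only makes the intermediate long-format frame easier to inspect.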
Breakdown
Reshape df1:
>>> df1b = df1.reset_index().melt(id_vars=['index', 'Sr'], var_name='W', value_name='T')
>>> df1b
    index    Sr   W       T
0       0  1000  W1  20.100
1       1  1002  W1  20.300
2       2  1004  W1  19.100
3       3  1009  W1  18.500
4       0  1000  W2  45.155
5       1  1002  W2  45.180
6       2  1004  W2  19.115
7       3  1009  W2  19.126
8       0  1000  W3  20.000
9       1  1002  W3  22.000
10      2  1004  W3  19.000
11      3  1009  W3  18.000
Merge and compute the absolute difference:
>>> df2b = df2.merge(df1b, on=['W'], suffixes=['', '1']).assign(diff=lambda d: abs(d['T']-d['T1']))
>>> df2b
     W       T  index    Sr      T1    diff
0   W1  20.200      0  1000  20.100   0.100
1   W1  20.200      1  1002  20.300   0.100
2   W1  20.200      2  1004  19.100   1.100
3   W1  20.200      3  1009  18.500   1.700
4   W2  19.119      0  1000  45.155  26.036
5   W2  19.119      1  1002  45.180  26.061
6   W2  19.119      2  1004  19.115   0.004
7   W2  19.119      3  1009  19.126   0.007
8   W3  18.100      0  1000  20.000   1.900
9   W3  18.100      1  1002  22.000   3.900
10  W3  18.100      2  1004  19.000   0.900
11  W3  18.100      3  1009  18.000   0.100
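Rows 0 and 1 both display a diff of 0.100 for W1, which looks like a tie. The sketch below, assuming ordinary double-precision floats, checks whether the two differences are really equal. It also shows one possible way to make tie-breaking deterministic when an exact tie does occur: the default sort algorithm is not guaranteed to be stable, so passing `kind='mergesort'` keeps tied rows in their original order. The small `tied` frame is made up for illustration.

```python
import pandas as pd

# Differences for W1 against df1 positions 0 and 1 (values taken from the table above).
d0 = abs(20.2 - 20.1)
d1 = abs(20.2 - 20.3)
print(d0, d1, d0 == d1)

# Illustrative frame with an exact tie (hypothetical values).
tied = pd.DataFrame({"W": ["W1", "W1"], "index": [0, 1], "diff": [0.5, 0.5]})

# A stable sort preserves the original order of tied rows,
# so first() then returns the row with the lower position.
first = tied.sort_values(by="diff", kind="mergesort").groupby("W").first()
print(first)
```

If ties should be resolved differently (for example, preferring the larger `Sr`), sorting on several columns, such as `by=['diff', 'Sr']` with the matching `ascending` flags, would be the place to express that.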
Sort by the difference, group by "W", and keep the first row of each group (the one with the smallest difference):
>>> df2b.sort_values(by='diff').groupby('W').first().reset_index()
    W       T  index    Sr      T1   diff
0  W1  20.200      0  1000  20.100  0.100
1  W2  19.119      2  1004  19.115  0.004
2  W3  18.100      3  1009  18.000  0.100
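As an alternative to sorting and taking the first row, the last step could probably also be written with `groupby(...).idxmin()`, which returns the row label of the smallest diff in each group. This is a swapped-in technique, not what the pipeline above uses. The frame below is a reduced stand-in for df2b, with two candidate rows per "W" copied from the merged table.

```python
import pandas as pd

# Reduced stand-in for df2b (values copied from the merged table above).
df2b = pd.DataFrame({
    "W":     ["W1", "W1", "W2", "W2", "W3", "W3"],
    "T":     [20.2, 20.2, 19.119, 19.119, 18.1, 18.1],
    "index": [0, 2, 2, 3, 2, 3],
    "Sr":    [1000, 1004, 1004, 1009, 1004, 1009],
    "T1":    [20.1, 19.1, 19.115, 19.126, 19.0, 18.0],
})
df2b["diff"] = (df2b["T"] - df2b["T1"]).abs()

# idxmin gives, per group, the label of the row with the smallest diff;
# .loc then selects those complete rows.
closest = df2b.loc[df2b.groupby("W")["diff"].idxmin()].reset_index(drop=True)
print(closest)
```

One difference worth knowing: `GroupBy.first()` takes the first non-null value of each column separately, whereas `.loc` with `idxmin()` always returns whole rows. With no missing values, as here, both approaches select the same rows.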