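[Posted]: 2021-05-15 13:30:57
[Question]:
I have the following 2 data frames, both loaded from Excel files:
df_a = 10000 rows (acts as a master list in which every supplier # is unique)
df_b = 670 rows
I am loading an Excel file (df_b) that contains ZIP, address, and state. I want to match on that information and then add the supplier # from df_a, so that I end up with one data frame that still has 670 rows but now also has a supplier # column.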
df_a =
(10000 rows; supplier # is unique)
   supplier #  ZIP    ADDRESS       STATE  Unique Key
0  7100000     35481  14th street   CA     35481-14th street-CA
1  7000005     45481  14th street   CA     45481-14th street-CA
2  7000006     45482  140th circle  CT     45482-140th circle-CT
3  7000007     35482  140th circle  CT     35482-140th circle-CT
4  7000008     35483  13th road     VT     35483-13th road-VT
df_b =
(670 rows)
   ZIP    ADDRESS       STATE  Unique Key
0  35481  14th street   CA     35481-14th street-CA
1  45481  14th street   CA     45481-14th street-CA
2  45482  140th circle  CT     45482-140th circle-CT
3  35482  140th circle  CT     35482-140th circle-CT
4  35483  13th road     VT     35483-13th road-VT
OUTPUT:
df_c =
(670 rows; supplier # is unique)
   ZIP    ADDRESS       STATE  Unique Key             supplier #
0  35481  14th street   CA     35481-14th street-CA   7100000
1  45481  14th street   CA     45481-14th street-CA   7100005
2  45482  140th circle  CT     45482-140th circle-CT  7100006
3  35482  140th circle  CT     35482-140th circle-CT  7100007
4  35483  13th road     VT     35483-13th road-VT     7100008
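I tried merging the 2 data frames, but the rows did not match; instead I got a bunch of NaN values:
df10 = df_a.merge(df_b, on='Unique Key', how='left')
The result is a single data frame with many columns and no matches. I have also tried .map and .concat. I am not sure what is going on.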
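For reference, here is a minimal sketch of the kind of merge being attempted, under some assumptions that are not confirmed by the real files: that the keys fail to match because of dtype, whitespace, or letter-case differences introduced when reading Excel, and that merging from df_b (rather than from df_a) is what keeps the result at df_b's row count. The sample values and the helper make_key are hypothetical and only mirror the layout shown above.

```python
import pandas as pd

# Hypothetical sample data mirroring the frames in the question.
df_a = pd.DataFrame({
    "supplier #": [7100000, 7000005, 7000006],
    "ZIP": [35481, 45481, 45482],                      # read as integers
    "ADDRESS": ["14th street", "14th street", "140th circle"],
    "STATE": ["CA", "CA", "CT"],
})
df_b = pd.DataFrame({
    "ZIP": ["35481", "45481 ", "99999"],               # read as text, one with a trailing space
    "ADDRESS": ["14th Street", "14th street", "1st ave"],
    "STATE": ["CA", "CA", "NY"],
})


def make_key(df):
    """Rebuild 'Unique Key' from ZIP, ADDRESS and STATE after normalizing each part."""
    parts = [
        df[col].astype(str).str.strip().str.lower()    # same dtype, no stray spaces, same case
        for col in ("ZIP", "ADDRESS", "STATE")
    ]
    return parts[0] + "-" + parts[1] + "-" + parts[2]


df_a["Unique Key"] = make_key(df_a)
df_b["Unique Key"] = make_key(df_b)

# Keep only the key and the column to bring over, one row per key.
lookup = df_a[["Unique Key", "supplier #"]].drop_duplicates("Unique Key")

# Left merge starting from df_b, so every df_b row is kept exactly once.
df_c = df_b.merge(lookup, on="Unique Key", how="left", validate="m:1", indicator=True)

# '_merge' shows which rows found a supplier ('both') and which did not ('left_only').
print(df_c["_merge"].value_counts())
print(df_c)
```

Rows marked left_only are the ones whose key has no counterpart in df_a, which makes it easier to inspect the exact strings that differ. If ZIP was read as a float in one file, the string form would carry a trailing ".0" and would need extra handling that this sketch does not include.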
[Discussion]:
Tags: python-3.x pandas dataframe