Receiving a KeyError while merging three data frames with pandas

【Question title】: Receiving a Key Error while merging three data frames with pandas
【Posted】: 2016-11-27 22:27:35
【Question description】:

I am trying to merge three data frames. Two of the dfs use "Country" and the third uses "Country Name". Edit: below are images of ScimEn and energy.

ScimEn

energy

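Can anyone help me find where my KeyError is coming from? I know it has to do with "Country" in the energy.csv file, but I don't understand why the error occurs.

Code: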

import pandas as pd
import numpy as np
energy = pd.read_csv('Energy Indicators.csv')
GDP = pd.read_csv('world_bank_new.csv')
columns_to_keep = ['Country Name','Country Code','Indicator Name','Indicator Code',
                   '2006','2007','2008','2009','2010','2011','2012','2013','2014','2015']
GDP = GDP[columns_to_keep]
SciEm = pd.read_csv('scimagojr-3.csv',encoding = "ISO-8859-1")



res = pd.merge(SciEm,energy,how='inner',on='Country').merge(GDP,how='inner',left_on='Country',right_on='Country Name').set_index('Country Name')
res.index.name = None

return pd.DataFrame(res,columns=dfcolumns).head(15)

Error:

KeyError                                  Traceback (most recent call last)
/opt/conda/lib/python3.5/site-packages/pandas/indexes/base.py in get_loc(self, key, method, tolerance)
   1944             try:
-> 1945                 return self._engine.get_loc(key)
   1946             except KeyError:

pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:4154)()

pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:4018)()

pandas/hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12368)()

pandas/hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12322)()

KeyError: 'Country'

【Discussion】:

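Look at the output of the first merge: res = pd.merge(SciEm,energy,how='inner',on='Country'). Is there a "Country" column in it? Maybe after the merge it became "Country_x" and "Country_y"?
If I take out the second part of the merge and leave only the code you have, that code does not work. So I think Country in the energy file is doing something weird, but I don't know what.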
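Getting closer. Print out the column names of both: print(energy.columns) and print(SciEm.columns). Sometimes a column name has a trailing space character, or different capitalization, or something else that is easy to miss.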
For SciEm: Index(['Rank', 'Country', 'Documents', 'Citable documents', 'Citations', 'Self-citations', 'Citations per document', 'H index '], dtype='object') For energy: Index(['Country', 'Energy Supply', 'Energy Supply per Capita', '% Renewable'], dtype='object')
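Try removing the last three lines of your code. Replace them with: res = pd.merge(SciEm,energy,how='inner',on='Country'). Can you post the output in your question? The last two lines seem to be invalid. I want to make sure the first merge works, and then we will add the second.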
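
The column-name check suggested in the comments above can be sketched roughly as follows. This is only an illustration: the helper names normalize_columns and missing_keys are hypothetical, and the toy frame deliberately puts a trailing space on 'Country' (in the columns actually posted, only 'H index ' has one).

```python
import pandas as pd


def normalize_columns(df):
    """Return a copy of df whose column labels have surrounding whitespace removed."""
    out = df.copy()
    out.columns = [str(c).strip() for c in out.columns]
    return out


def missing_keys(df, keys):
    """List the merge keys that are not present among df's columns."""
    return [k for k in keys if k not in df.columns]


# Hypothetical toy frame: note the trailing space in 'Country '.
sci = pd.DataFrame({'Rank': [1, 2], 'Country ': ['China', 'Japan']})

# Printing the labels as a list shows their repr, which makes stray spaces visible.
print(list(sci.columns))

# Before cleaning, 'Country' is reported as missing; after cleaning it is found.
before = missing_keys(sci, ['Country'])
sci_clean = normalize_columns(sci)
after = missing_keys(sci_clean, ['Country'])
print(before, after)
```

Running a check like missing_keys on each frame before calling merge turns an opaque KeyError into an explicit list of the labels that are absent.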

Tags: python pandas merge keyerror

【Solution 1】:

As I suggested in the comments, you could use this as the return statement:

return SciEm.merge(GDP, how='left', left_on='Country', right_on='Country Name').merge(energy, how='left', left_on='Country', right_on='Country').drop('Country Name', axis=1).set_index('Country').head(15)
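
For readers who want to try the chained merge without the original files, here is a minimal sketch on toy data. The three small frames and their values are made up purely for illustration; only the column names 'Country', 'Country Name', '% Renewable' and '2015' come from the question.

```python
import pandas as pd

# Hypothetical stand-ins for the three CSV files (values are placeholders).
SciEm = pd.DataFrame({'Rank': [1, 2], 'Country': ['China', 'Japan']})
energy = pd.DataFrame({'Country': ['China', 'Japan'], '% Renewable': [10.0, 20.0]})
GDP = pd.DataFrame({'Country Name': ['China', 'Japan'], '2015': [1.0, 2.0]})

res = (
    SciEm
    # The key has a different name in each frame, so use left_on / right_on.
    .merge(GDP, how='left', left_on='Country', right_on='Country Name')
    # Both frames call the key 'Country', so on= is enough here.
    .merge(energy, how='left', on='Country')
    # 'Country Name' duplicates 'Country' after the first merge.
    .drop('Country Name', axis=1)
    .set_index('Country')
)
print(res)
```

With how='left', every row of SciEm is kept even when a country has no match in the other frames, which makes missing matches show up as NaN instead of silently dropping rows.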

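【Discussion】:

Could it be that you are not reading the files correctly? Can you send a screenshot of the files you are merging?
Added the images to the original post.
Now that you have found the problem, does my code work for you?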

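【Solution 2】:

The error was in the original .csv file. I had accidentally saved it as .txt.

Strangely, though, the image of the file does not look like a .txt. It still looks like a .csv.

【Discussion】: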
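
One way a file saved in the wrong format can lead to KeyError: 'Country' is sketched below. It assumes the .txt export was tab-delimited, which the post does not actually state, and it uses an in-memory string with placeholder values instead of the real file.

```python
import io

import pandas as pd

# Assumption: the text export separated fields with tabs rather than commas.
raw = "Country\tEnergy Supply\nChina\t1\nJapan\t2\n"

# With the default sep=',' the whole header line becomes a single column label.
wrong = pd.read_csv(io.StringIO(raw))
print(list(wrong.columns))

# Telling read_csv about the real delimiter restores the expected columns.
right = pd.read_csv(io.StringIO(raw), sep='\t')
print(list(right.columns))

# A merge on 'Country' can only succeed if the label actually exists.
print('Country' in wrong.columns, 'Country' in right.columns)
```

In the first case the frame has no column literally named 'Country', so any merge on that key fails, even though the file can look fine when opened in a spreadsheet program.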