【Posted】: 2020-04-18 00:54:57
【Question】:
Can you tell me what mistake I made?
import pandas as pd
import numpy as np
states = {'OH': 'Ohio', 'KY': 'Kentucky', 'AS': 'American Samoa', 'NV': 'Nevada', 'WY': 'Wyoming', 'NA': 'National', 'AL': 'Alabama', 'MD': 'Maryland', 'AK': 'Alaska', 'UT': 'Utah', 'OR': 'Oregon', 'MT': 'Montana', 'IL': 'Illinois', 'TN': 'Tennessee', 'DC': 'District of Columbia', 'VT': 'Vermont', 'ID': 'Idaho', 'AR': 'Arkansas', 'ME': 'Maine', 'WA': 'Washington', 'HI': 'Hawaii', 'WI': 'Wisconsin', 'MI': 'Michigan', 'IN': 'Indiana', 'NJ': 'New Jersey', 'AZ': 'Arizona', 'GU': 'Guam', 'MS': 'Mississippi', 'PR': 'Puerto Rico', 'NC': 'North Carolina', 'TX': 'Texas', 'SD': 'South Dakota', 'MP': 'Northern Mariana Islands', 'IA': 'Iowa', 'MO': 'Missouri', 'CT': 'Connecticut', 'WV': 'West Virginia', 'SC': 'South Carolina', 'LA': 'Louisiana', 'KS': 'Kansas', 'NY': 'New York', 'NE': 'Nebraska', 'OK': 'Oklahoma', 'FL': 'Florida', 'CA': 'California', 'CO': 'Colorado', 'PA': 'Pennsylvania', 'DE': 'Delaware', 'NM': 'New Mexico', 'RI': 'Rhode Island', 'MN': 'Minnesota', 'VI': 'Virgin Islands', 'NH': 'New Hampshire', 'MA': 'Massachusetts', 'GA': 'Georgia', 'ND': 'North Dakota', 'VA': 'Virginia'}
def convert_housing_data_to_quarters():
    '''Converts the housing data to quarters and returns it as mean
    values in a dataframe. This dataframe should be a dataframe with
    columns for 2000q1 through 2016q3, and should have a multi-index
    in the shape of ["State","RegionName"].
    Note: Quarters are defined in the assignment description, they are
    not arbitrary three month periods.
    The resulting dataframe should have 67 columns, and 10,730 rows.
    '''
    df = pd.read_csv('City_Zhvi_AllHomes.csv', header=0)
    cols_to_keep = ['RegionID', 'RegionName', 'State']
    for i in range(2000, 2017):
        for j in range(1, 13):
            if j <= 9:
                if i == 2016 and j == 9:
                    pass
                else:
                    month_str = '0' + str(j)
            else:
                if i == 2016:
                    pass
                else:
                    month_str = str(j)
            cols_to_keep.append(str(i) + '-' + month_str)
    df = df[cols_to_keep]
    df['State'] = df['State'].replace(states)
    def convert_to_qtr(ym):
        year, month = ym.split('-')
        if month == '01' or month == '02' or month == '03':
            result = year + 'q1'
        elif month == '04' or month == '05' or month == '06':
            result = year + 'q2'
        elif month == '07' or month == '08' or month == '09':
            result = year + 'q3'
        else:
            result = year + 'q4'
        return result
    df_compiled = df.copy().set_index(['State', 'RegionName', 'RegionID']).stack(dropna=False)
    df_compiled = df_compiled.reset_index().rename(columns={'level_3': 'year_month', 0: 'gdp'})
    df_compiled.drop_duplicates(inplace=True)
    df_compiled['quarter'] = df_compiled['year_month'].apply(convert_to_qtr)
    df_compiled = df_compiled.drop('year_month', axis=1)
    result = df_compiled.pivot_table(values='gdp', index=['State', 'RegionName', 'RegionID'], columns='quarter', aggfunc=np.mean)
    result = result.reset_index()
    result = result.drop('RegionID', axis=1)
    #del result.index.name
    result = result.set_index(['State', 'RegionName'])
    return result
convert_housing_data_to_quarters()
This is the error I get in Python 3.8:
Traceback (most recent call last):
  File "C:/Users/Esteban Andrino/AppData/Local/Programs/Python/Python38-32/K2 Michi_gan/Week_4/Assignment final_4/AssignmentFINAL222_4.py", line 232, in <module>
    convert_housing_data_to_quarters()
  File "C:/Users/Esteban Andrino/AppData/Local/Programs/Python/Python38-32/K2 Michi_gan/Week_4/Assignment final_4/AssignmentFINAL222_4.py", line 202, in convert_housing_data_to_quarters
    df = df[cols_to_keep]
  File "C:\Users\Esteban Andrino\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pandas\core\frame.py", line 2806, in __getitem__
    indexer = self.loc._get_listlike_indexer(key, axis=1, raise_missing=True)[1]
  File "C:\Users\Esteban Andrino\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pandas\core\indexing.py", line 1552, in _get_listlike_indexer
    self._validate_read_indexer(
  File "C:\Users\Esteban Andrino\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pandas\core\indexing.py", line 1646, in _validate_read_indexer
    raise KeyError(f"{not_found} not in index")
KeyError: "['2003-08', '2008-11', '2012-11', '2003-10', '2006-01', '2009-06', '2008-06', '2009-08', '2007-05', '2009-11', '2012-01', '2006-11', '2009-12', '2007-07', '2001-03', '2004-07', '2005-06', '2007-04', '2016-08', '2002-12', '2007-10', '2008-10', '2003-01', '2001-02', '2011-07', '2012-09', '2014-11', '2015-07', '2001-08', '2003-11', '2015-02', '2007-02', '2015-12', '2011-01', '2013-04', '2011-06', '2014-02', '2012-10', '2013-01', '2010-09', '2009-09', '2011-12', '2014-03', '2013-06', '2013-12', '2001-06', '2001-12', '2011-10', '2009-01', '2015-01', '2004-09', '2001-10', '2000-06', '2009-03', '2006-05', '2011-11', '2014-07', '2004-03', '2008-12', '2010-11', '2011-02', '2009-05', '2002-09', '2012-02', '2004-08', '2003-02', '2016-02', '2003-12', '2005-01', '2005-09', '2006-04', '2008-03', '2012-12', '2000-03', '2005-02', '2005-08', '2006-07', '2013-08', '2016-05', '2002-03', '2007-08', '2015-11', '2016-06', '2001-01', '2006-09', '2010-06', '2014-01', '2014-10', '2003-07', '2014-06', '2016-07', '2004-04', '2010-02', '2002-08', '2005-12', '2004-12', '2000-04', '2006-08', '2010-08', '2011-03', '2004-02', '2000-12', '2001-04', '2000-11', '2012-05', '2002-05', '2009-07', '2011-04', '2000-10', '2011-09', '2002-02', '2004-10', '2005-05', '2007-01', '2007-03', '2010-01', '2012-06', '2014-08', '2015-03', '2014-09', '2013-02', '2013-10', '2006-06', '2003-05', '2012-08', '2001-11', '2005-10', '2002-01', '2013-03', '2011-05', '2000-07', '2001-09', '2012-03', '2008-04', '2014-04', '2000-09', '2009-10', '2006-03', '2000-02', '2015-08', '2012-07', '2013-11', '2003-09', '2010-12', '2000-08', '2015-05', '2015-06', '2012-04', '2015-09', '2007-12', '2009-04', '2002-07', '2013-07', '2014-05', '2008-05', '2016-01', '2007-11', '2000-05', '2016-04', '2010-04', '2008-02', '2004-05', '2005-11', '2008-09', '2010-10', '2015-04', '2006-12', '2007-09', '2013-09', '2009-02', '2010-07', '2016-03', '2002-06', '2002-11', '2002-04', '2003-06', '2010-05', '2011-08', '2002-10', '2008-08', 
'2003-03', '2004-11', '2005-04', '2015-10', '2007-06', '2004-06', '2010-03', '2005-03', '2000-01', '2008-01', '2005-07', '2008-07', '2006-10', '2006-02', '2013-05', '2001-05', '2004-01', '2003-04', '2014-12', '2001-07'] not in index"
【Comments】:
-
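Right now, nobody but you has the input csv file, and without it the code (or any fix) cannot be tested. A good minimal reproducible example is self-contained: it is the shortest thing that reproduces the problem when run on its own, with no changes or additions. Creating the data it needs is an important part of that.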
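A self-contained reproduction along the lines this comment asks for might look like the sketch below. The tiny frame is a hypothetical stand-in for City_Zhvi_AllHomes.csv, and the full-date column labels ('2000-01-31') are an assumption about the downloaded file, not something shown in the question.

```python
import pandas as pd

# Hypothetical stand-in for City_Zhvi_AllHomes.csv: two cities, two months.
# The full-date column labels are an assumption, not taken from the real file.
df = pd.DataFrame({
    'RegionID': [1, 2],
    'RegionName': ['Columbus', 'Louisville'],
    'State': ['OH', 'KY'],
    '2000-01-31': [100.0, 90.0],
    '2000-02-29': [101.0, 91.0],
})

# Same label style the question builds: 'YYYY-MM'.
cols_to_keep = ['RegionID', 'RegionName', 'State', '2000-01', '2000-02']

try:
    df[cols_to_keep]
    error_message = None
except KeyError as exc:
    # Selecting labels that are not columns raises KeyError, as in the traceback.
    error_message = str(exc)

print(error_message)
```

If the real file's header matches the requested labels, this snippet will not reproduce the error, which would itself narrow down the cause.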
-
Have you done any debugging? See How to Ask and the help center.
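One possible first debugging step is to compare the labels the loop builds against the columns the file actually has. The sketch below reruns the question's loop, assuming the append sits directly inside the inner loop (the months listed in the KeyError suggest it does), and checks it against a hypothetical header row; with the real file, `frame` would be replaced by the result of `pd.read_csv`.

```python
from collections import Counter

import pandas as pd

# Label-building loop from the question, with the append assumed to sit
# directly inside the inner loop.
cols_to_keep = ['RegionID', 'RegionName', 'State']
for i in range(2000, 2017):
    for j in range(1, 13):
        if j <= 9:
            if i == 2016 and j == 9:
                pass
            else:
                month_str = '0' + str(j)
        else:
            if i == 2016:
                pass
            else:
                month_str = str(j)
        cols_to_keep.append(str(i) + '-' + month_str)

# Hypothetical header row; the real file's labels are not shown in the question.
frame = pd.DataFrame(columns=['RegionID', 'RegionName', 'State', '2000-01-31'])

# Requested labels that are not columns of the frame.
missing = [c for c in cols_to_keep if c not in frame.columns]
# Requested labels that occur more than once.
repeated = {c: n for c, n in Counter(cols_to_keep).items() if n > 1}

print(len(cols_to_keep), len(set(missing)))
print(repeated)
print(list(frame.columns))
```

Under that indentation assumption, the `pass` branches leave `month_str` at its previous value, so '2016-08' is appended once for each of the months 08 through 12 of 2016 and shows up five times in `cols_to_keep`. That is a separate issue from the KeyError, which only says that the requested labels are not in the frame's columns.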
-
Hello! The information is here: zillow.com/research/data. Under the "Data Type" option, you should select the first option.
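If the file downloaded from that page labels its monthly columns with full dates rather than 'YYYY-MM' (an assumption; the question does not show the header), one way to line the labels up is to trim them before selecting. This is a minimal sketch on a hypothetical one-row frame; `to_year_month` and `id_cols` are illustrative names, not part of the assignment.

```python
import re

import pandas as pd

# Hypothetical frame; the full-date labels are an assumption about the download.
df = pd.DataFrame({
    'RegionID': [1], 'RegionName': ['Columbus'], 'State': ['OH'],
    '1999-12-31': [99.0], '2000-01-31': [100.0],
    '2016-08-31': [150.0], '2016-09-30': [151.0],
})

id_cols = ['RegionID', 'RegionName', 'State']

def to_year_month(label):
    """Trim 'YYYY-MM-DD' labels to 'YYYY-MM'; leave every other label unchanged."""
    if re.fullmatch(r'\d{4}-\d{2}-\d{2}', str(label)):
        return label[:7]
    return label

renamed = df.rename(columns=to_year_month)

# Months 2000-01 through 2016-08, i.e. the months inside 2000q1..2016q3
# that the question's loop is aiming for.
months = [str(p) for p in pd.period_range('2000-01', '2016-08', freq='M')]

# Keep only the months that are actually present, so the selection cannot raise.
available = [m for m in months if m in renamed.columns]
subset = renamed[id_cols + available]

print(list(subset.columns))
```

Building the month list with `pd.period_range` also avoids the hand-written zero padding and the special cases for 2016.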
Tags: python python-3.x pandas numpy dataframe