【Question Title】: What mistake should I correct?
【Posted】: 2020-04-18 00:54:57
【Question Description】:

Can you tell me what mistake I made?

import pandas as pd
import numpy as np

states = {'OH': 'Ohio', 'KY': 'Kentucky', 'AS': 'American Samoa', 'NV': 'Nevada', 'WY': 'Wyoming', 'NA': 'National',
          'AL': 'Alabama', 'MD': 'Maryland', 'AK': 'Alaska', 'UT': 'Utah', 'OR': 'Oregon', 'MT': 'Montana',
          'IL': 'Illinois', 'TN': 'Tennessee', 'DC': 'District of Columbia', 'VT': 'Vermont', 'ID': 'Idaho',
          'AR': 'Arkansas', 'ME': 'Maine', 'WA': 'Washington', 'HI': 'Hawaii', 'WI': 'Wisconsin', 'MI': 'Michigan',
          'IN': 'Indiana', 'NJ': 'New Jersey', 'AZ': 'Arizona', 'GU': 'Guam', 'MS': 'Mississippi',
          'PR': 'Puerto Rico', 'NC': 'North Carolina', 'TX': 'Texas', 'SD': 'South Dakota',
          'MP': 'Northern Mariana Islands', 'IA': 'Iowa', 'MO': 'Missouri', 'CT': 'Connecticut',
          'WV': 'West Virginia', 'SC': 'South Carolina', 'LA': 'Louisiana', 'KS': 'Kansas', 'NY': 'New York',
          'NE': 'Nebraska', 'OK': 'Oklahoma', 'FL': 'Florida', 'CA': 'California', 'CO': 'Colorado',
          'PA': 'Pennsylvania', 'DE': 'Delaware', 'NM': 'New Mexico', 'RI': 'Rhode Island', 'MN': 'Minnesota',
          'VI': 'Virgin Islands', 'NH': 'New Hampshire', 'MA': 'Massachusetts', 'GA': 'Georgia',
          'ND': 'North Dakota', 'VA': 'Virginia'}

def convert_housing_data_to_quarters():
    '''Converts the housing data to quarters and returns it as mean
    values in a dataframe. This dataframe should be a dataframe with
    columns for 2000q1 through 2016q3, and should have a multi-index
    in the shape of ["State","RegionName"].

    Note: Quarters are defined in the assignment description, they are
    not arbitrary three month periods.

    The resulting dataframe should have 67 columns, and 10,730 rows.
    '''


    df = pd.read_csv('City_Zhvi_AllHomes.csv', header=0)


    cols_to_keep = ['RegionID', 'RegionName', 'State']
    for i in range(2000, 2017):
        for j in range(1, 13):
            if j <= 9:
                if i == 2016 and j == 9:
                    pass
                else:
                    month_str = '0' + str(j)
            else:
                if i == 2016:
                    pass
                else:
                    month_str = str(j)
            cols_to_keep.append(str(i) + '-' + month_str)
    df = df[cols_to_keep]


    df['State'] = df['State'].replace(states)

    def convert_to_qtr(ym):
        year, month = ym.split('-')
        if month == '01' or month == '02' or month == '03':
            result = year + 'q1'
        elif month == '04' or month == '05' or month == '06':
            result = year + 'q2'
        elif month == '07' or month == '08' or month == '09':
            result = year + 'q3'
        else:
            result = year + 'q4'
        return result


    df_compiled = df.copy().set_index(['State', 'RegionName', 'RegionID']).stack(dropna=False)
    df_compiled = df_compiled.reset_index().rename(columns={'level_3': 'year_month', 0: 'gdp'})
    df_compiled.drop_duplicates(inplace=True)
    df_compiled['quarter'] = df_compiled['year_month'].apply(convert_to_qtr)
    df_compiled = df_compiled.drop('year_month', axis=1)
    result = df_compiled.pivot_table(values='gdp', index=['State', 'RegionName', 'RegionID'], columns='quarter', aggfunc=np.mean)
    result = result.reset_index()
    result = result.drop('RegionID', axis=1)
    #del result.index.name
    result = result.set_index(['State', 'RegionName'])
    return result

convert_housing_data_to_quarters()

This is the error I get in Python 3.8:

Traceback (most recent call last):
  File "C:/Users/Esteban Andrino/AppData/Local/Programs/Python/Python38-32/K2 Michi_gan/Week_4/Assignment final_4/AssignmentFINAL222_4.py", line 232, in <module>
    convert_housing_data_to_quarters()
  File "C:/Users/Esteban Andrino/AppData/Local/Programs/Python/Python38-32/K2 Michi_gan/Week_4/Assignment final_4/AssignmentFINAL222_4.py", line 202, in convert_housing_data_to_quarters
    df = df[cols_to_keep]
  File "C:\Users\Esteban Andrino\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pandas\core\frame.py", line 2806, in __getitem__
    indexer = self.loc._get_listlike_indexer(key, axis=1, raise_missing=True)[1]
  File "C:\Users\Esteban Andrino\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pandas\core\indexing.py", line 1552, in _get_listlike_indexer
    self._validate_read_indexer(
  File "C:\Users\Esteban Andrino\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pandas\core\indexing.py", line 1646, in _validate_read_indexer
    raise KeyError(f"{not_found} not in index")
KeyError: "['2003-08', '2008-11', '2012-11', '2003-10', '2006-01', '2009-06', '2008-06', '2009-08', '2007-05', '2009-11', '2012-01', '2006-11', '2009-12', '2007-07', '2001-03', '2004-07', '2005-06', '2007-04', '2016-08', '2002-12', '2007-10', '2008-10', '2003-01', '2001-02', '2011-07', '2012-09', '2014-11', '2015-07', '2001-08', '2003-11', '2015-02', '2007-02', '2015-12', '2011-01', '2013-04', '2011-06', '2014-02', '2012-10', '2013-01', '2010-09', '2009-09', '2011-12', '2014-03', '2013-06', '2013-12', '2001-06', '2001-12', '2011-10', '2009-01', '2015-01', '2004-09', '2001-10', '2000-06', '2009-03', '2006-05', '2011-11', '2014-07', '2004-03', '2008-12', '2010-11', '2011-02', '2009-05', '2002-09', '2012-02', '2004-08', '2003-02', '2016-02', '2003-12', '2005-01', '2005-09', '2006-04', '2008-03', '2012-12', '2000-03', '2005-02', '2005-08', '2006-07', '2013-08', '2016-05', '2002-03', '2007-08', '2015-11', '2016-06', '2001-01', '2006-09', '2010-06', '2014-01', '2014-10', '2003-07', '2014-06', '2016-07', '2004-04', '2010-02', '2002-08', '2005-12', '2004-12', '2000-04', '2006-08', '2010-08', '2011-03', '2004-02', '2000-12', '2001-04', '2000-11', '2012-05', '2002-05', '2009-07', '2011-04', '2000-10', '2011-09', '2002-02', '2004-10', '2005-05', '2007-01', '2007-03', '2010-01', '2012-06', '2014-08', '2015-03', '2014-09', '2013-02', '2013-10', '2006-06', '2003-05', '2012-08', '2001-11', '2005-10', '2002-01', '2013-03', '2011-05', '2000-07', '2001-09', '2012-03', '2008-04', '2014-04', '2000-09', '2009-10', '2006-03', '2000-02', '2015-08', '2012-07', '2013-11', '2003-09', '2010-12', '2000-08', '2015-05', '2015-06', '2012-04', '2015-09', '2007-12', '2009-04', '2002-07', '2013-07', '2014-05', '2008-05', '2016-01', '2007-11', '2000-05', '2016-04', '2010-04', '2008-02', '2004-05', '2005-11', '2008-09', '2010-10', '2015-04', '2006-12', '2007-09', '2013-09', '2009-02', '2010-07', '2016-03', '2002-06', '2002-11', '2002-04', '2003-06', '2010-05', '2011-08', '2002-10', '2008-08', 
'2003-03', '2004-11', '2005-04', '2015-10', '2007-06', '2004-06', '2010-03', '2005-03', '2000-01', '2008-01', '2005-07', '2008-07', '2006-10', '2006-02', '2013-05', '2001-05', '2004-01', '2003-04', '2014-12', '2001-07'] not in index"

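【Comments】:

  • Right now nobody but you has the input csv file, and without it the code (or any fix) cannot be tested. A good minimal reproducible example is self-contained: it is the shortest thing that reproduces the problem when run on its own, with no changes or additions. Having to create the data ourselves is a real obstacle.
  • Have you done any debugging? See How to Ask and the help center.
  • Hello! The data is available here: zillow.com/research/data. Under the "Data Type" option, choose the first entry.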
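
To illustrate what the first comment asks for, here is a minimal sketch of a self-contained reproduction. The two-row CSV text, the city names, and in particular the full-date column labels ('2000-01-31') are assumptions made up for illustration; the real header of City_Zhvi_AllHomes.csv is not shown in the question and may differ.

```python
import io

import pandas as pd

# Hypothetical stand-in for City_Zhvi_AllHomes.csv. The full-date column
# labels are an assumption for illustration, not the confirmed file layout.
csv_text = """RegionID,RegionName,State,2000-01-31,2000-02-29
1,Springfield,OH,100.0,101.0
2,Lexington,KY,200.0,202.0
"""
df = pd.read_csv(io.StringIO(csv_text), header=0)

# Same selection pattern as in the question, reduced to two months.
cols_to_keep = ['RegionID', 'RegionName', 'State', '2000-01', '2000-02']

try:
    df[cols_to_keep]
    error_message = None
except KeyError as exc:
    # pandas names the labels it could not find among the columns.
    error_message = str(exc)

print(error_message)
```

A snippet of this size lets anyone run the failing line without downloading the data, and printing df.columns next to cols_to_keep would show whether the labels really differ.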

Tags: python python-3.x pandas numpy dataframe


【Solution 1】:

In the line:

df = df[cols_to_keep]

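cols_to_keep is a list whose contents include ['2003-08', '2008-11', '2012-11', '2003-10', '2006-01', ... etc.], and it does not work as a key here: the KeyError reports that these labels are not among the DataFrame's columns. You may want to check the elements of that list one by one against the actual column names rather than passing the whole list at once.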
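
One way to do that check is sketched below, using plain Python so that it runs without the CSV. Selecting with a list of labels is valid in pandas as long as every label exists, so the sketch only separates the labels that are present from those that are missing. The helper names expected_month_labels and split_present_missing and the sample header are hypothetical. The end month of 2016-08 is inferred from the labels in the traceback. Building the labels this way also avoids a side effect of the loop in the question, which appends '2016-08' again whenever it hits a pass branch.

```python
def expected_month_labels(first_year=2000, last_year=2016, last_month=8):
    """Build 'YYYY-MM' labels from first_year-01 through last_year-last_month."""
    labels = []
    for year in range(first_year, last_year + 1):
        # Only the final year is cut short; earlier years have 12 months.
        final_month = last_month if year == last_year else 12
        for month in range(1, final_month + 1):
            labels.append(f"{year}-{month:02d}")
    return labels


def split_present_missing(columns, wanted):
    """Return (present, missing) lists, preserving the order of `wanted`."""
    available = set(columns)
    present = [label for label in wanted if label in available]
    missing = [label for label in wanted if label not in available]
    return present, missing


# Hypothetical header standing in for list(df.columns).
header = ['RegionID', 'RegionName', 'State', '2000-01', '2000-02-29']
wanted = ['RegionID', 'RegionName', 'State'] + expected_month_labels()
present, missing = split_present_missing(header, wanted)

print(present)      # labels that can safely be used in df[present]
print(missing[:3])  # first few labels that would trigger the KeyError
```

With a real DataFrame, the same helpers could be called as split_present_missing(df.columns, wanted), and df[present] would then select only the columns that exist. If missing turns out to hold every month, the column labels in the file probably follow a different format than 'YYYY-MM'.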

【Discussion】:
