【Question title】: How to sort a dataframe using a list of column names
【Posted】: 2018-07-10 11:20:41
【Question description】:

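I am using Pandas 0.22.0 on Python 2.7, with PyCharm as my IDE.

I am trying to sort several dataframes in a loop. The dataframes are created from .csv files and then converted to .xlsx using "xlsxwriter" through pandas.

I created a list named sorts that holds all the sorting requirements. When the loop runs, it should take a csv file, convert it to a dataframe, sort it (this is where I am stuck), and then write the whole thing out as an .xlsx file that can be worked with in MS Excel.

If I use df = df.sort_values(by=['SITE', 'DEPARTMENT', 'LOCATION', 'ASSET_TYPE', 'ASSET_NAME']), there is no problem.

However, if I use df = df.sort_values(by=sorts[0]), the code crashes with the following traceback: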

    Traceback (most recent call last):
      File "D:/OneDrive/Programming Practice/Python/Rubaiyat/test1.py", line 55, in <module>
        df = df.sort_values(by=(sorts[0]))
      File "C:\Python27\lib\site-packages\pandas\core\frame.py", line 3619, in sort_values
        k = self.xs(by, axis=other_axis).values
      File "C:\Python27\lib\site-packages\pandas\core\generic.py", line 2335, in xs
        return self[key]
      File "C:\Python27\lib\site-packages\pandas\core\frame.py", line 2139, in __getitem__
        return self._getitem_column(key)
      File "C:\Python27\lib\site-packages\pandas\core\frame.py", line 2146, in _getitem_column
        return self._get_item_cache(key)
      File "C:\Python27\lib\site-packages\pandas\core\generic.py", line 1842, in _get_item_cache
        values = self._data.get(item)
      File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 3843, in get
        loc = self.items.get_loc(item)
      File "C:\Python27\lib\site-packages\pandas\core\indexes\base.py", line 2527, in get_loc
        return self._engine.get_loc(self._maybe_cast_indexer(key))
      File "pandas\_libs\index.pyx", line 117, in pandas._libs.index.IndexEngine.get_loc
      File "pandas\_libs\index.pyx", line 139, in pandas._libs.index.IndexEngine.get_loc
      File "pandas\_libs\hashtable_class_helper.pxi", line 1265, in pandas._libs.hashtable.PyObjectHashTable.get_item
      File "pandas\_libs\hashtable_class_helper.pxi", line 1273, in pandas._libs.hashtable.PyObjectHashTable.get_item
    KeyError: "'SITE', 'DEPARTMENT', 'LOCATION', 'ASSET_TYPE', 'ASSET_NAME'"

The full code is as follows:

    import pandas
    import sys

    reload(sys)
    sys.setdefaultencoding('utf-8')

    reportDF = ["assetReport", "assetTypeReport", "assetStatusReport", "locationReport", "departmentReport", "siteReport",
                "userReport"]

    sheetNames = ["Asset Report", "Asset Types", "Asset Status", "Locations", "Cost Centers", "Sites", "Users"]


    columnNames = [("EPC", "Creation Date", "Modification Date", "Inventory Date", "Asset Name", "Asset Status",
                    "Asset Type", "Asset User", "Location", "Site", "Cost Center", "Description"),
                "Asset Type Name",
                ("Asset Status", "Asset Status Description"),
                ("Location Name", "EPC", "Floor", "GPS", "Capacity", "Lead Time", "Site Name"),
                "Cost Center",
                ("Site", "Country", "Postal Code", "City", "Address", "GPS"),
                ("User Name", "User Role", "First Name", "Last Name", "Email", "User Disabled?")]

    sorts = ["'SITE', 'DEPARTMENT', 'LOCATION', 'ASSET_TYPE', 'ASSET_NAME'",
            'ASSET_TYPE_NAME', 
            'ASSET_STATUS_NAME',
            "'SITE_NAME', 'LOCATION_NAME'",
            'DEPARTMENT_NAME',
            'SITE_NAME',
            'USER_NAME']

    writer = pandas.ExcelWriter('mergedSheet.xlsx')

    for i in range(0, 7):
        df = pandas.read_csv(reportDF[i], delimiter=';')
        df = df.sort_values(by=sorts[i])
        df.to_excel(writer, sheet_name=sheetNames[i], engine='xlsxwriter', header=columnNames[i], freeze_panes=(1, 0))

    writer.save()
    writer.close()

Any help or guidance is much appreciated. Thank you.

【Comments】:

Tags: python pandas csv sorting dataframe


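【Solution 1】:

You created a single string: "'SITE', 'DEPARTMENT', 'LOCATION', 'ASSET_TYPE', 'ASSET_NAME'". pandas looks for one column whose name is that entire text, which is why the KeyError shows the whole string.

I think it should look like this: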

    sorts = [['SITE', 'DEPARTMENT', 'LOCATION', 'ASSET_TYPE', 'ASSET_NAME'],
            'ASSET_TYPE_NAME', 
            'ASSET_STATUS_NAME',
            ['SITE_NAME', 'LOCATION_NAME'],
            'DEPARTMENT_NAME',
            'SITE_NAME',
            'USER_NAME']
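
As a minimal sketch of the difference, assuming Python 3 and a recent pandas rather than the asker's Python 2.7 setup: the small DataFrame below is made-up sample data, not one of the asker's CSV reports, and the names `by_list`, `by_name`, and `bad_key` are only illustrative. A list entry sorts by each column in turn, a plain string entry sorts by one column, and the quoted pseudo-list is treated as a single column label that does not exist.

```python
import pandas as pd

# Hypothetical sample data standing in for one CSV report.
df = pd.DataFrame({
    "SITE": ["B", "A", "A"],
    "DEPARTMENT": ["x", "z", "y"],
    "ASSET_NAME": ["pump", "valve", "motor"],
})

# One entry per report: a list for a multi-column sort,
# a plain string for a single-column sort.
sorts = [["SITE", "DEPARTMENT"], "ASSET_NAME"]

by_list = df.sort_values(by=sorts[0])  # SITE first, then DEPARTMENT
by_name = df.sort_values(by=sorts[1])  # ASSET_NAME only

# The form from the question: one string that merely looks like a list.
bad_key = "'SITE', 'DEPARTMENT'"
try:
    df.sort_values(by=bad_key)
    raised = False
except KeyError:
    # pandas finds no column with this exact label.
    raised = True
```

Because `by` is documented as accepting either a string or a list of strings, the mixed `sorts` list can be passed to `sort_values` entry by entry inside the loop without any special casing.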
    

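【Comments】:

• That worked! Thank you! Oddly, this is the first time I have seen a list written this way. It is also odd that the other list, columnNames, separates its entries only with parentheses, and those work too. Is there a reason this form works and the other does not? Thanks again!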
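
On that follow-up question, a short check in plain Python may help. The parenthesized entries in columnNames are tuples, and a tuple, like a list, holds separate strings, whereas each quoted entry in the original sorts was one string with commas and quotes inside it. The variable names below are illustrative only. This sketch says nothing about whether `sort_values` accepts a tuple for `by`; a list is the documented form for several columns.

```python
# One string: the inner quotes and commas are just characters in it.
as_string = "'SITE_NAME', 'LOCATION_NAME'"

# A tuple and a list: each holds two separate strings.
as_tuple = ("Asset Status", "Asset Status Description")
as_list = ["SITE_NAME", "LOCATION_NAME"]

kinds = [type(x).__name__ for x in (as_string, as_tuple, as_list)]

# len() counts characters for the string but items for the tuple and list.
lengths = [len(as_string), len(as_tuple), len(as_list)]
```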