【Question title】:ValueError: Shape of passed values is (37679, 43), indices imply (37679, 41)
【Posted】:2021-05-12 15:24:48
【Question】:

I am trying to group horse data by race. I am using the pivot function to do this, but I keep getting a ValueError.

def group_horse_and_result(element):
    if element[0] == 'placing':
        return 100 + element[1]
    else:
        return element[1]   

data = data.pivot(index='id', columns='barrier', values=data.columns[2:])
rearranged_columns = sorted(list(data.columns.values), key=group_horse_and_result)
data = data[rearranged_columns]
print(data.head())

data.fillna(0)

I keep getting this error:

AssertionError                            Traceback (most recent call last)
<ipython-input-253-97da160dc172> in <module>
      5         return element[1]
      6 
----> 7 data = data.pivot(index='race_id', columns='placing', values=data.columns[2:])
      8 rearranged_columns = sorted(list(data.columns.values), key=group_horse_and_result)
      9 data = data[rearranged_columns]

~\anaconda3\lib\site-packages\pandas\core\frame.py in pivot(self, index, columns, values)
   6672         from pandas.core.reshape.pivot import pivot
   6673 
-> 6674         return pivot(self, index=index, columns=columns, values=values)
   6675 
   6676     _shared_docs[

~\anaconda3\lib\site-packages\pandas\core\reshape\pivot.py in pivot(data, index, columns, values)
    470             # Exclude tuple because it is seen as a single column name
    471             values = cast(Sequence[Label], values)
--> 472             indexed = data._constructor(
    473                 data[values]._values, index=index, columns=values
    474             )

~\anaconda3\lib\site-packages\pandas\core\frame.py in __init__(self, data, index, columns, dtype, copy)
    495                 mgr = init_dict({data.name: data}, index, columns, dtype=dtype)
    496             else:
--> 497                 mgr = init_ndarray(data, index, columns, dtype=dtype, copy=copy)
    498 
    499         # For data is list-like, or Iterable (will consume into list)

~\anaconda3\lib\site-packages\pandas\core\internals\construction.py in init_ndarray(values, index, columns, dtype, copy)
    232         block_values = [values]
    233 
--> 234     return create_block_manager_from_blocks(block_values, [columns, index])
    235 
    236 

~\anaconda3\lib\site-packages\pandas\core\internals\managers.py in create_block_manager_from_blocks(blocks, axes)
   1663                 ]
   1664 
-> 1665         mgr = BlockManager(blocks, axes)
   1666         mgr._consolidate_inplace()
   1667         return mgr

~\anaconda3\lib\site-packages\pandas\core\internals\managers.py in __init__(self, blocks, axes, do_integrity_check)
    147 
    148         if do_integrity_check:
--> 149             self._verify_integrity()
    150 
    151         # Populate known_consolidate, blknos, and blklocs lazily

~\anaconda3\lib\site-packages\pandas\core\internals\managers.py in _verify_integrity(self)
    326                 raise construction_error(tot_items, block.shape[1:], self.axes)
    327         if len(self.items) != tot_items:
--> 328             raise AssertionError(
    329                 "Number of manager items must equal union of "
    330                 f"block items\n# manager items: {len(self.items)}, # "

AssertionError: Number of manager items must equal union of block items
# manager items: 42, # tot_items: 44

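Is this related to my data preprocessing, or is there a problem with my code? I am relatively new to coding, so apologies if my question is worded poorly. The table shape is (37679, 44).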

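【Question comments】:

  • Could you also provide the traceback?
  • Sorry, I have added the traceback.
  • Hmm, if you haven't already, could you try pivot_table instead of pivot?
  • pivot_table returns InvalidIndexError: Reindexing only valid with uniquely valued Index objects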

Tags: python machine-learning pivot valueerror


【Solution 1】:

This is probably because some column names are duplicated. You can identify the duplicated columns with data.columns.duplicated().
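
As a minimal sketch of that check, the example below builds a small hypothetical frame (the column names and values are invented for illustration, not taken from the asker's data) in which one column name appears twice, lists the duplicated names, and then pivots after keeping only the first occurrence of each name. When a label is duplicated, selecting columns by label can return more columns than there are labels, which would be consistent with the "42 vs. 44" mismatch in the traceback. Keeping the first occurrence is an assumption: if the duplicated columns hold different data, rename them instead of dropping one.

```python
import pandas as pd

# Hypothetical toy data: the column name "weight" appears twice,
# mimicking a duplicated column name in the real table.
data = pd.DataFrame(
    [
        [1, 1, 5.0, 5.0, 3],
        [1, 2, 6.0, 6.0, 1],
        [2, 1, 7.0, 7.0, 2],
        [2, 2, 8.0, 8.0, 4],
    ],
    columns=["race_id", "placing", "weight", "weight", "barrier"],
)

# Boolean mask: True for every column whose name repeats an earlier one.
dup_mask = data.columns.duplicated()
dup_names = data.columns[dup_mask].tolist()
print(dup_names)  # ['weight']

# Assumption: the repeats are redundant, so keep the first occurrence only.
deduped = data.loc[:, ~dup_mask]

# With unique column names, the pivot from the question goes through.
value_cols = deduped.columns[2:]
wide = deduped.pivot(index="race_id", columns="placing", values=value_cols)
print(wide.head())
```

If dup_names comes back empty on the real data, duplicated column names are not the cause and the problem lies elsewhere.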

【Comments】:
