Loading dataframe from 1.1.5 fails on pandas 1.0.1: answers

【Question title】: Loading dataframe from 1.1.5 fails on pandas 1.0.1
【Posted】: 2022-01-05 12:55:06
【Question】:

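I have a dataframe saved to a pickle (together with a bunch of other things, as a dictionary). It was saved with pandas 1.1.5.

I am trying to open it with version 1.0.1, but I get the following error: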

File "<stdin>", line 1, in <module>
  File "/opt/conda/lib/python3.7/site-packages/pandas/core/generic.py", line 5272, in __getattr__
    if self._info_axis._can_hold_identifiers_and_holds_name(name):
  File "/opt/conda/lib/python3.7/site-packages/pandas/core/generic.py", line 5272, in __getattr__
    if self._info_axis._can_hold_identifiers_and_holds_name(name):
  File "/opt/conda/lib/python3.7/site-packages/pandas/core/generic.py", line 5272, in __getattr__
    if self._info_axis._can_hold_identifiers_and_holds_name(name):
  [Previous line repeated 493 more times]
  File "/opt/conda/lib/python3.7/site-packages/pandas/core/generic.py", line 493, in _info_axis
    return getattr(self, self._info_axis_name)
  File "/opt/conda/lib/python3.7/site-packages/pandas/core/generic.py", line 5270, in __getattr__
    return object.__getattribute__(self, name)
  File "pandas/_libs/properties.pyx", line 63, in pandas._libs.properties.AxisProperty.__get__
  File "/opt/conda/lib/python3.7/site-packages/pandas/core/generic.py", line 5270, in __getattr__
    return object.__getattribute__(self, name)
RecursionError: maximum recursion depth exceeded while calling a Python object

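Is there a way to get around this error? I can save the dataframe again, but I cannot downgrade or upgrade pandas on either computer.

Thanks

【Comments】: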

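You could try a workaround via numpy, e.g. np.save(path, df.values, allow_pickle=True), where path is the output file. You would still need to transfer the index and column labels somehow.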
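The numpy workaround from the comment could look roughly like the sketch below. It is only an illustration, not something tested on the exact versions in the question: the helper names `save_frame_as_npz` and `load_frame_from_npz` are made up here, and it assumes the columns hold plain scalars (numbers, strings). Values, index and column labels go into one `.npz` archive as object arrays, so the file carries no pandas DataFrame internals.

```python
import numpy as np
import pandas as pd


def save_frame_as_npz(df, path):
    """Store values, index and column labels as NumPy object arrays.

    `path` should end in ".npz" (np.savez appends it otherwise).
    """
    np.savez(
        path,
        values=df.to_numpy(dtype=object),
        index=df.index.to_numpy(dtype=object),
        columns=df.columns.to_numpy(dtype=object),
    )


def load_frame_from_npz(path):
    """Rebuild a DataFrame from an archive written by save_frame_as_npz."""
    # Object arrays are stored via pickle, so allow_pickle=True is required.
    with np.load(path, allow_pickle=True) as data:
        df = pd.DataFrame(
            data["values"], index=data["index"], columns=data["columns"]
        )
    # All columns come back as object dtype; let pandas infer better dtypes.
    return df.infer_objects()
```

Call `save_frame_as_npz(df, "frame.npz")` on the machine with the newer pandas and `load_frame_from_npz("frame.npz")` on the other one. Index names and any extension dtypes (categoricals, nullable integers) are not preserved by this sketch.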

Tags: python pandas pickle version-compatibility pandas-1.0

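【Solution 1】:

I could not reproduce your exact error, but I did get a different error when I tried to read a pickle file created by pandas v1.1.5 using pandas v1.0.1. I was able to work around it by saving the file in Feather format instead. Example code: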

In [23]: cake # in version v1.1.5
Out[23]: 
     replicate recipe  temperature  angle  temp
1            1      A          175     42   175
2            1      A          185     46   185
3            1      A          195     47   195
4            1      A          205     39   205
5            1      A          215     53   215
..         ...    ...          ...    ...   ...
266         15      C          185     28   185
267         15      C          195     25   195
268         15      C          205     25   205
269         15      C          215     31   215
270         15      C          225     25   225

In [24]: cake.reset_index().to_feather("cake.feather")

Reading the file in v1.0.1:

In [15]: cake = pd.read_feather("cake.feather")
In [16]: cake
Out[16]: 
     index  replicate recipe  temperature  angle  temp
0        1          1      A          175     42   175
1        2          1      A          185     46   185
2        3          1      A          195     47   195
3        4          1      A          205     39   205
4        5          1      A          215     53   215
..     ...        ...    ...          ...    ...   ...
265    266         15      C          185     28   185
266    267         15      C          195     25   195
267    268         15      C          205     25   205
268    269         15      C          215     31   215
269    270         15      C          225     25   225

In [17]: pd.__version__
Out[17]: '1.0.1'

In [18]: cake.set_index('index') # To set the index

The drawbacks of this approach are the extra dependency on pyarrow, and that saving to the Feather binary format requires that you reset the index, as I did above.
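If the pyarrow dependency is a problem, one alternative (not part of the answer above, and only a sketch) is to convert the dataframe to plain Python lists before pickling, since the question already stores everything in a dictionary. The helper names `frame_to_plain` and `frame_from_plain` are hypothetical. The assumption is that the columns hold ordinary numbers and strings, in which case the pickled payload should contain only built-in types rather than pandas objects.

```python
import pickle

import pandas as pd


def frame_to_plain(df):
    """Convert a DataFrame to a dict with 'index', 'columns' and 'data' lists."""
    return df.to_dict(orient="split")


def frame_from_plain(plain):
    """Rebuild a DataFrame from the dict produced by frame_to_plain."""
    return pd.DataFrame(
        plain["data"], index=plain["index"], columns=plain["columns"]
    )


# Writing side (newer pandas): pickle the plain dict, not the DataFrame.
df = pd.DataFrame({"recipe": ["A", "C"], "angle": [42, 25]}, index=[1, 270])
blob = pickle.dumps({"frame": frame_to_plain(df)})  # "frame" is an arbitrary key

# Reading side (older pandas): unpickle, then rebuild the DataFrame.
restored = frame_from_plain(pickle.loads(blob)["frame"])
```

Unlike Feather, this keeps the index without a `reset_index()` step, but dtypes are re-inferred on load, so anything beyond simple numeric and string columns may need to be converted back by hand.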

【Comments】: