[Posted]: 2018-09-17 16:21:24
[Question]:
I am getting a performance warning from Pandas:
/usr/local/lib/python3.4/dist-packages/pandas/core/generic.py:1471:
PerformanceWarning:
your performance may suffer as PyTables will pickle object types that it cannot
map directly to c-types [inferred_type->mixed-integer,key->block0_values] [items->['int', 'str']]
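I have read several GitHub issues and questions here, and they all say this happens because a single column mixes types, but mine definitely does not. A simple example follows: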
import pandas as pd
df = pd.DataFrame(columns=['int', 'str'])
df = df.append({ 'int': 0, 'str': '0'}, ignore_index=True)
df = df.append({ 'int': 1, 'str': '1'}, ignore_index=True)
for _, row in df.iterrows():
    print(type(row['int']), type(row['str']))
# <class 'int'> <class 'str'>
# <class 'int'> <class 'str'>
# however
df.dtypes
# int object
# str object
# dtype: object
# the following causes the warning
df.to_hdf('table.h5', 'table')
What is going on here, and what should I do about it?
[Comments]:
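-
Have you tried converting the numeric column to a numeric dtype? For example, df[col] = df[col].astype(int)?
-
@jpp Wow! That was it! Thanks! I'm fairly inexperienced, so I don't know: is this required, or is it just a trick?
-
See the answer below.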
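A minimal sketch of the cast suggested in the first comment, for anyone trying it out. The warning text lists both 'int' and 'str' under a single object block (block0_values), which suggests the integers are being stored as generic Python objects next to the strings. The frame below is an assumed stand-in for the one in the question: it is built directly with dtype=object rather than with df.append, which was removed in pandas 2.0. The to_hdf call is left commented out because it needs PyTables installed.

```python
import pandas as pd

# Assumed stand-in for the question's frame: both columns held as object dtype.
df = pd.DataFrame({'int': [0, 1], 'str': ['0', '1']}, dtype=object)
print(df.dtypes)  # both columns report object

# Cast the numeric column explicitly, as suggested in the comment above.
df['int'] = df['int'].astype(int)
print(df.dtypes)  # 'int' now reports an integer dtype

# Writing requires PyTables; uncomment to check whether the warning is gone.
# df.to_hdf('table.h5', key='table')
```

The asker reported that this cast resolved the warning in their case. The 'str' column is not touched by the cast; only the integer column is moved out of the object block.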
Tags: python python-3.x pandas hdf5 pytables