【Question title】: Pandas Int64 .loc cannot do slice indexing?
【Posted】: 2020-04-16 12:50:20
【Question description】:

Consider this simple example:

>>> import pandas as pd
>>> dfA = pd.DataFrame({
...   "key":[1,3,6,10,15,21],
...   "columnA":[10,20,30,40,50,60],
...   "columnB":[100,200,300,400,500,600],
...   "columnC":[110,202,330,404,550,606],
... })

>>> dfA
   key  columnA  columnB  columnC
0    1       10      100      110
1    3       20      200      202
2    6       30      300      330
3   10       40      400      404
4   15       50      500      550
5   21       60      600      606

If I use .loc here, it works fine:

>>> dfA.set_index('key').loc[2:16]
     columnA  columnB  columnC
key
3         20      200      202
6         30      300      330
10        40      400      404
15        50      500      550

...but if I "cast" to Int64 (with .astype), it fails:

>>> dfA.astype('Int64').set_index('key').loc[2:16]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:/msys64/mingw64/lib/python3.8/site-packages/pandas/core/indexing.py", line 1768, in __getitem__
    return self._getitem_axis(maybe_callable, axis=axis)
  File "C:/msys64/mingw64/lib/python3.8/site-packages/pandas/core/indexing.py", line 1912, in _getitem_axis
    return self._get_slice_axis(key, axis=axis)
  File "C:/msys64/mingw64/lib/python3.8/site-packages/pandas/core/indexing.py", line 1796, in _get_slice_axis
    indexer = labels.slice_indexer(
  File "C:/msys64/mingw64/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 4712, in slice_indexer
    start_slice, end_slice = self.slice_locs(start, end, step=step, kind=kind)
  File "C:/msys64/mingw64/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 4925, in slice_locs
    start_slice = self.get_slice_bound(start, "left", kind)
  File "C:/msys64/mingw64/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 4837, in get_slice_bound
    label = self._maybe_cast_slice_bound(label, side, kind)
  File "C:/msys64/mingw64/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 4789, in _maybe_cast_slice_bound
    self._invalid_indexer("slice", label)
  File "C:/msys64/mingw64/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 3075, in _invalid_indexer
    raise TypeError(
TypeError: cannot do slice indexing on <class 'pandas.core.indexes.base.Index'> with these indexers [2] of <class 'int'>
>>>

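Why does this happen, and can I use this kind of .loc indexing with Int64 as well? (I have to use Int64 because I read in .csv data with missing values and I do not want the values converted to float, but I would still like to use .loc as in the case above.)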
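For reference, the CSV scenario described above can be reproduced with a small sketch. The CSV text and its column names are made up for illustration, and the sketch assumes the key column itself has no missing values, so it can stay a regular int64 index while only the value columns are read as nullable Int64:

```python
import io

import pandas as pd

# Hypothetical CSV content with missing values in the value columns only.
csv_text = """key,columnA,columnB
1,10,100
3,,200
6,30,
10,40,400
"""

# Request the nullable Int64 dtype for the value columns so that missing
# entries become <NA> instead of forcing a float conversion. The key column
# is left to be inferred as plain int64 and used as the index.
df = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"columnA": "Int64", "columnB": "Int64"},
    index_col="key",
)

# Label-based slicing on the plain int64 index.
subset = df.loc[2:8]
print(subset)
```

If the key column can also contain missing values, this approach does not apply as written, because a plain int64 index cannot hold them.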


Edit: some more information:

>>> dfA.astype('Int64').loc(0)[0]['key']
1
>>> type(dfA.astype('Int64').loc(0)[0]['key'])
<class 'numpy.int64'>

OK, so the actual numbers with dtype 'Int64' are of class 'numpy.int64', but they still cannot be used with .loc in this case:

>>> import numpy as np
>>> dfA.astype('Int64').set_index('key').loc[np.int64(2):np.int64(2)]
...
TypeError: cannot do slice indexing on <class 'pandas.core.indexes.base.Index'> with these indexers [2] of <class 'numpy.int64'>
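
The error message names the base `pandas.core.indexes.base.Index` class rather than an integer-specific index, which suggests that the index built from the Int64 column is not treated as a plain integer index in this pandas version. One way to check is to compare the index class and dtype in both cases. This is only a diagnostic sketch; the printed result depends on the installed pandas version, so no expected output is shown:

```python
import pandas as pd

dfA = pd.DataFrame({
    "key": [1, 3, 6, 10, 15, 21],
    "columnA": [10, 20, 30, 40, 50, 60],
})

# Index built from the default (NumPy int64) key column.
plain_index = dfA.set_index("key").index

# Index built after casting every column, including key, to nullable Int64.
nullable_index = dfA.astype("Int64").set_index("key").index

# Compare the index class and dtype in both cases.
print(type(plain_index).__name__, plain_index.dtype)
print(type(nullable_index).__name__, nullable_index.dtype)
```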

【Comments on the question】:

    Tags: python python-3.x pandas dataframe


    【Solution 1】:

    You can work around this by making key the index first and only then converting to Int64:

    dfA.set_index('key').astype('Int64').loc[2:16]
    
         columnA  columnB  columnC
    key                           
    3         20      200      202
    6         30      300      330
    10        40      400      404
    15        50      500      550
    

    Or convert only your key column to the plain NumPy int64:

    dfA.index = dfA['key'].astype('int64')
    

    That is, assuming the key column does not contain <NA> values the way your other columns do.
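
Both workarounds can be put side by side in a runnable sketch. It reuses the dfA from the question and assumes the key column has no missing values; the variable names `by_index_first`, `nullable`, and `by_key_cast` are only illustrative:

```python
import pandas as pd

dfA = pd.DataFrame({
    "key": [1, 3, 6, 10, 15, 21],
    "columnA": [10, 20, 30, 40, 50, 60],
    "columnB": [100, 200, 300, 400, 500, 600],
    "columnC": [110, 202, 330, 404, 550, 606],
})

# Option 1: set the index first, so only the value columns become Int64
# and the index keeps its NumPy int64 dtype.
by_index_first = dfA.set_index("key").astype("Int64").loc[2:16]

# Option 2: cast everything to Int64, then cast only the key column back
# to NumPy int64 before using it as the index.
nullable = dfA.astype("Int64")
nullable["key"] = nullable["key"].astype("int64")
by_key_cast = nullable.set_index("key").loc[2:16]

print(by_index_first)
print(by_key_cast)
```

In both cases the value columns keep the nullable Int64 dtype, and only the index is a regular int64 index.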

    【Discussion】:

    • As for why this happens, I can't figure it out. This may not be the intended behavior.