【Question title】: Keeping track of dropped indices when dropping elements from numpy array
【Posted】: 2015-07-30 16:42:30
【Question】:

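I want to drop the elements of a numpy array, theoretical_price_for_bonds, that do not satisfy a particular condition. I know I can do this with the line of code below. However, I also want to keep track of the indices of the dropped elements, and I am wondering how to do that.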

theoretical_price_for_bonds = theoretical_price_for_bonds[(theoretical_price_for_bonds>75)]

I tried using a loop to drop elements from the numpy array dynamically. The prices come out fine, but dropped_indices turns out to be just a list full of None:

#To insert values into a list dynamically
class GrowingList(list):
    def __setitem__(self, index, value):
        if index >= len(self):
            self.extend([None]*(index + 1 - len(self)))
        list.__setitem__(self, index, value)

count = 0
dropped_indices = GrowingList()
for x,value in np.ndenumerate(theoretical_price_for_bonds):
    count = count + 1         
    if count < theoretical_price_for_bonds.shape[0]:
        if theoretical_price_for_bonds[count] < 75:
            theoretical_price_for_bonds = np.delete(theoretical_price_for_bonds, (count), axis=0)
            dropped_indices[count] = count

Thanks

【Comments】:

    Tags: python arrays numpy


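    【Solution 1】:

    If you want to keep track of the indices of the dropped elements, just keep the boolean mask you used to index the array and pass its negation to np.where: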

    >>> x = np.array([2,8,3,4,7,6,1])
    >>> lix = x > 4
    >>> x = x[lix] # this "drops" everything 4 or less
    >>> x
    array([8, 7, 6])
    >>> [dropped] = np.where(~lix) # find the indices that were dropped
    >>> dropped
    array([0, 2, 3, 6])
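
Applied to the array from the question, the same idea might look like the sketch below. The sample values are made up for illustration (the question does not show the real data), and np.flatnonzero(~keep) is used here as a 1-D equivalent of np.where(~keep)[0]. Note that the mask > 75 also drops elements equal to 75.

```python
import numpy as np

# Hypothetical sample data standing in for theoretical_price_for_bonds
theoretical_price_for_bonds = np.array([80.0, 60.0, 90.0, 74.5, 75.0, 101.2])

# Boolean mask: True where the element is kept
keep = theoretical_price_for_bonds > 75

# Positions (in the ORIGINAL array) of dropped and kept elements
dropped_indices = np.flatnonzero(~keep)
kept_indices = np.flatnonzero(keep)

# Build the filtered array last, so the mask still lines up with the original
filtered = theoretical_price_for_bonds[keep]

print(filtered)
print(dropped_indices)
```

Computing the mask once and never modifying the array in place sidesteps the two problems in the loop from the question: np.delete shifts the positions of later elements while the loop is still running, and GrowingList pads every slot that was never assigned with None.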
    

    【Comments】:

      【Solution 2】:

      You could also consider using a pandas.Series, which has an .index attribute that can be used to keep track of the dropped values:

      import numpy as np
      import pandas as pd
      
      s = pd.Series(np.array([2,8,3,4,7,6,1]))
      print(s.values, s.index)
      # (array([2, 8, 3, 4, 7, 6, 1]), Int64Index([0, 1, 2, 3, 4, 5, 6], dtype='int64'))
      
      s2 = s[s > 4]
      print(s2.values, s2.index)
      # (array([8, 7, 6]), Int64Index([1, 4, 5], dtype='int64'))
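
The snippet above shows the index labels that survive the filter. To get the labels that were dropped, one option (a sketch, not part of the original answer) is to index s.index with the negated mask, or to take the difference between the two indexes. The variable names mask, dropped and dropped_alt are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

s = pd.Series(np.array([2, 8, 3, 4, 7, 6, 1]))

mask = s > 4          # boolean Series aligned with s
s2 = s[mask]          # kept values, with their original index labels

# Option 1: select the original labels where the mask is False
dropped = s.index[(~mask).to_numpy()]

# Option 2: labels present in s but missing from s2
dropped_alt = s.index.difference(s2.index)

print(list(dropped))
print(list(dropped_alt))
```

Both options give the same labels here because the index is the default 0..n-1 range; with a custom or non-unique index, Option 1 is the one that preserves order and duplicates.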
      

      【Comments】: