How to write recursion in a dataframe? Answers

【Question title】: How to write recursion in dataframe?
【Posted】: 2020-04-02 11:02:53
【Question】:

I have a dataframe like this:

    Price    Signal
0   28.68     -1
1   33.36      1
2   44.7      -1
3   43.38      1 ---- smaller than Price[2] # False: Drop row[3,4]
4   41.67     -1
5   42.17      1 ---- smaller than Price[2] # False: Drop row[5,6]
6   44.21     -1
7   46.34      1 ---- greater than Price[2] # True: Keep
8   45.2      -1 
9   43.4       1 ---- Still Keep because it is the last row

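My rule is to keep a Signal 1 row if its price is higher than the previous price. If it is not, that row and the next row are dropped, because the signals must alternate between -1 and 1, and the next Signal 1 must then be compared against the last kept signal above it (as annotated in the dataframe snapshot above).

The last Signal 1 is still kept even though it does not satisfy the condition, because the rule is that the last item in the Signal column must be 1.

Here is my attempt so far: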

def filter_sell(df):
    # For export the result
    filtered_sell_df = pd.DataFrame()

    for i in range(0, len(df) + 1):
        if df.iloc[i]["Signal"] == 1:
            if df.iloc[i]["Price"] > df.iloc[i - 1]["Price"]:
                pass
            else:
                try:
                    df.drop([i, i + 1])
                    filter_sell(df)
                # Try to handle the i + 1 above since len(df) is changed
                except RecursionError:
                    break
        else:
            pass

I am new to writing recursion, thank you for your help!

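【Comments】:

Tags: python-3.x pandas recursion

【Solution 1】:

This can be done without recursion. By the way, your approach will be slow because you call .drop() inside a loop. The simplest way is to use a new column to flag the rows to delete.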

import pandas as pd

df = pd.DataFrame({
    'Price': (28.68, 33.36, 44.7, 43.38, 41.67, 42.17, 44.21, 46.34, 45.2, 43.4),
    'Signal': (-1, 1, -1, 1, -1, 1, -1, 1, -1, 1),
})


# flag column: 1 = keep the row, 0 = delete it
df['max_price'] = 1
# start with the price in the first row as the running maximum
max_price = df['Price'].loc[0]
index = 1
# because we do not check last record
stop_index = len(df.index) - 1

while index < stop_index:
    # just check max price because signal != 1
    if df['Signal'].loc[index] == -1:
        current = df['Price'].loc[index]
        if current > max_price:
            max_price = current
        index += 1
        continue

    current = df['Price'].loc[index]
    if max_price > current:
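        # the running maximum exceeds the current Signal 1 price:
        # flag the current and the next row for removal
        # (df.loc[row, col] avoids chained assignment, which may not write to df)
        df.loc[index, 'max_price'] = 0
        df.loc[index + 1, 'max_price'] = 0
        # skip ahead by 2 because the next row is removed as well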
        index += 2
        continue

    index += 1


# just drop records without max_price and drop column
df = df[df['max_price'] == 1]
df = df.drop(columns=['max_price'])
print(df)
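
The same flag-and-filter idea can also be wrapped in a function that builds a boolean mask instead of a helper column. This is only a sketch: the name `filter_signals` is made up for illustration, and it assumes the signals alternate between -1 and 1 and end with 1, as in the question's data.

```python
import numpy as np
import pandas as pd


def filter_signals(df: pd.DataFrame) -> pd.DataFrame:
    """Return the rows that survive the price check (hypothetical helper)."""
    if df.empty:
        return df.copy()

    prices = df["Price"].to_numpy()
    signals = df["Signal"].to_numpy()
    keep = np.ones(len(df), dtype=bool)  # True = keep the row

    max_price = prices[0]
    last = len(df) - 1  # the last row is never tested, so it is always kept
    i = 1
    while i < last:
        if signals[i] == -1:
            # a -1 row only updates the running maximum
            max_price = max(max_price, prices[i])
            i += 1
        elif prices[i] < max_price:
            # failing Signal 1 row: mark it and the following row, then skip both
            keep[i] = False
            keep[i + 1] = False
            i += 2
        else:
            i += 1

    return df[keep]


df = pd.DataFrame({
    "Price": (28.68, 33.36, 44.7, 43.38, 41.67, 42.17, 44.21, 46.34, 45.2, 43.4),
    "Signal": (-1, 1, -1, 1, -1, 1, -1, 1, -1, 1),
})
result = filter_signals(df)
print(result)
```

On the sample data this keeps rows 0, 1, 2, 7, 8 and 9, and the original `df` is left unchanged.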

Hope this helps.

【Discussion】:

Thanks @Danila Ganchar, your solution works great and helped me a lot! It's a pity I still can't fix the recursion, since it is easier to understand!
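
For readers who still want a recursive form, here is one possible sketch. It is not the answer author's code. The attempt in the question cannot work as written for a few reasons: `df.drop(...)` returns a new frame rather than modifying `df`, the result of the recursive call is discarded, and `range(0, len(df) + 1)` runs past the last row. The version below drops the first failing pair, recurses on the smaller frame, and stops when no pair fails. It reuses the running-maximum check from the solution above and assumes unique index labels.

```python
import pandas as pd


def filter_sell(df: pd.DataFrame) -> pd.DataFrame:
    """Drop the first failing (1, -1) pair, then recurse on what is left."""
    if df.empty:
        return df

    max_price = df["Price"].iloc[0]
    # The last row is never tested, so stop one position before it.
    for pos in range(1, len(df) - 1):
        price = df["Price"].iloc[pos]
        if df["Signal"].iloc[pos] == -1:
            max_price = max(max_price, price)
        elif price < max_price:
            # drop() returns a new frame; hand that result to the recursive call
            smaller = df.drop(df.index[[pos, pos + 1]])
            return filter_sell(smaller)

    # Base case: no Signal 1 row failed the check.
    return df


df = pd.DataFrame({
    "Price": (28.68, 33.36, 44.7, 43.38, 41.67, 42.17, 44.21, 46.34, 45.2, 43.4),
    "Signal": (-1, 1, -1, 1, -1, 1, -1, 1, -1, 1),
})
result = filter_sell(df)
print(result)
```

Each recursive call rescans the frame from the top, so this is slower than the loop in the solution, and the recursion depth equals the number of dropped pairs. With Python's default recursion limit of 1000, a long frame with many failing rows could still raise `RecursionError`, which is one reason the iterative version is usually the safer choice.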