【Question title】: How to write recursion in a DataFrame?
【Posted】: 2020-04-02 11:02:53
【Question】:

I have a DataFrame like this:

    Price    Signal
0   28.68     -1
1   33.36      1
2   44.7      -1
3   43.38      1 ---- smaller than Price[2] # False: Drop row[3,4]
4   41.67     -1
5   42.17      1 ---- smaller than Price[2] # False: Drop row[5,6]
6   44.21     -1
7   46.34      1 ---- greater than Price[2] # True: Keep
8   45.2      -1 
9   43.4       1 ---- Still Keep because it is the last row

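My logic: a row with Signal 1 is kept if its price is higher than the previous price. Otherwise that row and the next row are dropped, because the signals must alternate between -1 and 1, and the next Signal 1 must then be compared with the last remaining signal above it (as annotated in the DataFrame snapshot above).

The last Signal 1 is still kept even though it does not satisfy the condition, because the rule is that the last item in the Signal column must be 1.

Here is my attempt so far: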

def filter_sell(df):
    # For export the result
    filtered_sell_df = pd.DataFrame()

    for i in range(0, len(df) + 1):
        if df.iloc[i]["Signal"] == 1:
            if df.iloc[i]["Price"] > df.iloc[i - 1]["Price"]:
                pass
            else:
                try:
                    df.drop([i, i + 1])
                    filter_sell(df)
                # Try to handle the i + 1 above since len(df) is changed
                except RecursionError:
                    break
        else:
            pass

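I am new to writing recursion, thanks for your help!

【Question discussion】:

    Tags: python-3.x pandas recursion


    【Solution 1】:

    This can be done without recursion. By the way, your approach will be slow because you call .drop() inside a loop. The simplest way is to use a new column to flag the rows to delete.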

    import pandas as pd

    df = pd.DataFrame({
        'Price': (28.68, 33.36, 44.7, 43.38, 41.67, 42.17, 44.21, 46.34, 45.2, 43.4),
        'Signal': (-1, 1, -1, 1, -1, 1, -1, 1, -1, 1),
    })

    # flag column: 1 = keep the row, 0 = remove it
    df['max_price'] = 1
    # start with the price in the first row as the running maximum
    max_price = df.loc[0, 'Price']
    index = 1
    # the last record is never checked, so it is always kept
    stop_index = len(df.index) - 1

    while index < stop_index:
        current = df.loc[index, 'Price']

        # Signal == -1: only update the running maximum
        if df.loc[index, 'Signal'] == -1:
            if current > max_price:
                max_price = current
            index += 1
            continue

        # Signal == 1 priced below the running maximum:
        # flag this row and the next one for removal
        if max_price > current:
            # df.loc[row, col] avoids chained assignment
            df.loc[index, 'max_price'] = 0
            df.loc[index + 1, 'max_price'] = 0
            # skip the next row as well, since it will be removed
            index += 2
            continue

        index += 1

    # keep only the unflagged rows, then drop the helper column
    df = df[df['max_price'] == 1]
    df = df.drop(columns=['max_price'])
    print(df)
    

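    Hope this helps.

    【Discussion】:

    • Thanks @Danila Ganchar, your solution works well and helped me a lot! It's a pity I still can't fix the recursion, though, because it is easier to understand!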
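
For the recursive formulation that the comment above still asks about, here is a minimal sketch rather than a definitive fix. It works on plain lists of prices and signals instead of mutating the DataFrame, and it returns the row positions to keep. The function name `keep_positions` and its `positions` and `start` parameters are hypothetical, introduced only for illustration. It also assumes the rule as stated in the question (compare each Signal 1 with the row that currently sits directly above it), which differs slightly from the running maximum used in Solution 1 but gives the same result on the sample data.

```python
def keep_positions(prices, signals, positions=None, start=1):
    """Recursively pick the row positions that survive the filter.

    `positions` holds the original positions that are still kept;
    `start` is the slot in `positions` at which to resume checking.
    Both are illustrative helper parameters, not part of the question.
    """
    if positions is None:
        positions = list(range(len(prices)))

    # The last remaining row is never tested, so stop one short of it.
    for k in range(start, len(positions) - 1):
        cur, prev = positions[k], positions[k - 1]
        if signals[cur] == 1 and prices[cur] <= prices[prev]:
            # Build a new list without this row and the one after it,
            # then recurse, resuming at the same slot, which now holds
            # the next candidate row.
            trimmed = positions[:k] + positions[k + 2:]
            return keep_positions(prices, signals, trimmed, k)

    # Base case: no remaining row violates the rule.
    return positions


prices = [28.68, 33.36, 44.7, 43.38, 41.67, 42.17, 44.21, 46.34, 45.2, 43.4]
signals = [-1, 1, -1, 1, -1, 1, -1, 1, -1, 1]
kept = keep_positions(prices, signals)
print(kept)  # [0, 1, 2, 7, 8, 9]
```

With the question's frame, passing `df['Price'].tolist()` and `df['Signal'].tolist()` and then selecting `df.iloc[kept]` should give the surviving rows. Two things differ from the original attempt: the trimmed data is passed into the recursive call and its result is returned (`df.drop` returns a new frame and does not modify `df` unless the result is assigned), and the loop stops before the last row instead of running to `len(df) + 1`. Each dropped pair adds one level of recursion, so a very long frame with many drops could hit Python's recursion limit, which is one reason the loop in Solution 1 may be the safer choice.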