Pandas looping through rows and skipping over rows? Answers

【Question title】: Pandas looping through rows and skipping over rows?
【Posted】: 2018-08-11 23:23:49
【Question】:

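I have a pandas DataFrame with prices in one column and datetimes in another. All I want to do is build a backtesting loop: if the price reaches a certain point, skip 30 rows and compute the price difference between that row and the 30th row after it. Then keep looping until the last row of the DataFrame.

Is there a more Pythonic way to do this than just typing continue 30 times?

Thanks for the help.

Sample df: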

    index      vol1      vol2      vol3    price
0     0.0  0.984928  0.842774  0.403604  0.24676
1     0.0  0.984928  0.842774  0.403604  0.24676
2     0.0  0.984928  0.842774  0.403604  0.24676
3     0.0  0.984928  0.842774  0.403604  0.24676
4     0.0  0.984928  0.842774  0.403604  0.24683
5     0.0  0.958933  0.843822  0.407730  0.24724
6     0.0  0.950355  0.842774  0.412017  0.24724
7     0.0  0.946536  0.843822  0.419604  0.24725
8     0.0  0.941535  0.843822  0.421247  0.24683
9     0.0  0.935383  0.842775  0.415184  0.24708
10    0.0  0.934629  0.842774  0.402836  0.24691

【Question comments】:

Can you post your code and some sample data?

Tags: python pandas loops

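【Solution 1】:

I am not sure whether you want to skip the rows inside those 30 entirely or keep checking them. Below I try to give an idea of what you want in a row-by-row version. As Peter suggested, sample data and your prototype code are needed so that more people can help you.

Here is my sample code: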

import pandas as pd

# load your data into df (placeholder); it needs a "price" column
# and the default 0..n-1 integer index
df = pd.DataFrame()

# price threshold (placeholder value, pick your own)
limit_price = 0.247

# collect the price differences in a list
diff_prices = []

# loop through dataframe df
for index, row in df.iterrows():
    # row is a pd.Series here; check whether its price passes the threshold
    if row.price > limit_price:
        # if it does, take the difference between this row and the row
        # 30 positions ahead of it and add it to the list
        # (raises IndexError on the last 30 rows, where no such row exists)
        diff_prices.append(row.price - df.iloc[index + 30].price)
    else:
        # otherwise add 0 to the list instead
        diff_prices.append(0)
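
If the intent is to skip the 30 rows after a hit instead of testing each of them, a while loop with a manual position counter does that without any repeated continue statements. This is only a minimal sketch: it assumes a column named price as in the sample data, and the function name backtest_skip, the horizon parameter, the toy prices and the choice to resume checking at the row that was just compared against are all hypothetical.

```python
import pandas as pd


def backtest_skip(df, limit_price, horizon=30):
    """Return (row position, price difference) pairs, jumping `horizon` rows after each hit."""
    prices = df["price"].to_numpy()
    results = []
    i = 0
    # stop once there is no row `horizon` positions ahead of i
    while i + horizon < len(prices):
        if prices[i] > limit_price:
            # same sign convention as the loop above: current price minus the later price
            results.append((i, float(prices[i] - prices[i + horizon])))
            i += horizon  # skip ahead; the next row checked is the one just compared against
        else:
            i += 1
    return results


# toy data for illustration only, not the asker's real prices
demo = pd.DataFrame({"price": [1.0, 5.0, 2.0, 3.0, 6.0, 1.0, 4.0, 0.0]})
print(backtest_skip(demo, limit_price=4.5, horizon=2))  # [(1, 2.0), (4, 2.0)]
```

Because the loop condition already requires a row horizon positions ahead, this version never indexes past the end of the data.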

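【Discussion】:

Thanks for the clarification, I added some sample data. When I tried that code it gave me 'IndexError: single positional indexer is out-of-bounds', probably because it is looping through the whole set and runs into trouble at the end? Should I use a try and then continue?
That happens when there is no 30th row ahead of the current index. You need to decide what should happen in that case. It could simply append 0 to the list, or append something different, such as NaN, to signal that the price at the current index is above the threshold but there is no row 30 positions ahead of it. In code, you can do something simple like comparing the target index (index+30) with the number of rows (df.shape[0]).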
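
The bounds check described in the last comment could look roughly like this. It is a sketch under assumptions, not the answerer's code: it tracks the row position with enumerate instead of the index label from iterrows, uses NaN for rows above the threshold that have no row 30 positions ahead, and the names price_diffs, horizon and the toy data are hypothetical.

```python
import math

import pandas as pd


def price_diffs(df, limit_price, horizon=30):
    """One entry per row: the difference, 0.0 at or below the threshold, NaN when no row `horizon` ahead exists."""
    n_rows = df.shape[0]
    diffs = []
    for pos, price in enumerate(df["price"]):
        if price <= limit_price:
            # threshold not passed
            diffs.append(0.0)
        elif pos + horizon < n_rows:
            # a row `horizon` positions ahead exists, so iloc is safe here
            diffs.append(float(price - df["price"].iloc[pos + horizon]))
        else:
            # above the threshold, but too close to the end of the data
            diffs.append(math.nan)
    return diffs


# toy data for illustration only
demo = pd.DataFrame({"price": [5.0, 1.0, 3.0, 6.0]})
result = price_diffs(demo, limit_price=4.5, horizon=2)
print(result)  # [2.0, 0.0, 0.0, nan]
```

Checking the position up front keeps the loop free of try/except; catching IndexError would also work, but the explicit comparison makes the end-of-data rule visible.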