Loop through rows of a dataframe at specific row values: answers

【Question title】: Loop through rows of dataframe at specific row values
【Posted】: 2021-04-27 08:48:25
【Question】:

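My dataframe contains three different replications for each treatment. I want to loop over both levels: for every treatment, fit a model to each replication separately. I managed to loop over the treatments, but I also need to loop over the replications within each treatment. Ideally, the output should be saved to a new dataframe that includes "treatment" and "replication". Any suggestions?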

The dataframe (df) looks like this:

 treatment  replication  time  y
 8          1            1     0.1
 8          1            2     0.1
 8          1            3     0.1
 8          2            1     0.1
 8          2            2     0.1
 8          2            3     0.1
 10         1            1     0.1
 10         1            2     0.1
 10         1            3     0.1
 10         2            1     0.1
 10         2            2     0.1
 10         2            3     0.1

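from scipy.optimize import curve_fit

for i, g in df.groupby('treatment'):  # groups by treatment only, not by replication
    k = g.iloc[0].y                   # initial y value of the group
    # `model` is the fit function (not shown in the question)
    popt, pcov = curve_fit(model, g['time'], g['y'])
    fit_m = popt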

I have now tried iterrows, but I can no longer index NPQ[0] to get the initial value. Any idea how to solve this? The code and the error message are below:

for index, row in HL.iterrows():
    g = (index, row['filename'], row['hr'], row['time'], row['NPQ'])
    k = g.iloc[0]['NPQ']  # raises: g is a plain tuple, not a DataFrame

AttributeError: 'tuple' object has no attribute 'iloc'

Thanks in advance.
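The error above arises because `g` is an ordinary Python tuple built by hand inside the loop, and tuples have no `.iloc`. A minimal sketch of one way to get the initial NPQ value per group without iterating over rows follows. The `HL` frame here is a hypothetical stand-in with made-up values, and grouping by `hr` and `filename` is an assumption about what identifies one curve.

```python
import pandas as pd

# Hypothetical stand-in for the asker's HL frame (values are invented)
HL = pd.DataFrame({
    "filename": ["a", "a", "a", "b", "b", "b"],
    "hr":       [8, 8, 8, 8, 8, 8],
    "time":     [1, 2, 3, 1, 2, 3],
    "NPQ":      [0.5, 0.4, 0.3, 0.9, 0.7, 0.6],
})

# First NPQ value of each (hr, filename) group, indexed by the group key
initial = HL.groupby(["hr", "filename"])["NPQ"].first()

# The same value broadcast back onto every row of its group
HL["NPQ0"] = HL.groupby(["hr", "filename"])["NPQ"].transform("first")

print(initial)
print(HL)
```

Both calls rely on the rows already being ordered by time within each group; if they are not, sorting with `HL.sort_values("time")` first would be needed.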

【Comments】:

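df.groupby(['treatment', 'replication'])
This can be done without a loop, which makes the code more time-efficient. We just need to know how you define x and y (the arguments to curve_fit).
Keep this in mind: in general, with pandas, trying to solve a problem with a loop is usually not the right approach. See "How to iterate over rows in a DataFrame in Pandas" and "Fast, Flexible, Easy and Intuitive: How to Speed Up Your Pandas Projects".
@Ralubrusto, I defined x = time and y = y from the dataframe. Thanks in advance.
@TrentonMcKinney please see my update in the question: I used iterrows, but I cannot get the previous code to work. Any suggestions? Thanks!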
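As an illustration of the loop-free style the comments hint at, here is a minimal sketch that uses `groupby(...).apply` to fit each (treatment, replication) group. The linear `model`, the `fit_group` helper, and the sample values are all assumptions made for this example; the asker's real model is not shown in the post.

```python
import numpy as np
import pandas as pd
from scipy.optimize import curve_fit

def model(t, slope, intercept):
    # Hypothetical model: a straight line (the real model is not in the post)
    return slope * t + intercept

def fit_group(g):
    # Fit one group's time/y columns and return the parameters as a Series
    popt, _ = curve_fit(model,
                        g["time"].to_numpy(dtype=float),
                        g["y"].to_numpy(dtype=float))
    return pd.Series(popt, index=["slope", "intercept"])

# Invented sample data: two treatments, two replications each
df = pd.DataFrame({
    "treatment":   [8] * 6 + [10] * 6,
    "replication": [1, 1, 1, 2, 2, 2] * 2,
    "time":        [1, 2, 3] * 4,
    "y":           [1.0, 2.0, 3.0, 2.0, 4.0, 6.0,
                    1.5, 2.0, 2.5, 3.0, 3.0, 3.0],
})

# One row of fitted parameters per (treatment, replication) pair
fits = (df.groupby(["treatment", "replication"])[["time", "y"]]
          .apply(fit_group)
          .reset_index())
print(fits)
```

Note that `apply` still calls `fit_group` once per group in Python, so this is mainly tidier than an explicit loop rather than necessarily faster; the cost of `curve_fit` itself is likely to dominate.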

Tags: python pandas dataframe loops rows

【Solution 1】:

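from scipy.optimize import curve_fit

grouped_df = HL.groupby(["hr", "filename"])

for key, g in grouped_df:        # key is the (hr, filename) pair
    k = g.iloc[0]["NPQ"]         # initial NPQ value of this group
    # fit on this group's own time and NPQ columns (`model` is not shown in the post)
    popt, pcov = curve_fit(model, g["time"], g["NPQ"])
    fit_m = popt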
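The loop above overwrites `fit_m` on every pass, so only the last fit survives. A minimal sketch of collecting one result row per group into a new dataframe follows, using the question's `treatment` and `replication` columns. The exponential `model`, the parameter names `a` and `b`, the starting guess, and the synthetic data are assumptions for illustration only.

```python
import numpy as np
import pandas as pd
from scipy.optimize import curve_fit

def model(t, a, b):
    # Hypothetical model: exponential decay (the real model is not in the post)
    return a * np.exp(-b * t)

# Synthetic, noise-free data: two treatments with two replications each
t = np.arange(1, 7, dtype=float)
frames = []
for treatment, a, b in [(8, 1.0, 0.3), (10, 2.0, 0.5)]:
    for replication in (1, 2):
        frames.append(pd.DataFrame({
            "treatment": treatment,
            "replication": replication,
            "time": t,
            "y": model(t, a, b),
        }))
df = pd.concat(frames, ignore_index=True)

records = []
# Grouping by both columns yields one sub-frame per (treatment, replication) pair
for (treatment, replication), g in df.groupby(["treatment", "replication"]):
    k = g["y"].iloc[0]  # initial value of this group, used as the starting guess for a
    popt, pcov = curve_fit(model, g["time"].to_numpy(), g["y"].to_numpy(),
                           p0=[k, 0.1])
    records.append({"treatment": treatment, "replication": replication,
                    "a": popt[0], "b": popt[1]})

# New dataframe holding the group keys alongside the fitted parameters
results = pd.DataFrame(records)
print(results)
```

Appending dictionaries to a list and building the dataframe once at the end avoids growing a dataframe row by row inside the loop.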

【Comments】: