Multiple for loops to create a pandas dataframe

【Question title】: Multiple for loops to create a pandas dataframe
【Posted】: 2018-10-14 13:35:28
【Question】:

I am trying to create a pandas dataframe that looks like this:

          -5      0      5
index                     
-5       NaN  slope  slope
 0     slope    NaN  slope
 5     slope  slope    NaN

But the closest I can get is the code below, which returns a dataframe with only one column (the list from the last iteration of the ctr1 loop):

weather = np.linspace(-5, 5, 3)

for ctr1 in weather:
    slope_list = []
    X1 = round(ctr1,1)
    for ctr2 in weather:
        X2 = round(ctr2,1)

        Y1 = regressor[0] * X1**3 + \
        regressor[1] * X1**2 + \
        regressor[2] * X1 + \
        regressor[3] 

        Y2 = regressor[0] * X2**3 + \
        regressor[1] * X2**2 + \
        regressor[2] * X2 + \
        regressor[3]

        slope = (Y2-Y1)/(X2-X1)
        slope_list.append(slope)

    df_final = pd.DataFrame({X1:slope_list})

Can anyone help?

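【Comments on the question】:

Please provide the expected output.
Hi andrew_reece, the expected output is what I showed at the top, where each "slope" is a number (based on the calculation using X1, X2, Y1, Y2).

Tags: python pandas for-loop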

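【Solution 1】:

df_final only gets 3 elements because it sits at the same indentation level as for ctr2 in weather, so it is reassigned on every pass of the outer loop. However, if you fix only that, you get a dataframe with a single long column: there is just one slope_list, which keeps being appended to and is turned into a dataframe at the end.

Here is how I would solve this without changing your assignment approach: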

import numpy as np
import pandas as pd

# `regressor` holds the four cubic coefficients from the question (not shown there)
weather = np.linspace(-5, 5, 3)
slope_list = []
for ctr1 in weather:
    X1 = round(ctr1,1)
    for ctr2 in weather:
        X2 = round(ctr2,1)

        Y1 = regressor[0] * X1**3 + \
        regressor[1] * X1**2 + \
        regressor[2] * X1 + \
        regressor[3]

        Y2 = regressor[0] * X2**3 + \
        regressor[1] * X2**2 + \
        regressor[2] * X2 + \
        regressor[3]

        # when X1 == X2 this is 0/0; with NumPy floats that yields NaN plus a RuntimeWarning
        slope = (Y2-Y1)/(X2-X1)
        slope_list.append(slope)

#make it 3 columns and 3 rows as intended
slope_list = np.array(slope_list).reshape(3, 3)
#make the dataframe from the 2-D array (a dict with a single key would not work here)
df_final = pd.DataFrame(slope_list)
#manually add the desired row and column indexes
df_final = df_final.set_index(weather)
df_final.columns = weather

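Keep in mind, though, that unless you know exactly what you are doing, writing loops and nested loops when working with pandas usually means you are missing a simpler, better way to do it.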
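
As an illustration of that last point, the whole slope table can be computed without explicit loops by using NumPy broadcasting. This is a minimal sketch, not the answerer's code. The `regressor` values below are made-up placeholders (the question never posted them), and `np.polyval` is used on the assumption that `regressor` lists the cubic's coefficients from highest to lowest degree, as the loop code implies.

```python
import numpy as np
import pandas as pd

# Placeholder coefficients (assumption): y = 1*x**3 + 0*x**2 + 0*x + 0
regressor = np.array([1.0, 0.0, 0.0, 0.0])

weather = np.round(np.linspace(-5, 5, 3), 1)

# Evaluate the cubic at every grid point in one call
y = np.polyval(regressor, weather)

# Pairwise differences via broadcasting:
# element [i, j] uses X1 = weather[i] and X2 = weather[j]
dy = y[np.newaxis, :] - y[:, np.newaxis]
dx = weather[np.newaxis, :] - weather[:, np.newaxis]

# Divide only where X1 != X2, so the diagonal keeps its NaN fill value
slopes = np.full(dx.shape, np.nan)
np.divide(dy, dx, out=slopes, where=dx != 0)

df_final = pd.DataFrame(slopes, index=weather, columns=weather)
print(df_final)
```

Masking the division with `where=dx != 0` avoids the 0/0 warning that the loop version triggers on the diagonal, and the same code works unchanged for a longer `weather` array.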

【Comments】:

【Solution 2】:

You can try assigning values directly in the DataFrame. Just create an empty DataFrame with index=weather:

import numpy as np
import pandas as pd
weather = np.linspace(-5, 5, 3)
df_final = pd.DataFrame([], index=weather)
for ctr1 in weather:
    X1 = round(ctr1,1)
    for ctr2 in weather:
        X2 = round(ctr2,1)

        Y1 = regressor[0] * X1**3 + \
        regressor[1] * X1**2 + \
        regressor[2] * X1 + \
        regressor[3] 

        Y2 = regressor[0] * X2**3 + \
        regressor[1] * X2**2 + \
        regressor[2] * X2 + \
        regressor[3]

        slope = (Y2-Y1)/(X2-X1)

        # np.nan rather than np.NaN: the capitalized alias was removed in NumPy 2.0
        df_final.loc[X1, X2] = np.nan if X1 == X2 else slope

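【Comments】:

For setting values by integer position, I recommend the more efficient .iat instead of .loc.
Hi Alexey, thank you for your solution. For completeness, I thought I would share the only update I needed to make it work perfectly for a larger weather array. I have added my own solution, based mostly on your code.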
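
The `.iat` suggestion in the first comment needs one adjustment to work here: `.iat` addresses cells by integer position and cannot create new columns, so the frame has to be allocated at full size first. A rough sketch under that assumption follows. The `regressor` values are placeholders (the question never posted them), and `poly` is a helper name introduced only for this illustration.

```python
import numpy as np
import pandas as pd

regressor = [1.0, 0.0, 0.0, 0.0]  # placeholder coefficients (assumption)
weather = np.round(np.linspace(-5, 5, 3), 1)


def poly(x):
    """Cubic from the question (hypothetical helper name)."""
    return (regressor[0] * x**3 + regressor[1] * x**2
            + regressor[2] * x + regressor[3])


# Pre-allocate an all-NaN float frame so every cell already exists
df_final = pd.DataFrame(np.nan, index=weather, columns=weather)

for i, x1 in enumerate(weather):
    for j, x2 in enumerate(weather):
        if i != j:  # leave the diagonal as NaN
            df_final.iat[i, j] = (poly(x2) - poly(x1)) / (x2 - x1)

print(df_final)
```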

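【Solution 3】:

slope_list = [] resets the result list on every iteration, so only the last one is kept. You need to define the result list outside the outer loop and append the sub-results to it.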
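
One way to read this suggestion while still ending up with a 3×3 table, rather than a flat list of nine values, is to keep two lists: an outer one created before the loops and an inner one reset for each row. The sketch below is an interpretation, not the answerer's code. `regressor` holds placeholder coefficients and `poly` is a helper name introduced here.

```python
import numpy as np
import pandas as pd

regressor = [1.0, 0.0, 0.0, 0.0]  # placeholder coefficients (assumption)
weather = np.round(np.linspace(-5, 5, 3), 1)


def poly(x):
    """Cubic from the question (hypothetical helper name)."""
    return (regressor[0] * x**3 + regressor[1] * x**2
            + regressor[2] * x + regressor[3])


rows = []  # defined once, outside both loops
for x1 in weather:
    row = []  # reset for each X1; collects one row of the table
    for x2 in weather:
        if x1 == x2:
            row.append(np.nan)  # slope is undefined on the diagonal
        else:
            row.append((poly(x2) - poly(x1)) / (x2 - x1))
    rows.append(row)  # keep the finished row before moving on

df_final = pd.DataFrame(rows, index=weather, columns=weather)
print(df_final)
```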

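【Comments】:

Thanks Kosist. The reason I reset slope_list inside the outer loop is that I only want 3 rows per column. If I define it outside the outer loop, I end up with a list of 9 values.
Could you also post the regressor data (better than the complete code) so that we can try running it? Because right now you reset the data this way, which is why you only get the one column from the last ctr1 iteration.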

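【Solution 4】:

As mentioned, for completeness I have posted the answer to my own question, which works for a larger weather array. The only difference is that I do the rounding earlier in the code: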

weather = np.round(np.linspace(-5, 35, 401), decimals = 1)
df_final = pd.DataFrame([], index=weather)
for ctr1 in weather:
    X1 = ctr1
    for ctr2 in weather:
        X2 = ctr2

        Y1 = regressor[0] * X1**3 + \
        regressor[1] * X1**2 + \
        regressor[2] * X1 + \
        regressor[3] 

        Y2 = regressor[0] * X2**3 + \
        regressor[1] * X2**2 + \
        regressor[2] * X2 + \
        regressor[3]

        slope = (Y2-Y1)/(X2-X1)

        # np.nan rather than np.NaN: the capitalized alias was removed in NumPy 2.0
        df_final.loc[X1, X2] = np.nan if X1 == X2 else slope
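
Rounding the grid up front matters because the values double as row and column labels: unrounded `np.linspace` output can carry floating-point error, so a label may not compare equal to the literal you would type in `.loc`. A small sketch of that check, using only the grid from this answer:

```python
import numpy as np

raw = np.linspace(-5, 35, 401)
weather = np.round(raw, decimals=1)

# After rounding, labels match the literals you would type
print(0.1 in weather)

# Largest deviation between the raw and rounded grids
# (tiny, but enough to break exact label lookups)
print(np.abs(raw - weather).max())

# Rounding to one decimal keeps all 401 labels distinct
print(len(np.unique(weather)) == len(weather))
```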

【Comments】: