【Question title】: Iterate over a list, apply a function and set each output to rows of a pandas dataframe
【Posted】: 2020-09-11 04:33:52
【Question description】:

So I have the following data, which come from two different pandas dataframes:

lis = [] 

for index, rows in full.iterrows(): 
    my_list = [rows.ARIEL, rows.LBHD, rows.LFHD, rows.RFHD, rows.RBHD]    
    lis.append(my_list) 

lis2 = []

for index, rows in reduced.iterrows(): 
    my_list = rows.bar_head
    lis2.append(my_list) 

For example, part of lis and lis2 looks like this:

lis = [[[-205.981, 1638.787, 1145.274], [-264.941, 1482.371, 1168.693], [-263.454, 1579.4370000000001, 1016.279], [-148.062, 1592.005, 1016.75], [-134.313, 1479.1429999999998, 1167.109]], ...

lis2 = [[-203.3502, 1554.3486, 1102.821], [-203.428, 1554.3492, 1103.0592], [-203.4954, 1554.3234, 1103.2794], [-203.5022, 1554.2974, 1103.4522], ...

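What I want is to use lis and lis2 with the following apply call (where mdf is another, empty dataframe of the same length as the other two, and md is a function I wrote):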

 mdf['head_md'] = mdf['head_md'].apply(md, args=(5, lis, lis2))

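But the way it works now, it writes the same result to every row of mdf.

What I want is for it to loop over lis and lis2 and, according to the index, write the corresponding result to the corresponding row of mdf. All dataframes and variables have length 7446.

I tried this, but it doesn't work: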

for i in range(len(mdf)):
    for j in range(0, 5):
        mdf['head_md'] = mdf['head_md'].apply(md, args=(5, lis[i][j], lis2[i]))

Let me know if you need more information about the code. Thanks in advance!

Edit: sample of the dataframes:

bar_head
0   [-203.3502, 1554.3486, 1102.821]
1   [-203.428, 1554.3492, 1103.0592]
2   [-203.4954, 1554.3234, 1103.2794]
3   [-203.5022, 1554.2974, 1103.4522]
4   [-203.5014, 1554.2948, 1103.6594]

  ARIEL   LBHD    LFHD    RBHD    RFHD
0   [-205.981, 1638.787, 1145.274]  [-264.941, 1482.371, 1168.693]  [-263.454, 1579.4370000000001, 1016.279]    [-134.313, 1479.1429999999998, 1167.109]    [-148.062, 1592.005, 1016.75]
1   [-206.203, 1638.649, 1145.734]  [-264.85400000000004, 1482.069, 1168.776]   [-263.587, 1579.6129999999998, 1016.627]    [-134.286, 1479.0839999999998, 1167.076]    [-148.21, 1592.3310000000001, 1017.0830000000001]
2   [-206.37599999999998, 1638.531, 1146.135]   [-264.803, 1481.8210000000001, 1168.8519999999...   [-263.695, 1579.711, 1016.922]  [-134.265, 1478.981, 1167.104]  [-148.338, 1592.5729999999999, 1017.3839999999...
3   [-206.493, 1638.405, 1146.519]  [-264.703, 1481.5439999999999, 1168.95] [-263.742, 1579.8139999999999, 1017.207]    [-134.15200000000002, 1478.922, 1167.112]   [-148.421, 1592.8020000000001, 1017.4730000000...
4   [-206.56900000000002, 1638.33, 1146.828]    [-264.606, 1481.271, 1169.0330000000001]    [-263.788, 1579.934, 1017.467]  [-134.036, 1478.888, 1167.289]  [-148.50799999999998, 1593.0510000000002, 1017...

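【Comments】:

  • Do you want to merge lis and lis2? Please show one row of your expected result, i.e. of mdf.
  • I don't want to merge lis and lis2. What the function md does is take lis[0][0] and lis2[0], and so on, subtract them element by element, sum the squares and return the result. So for each row of mdf I want a single number as output.
  • Are you operating row-wise on reduced.bar_head, full.ARIEL; reduced.bar_head, full.LBHD; reduced.bar_head, full.LFHD; reduced.bar_head, full.RFHD; reduced.bar_head, full.RBHD? Is the value in each column a list of three floats? You should provide a sample of the DataFrames - df[columns].head() - and your function.
  • @wwii I edited the question to include this information
  • Is each column a list or an ndarray?

Tags: python pandas loops apply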


【Solution 1】:

If the items in the columns of full and reduced are lists, first convert them to numpy ndarrays.

ariel = np.array(full.ARIEL.to_list())
lbhd = np.array(full.LBHD.to_list())
lfhd = np.array(full.LFHD.to_list())
rfhd = np.array(full.RFHD.to_list())
rbhd = np.array(full.RBHD.to_list())

barhead = np.array(reduced.bar_head.to_list())

Using broadcasting, subtract barhead from ariel, square the result and sum along the last axis (assuming I understood the comment about your function).

a = np.sum(np.square(ariel-barhead[:,None,:]),-1)
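The shapes involved in that one-liner can be easier to follow when spelled out step by step. The sketch below uses made-up stand-in arrays (not the real data) with 5 rows for ariel and 4 rows for barhead, and checks one entry against a plain Python sum.

```python
import numpy as np

# Stand-in data (hypothetical values): 5 rows of ARIEL points, 4 rows of bar_head points
ariel = np.arange(15, dtype=float).reshape(5, 3)
barhead = np.ones((4, 3))

expanded = barhead[:, None, :]        # shape (4, 1, 3): a new axis for broadcasting
diff = ariel - expanded               # broadcasts to shape (4, 5, 3)
a = np.sum(np.square(diff), axis=-1)  # sum over x, y, z -> shape (4, 5)

# a[i, j] is the sum of squared differences between ariel[j] and barhead[i]
check = sum((ariel[2, k] - barhead[1, k]) ** 2 for k in range(3))
print(a.shape, a[1, 2], check)
```

Note that every row of barhead is combined with every row of ariel here, so the result holds all pairs, not only the pairs that share an index.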

With the setup below, the result is a (4,5) array of values (rounded to two places).

>>> a #  a[0]     a[1]     a[2]     a[3]     a[4]
array([[8939.02, 8956.22, 8971.93, 8984.87, 8999.85],  # b[0]
       [8918.35, 8935.3 , 8950.79, 8963.53, 8978.35],  # b[1]
       [8903.82, 8920.53, 8935.82, 8948.36, 8963.04],  # b[2]
       [8893.7 , 8910.24, 8925.38, 8937.78, 8952.34]]) # b[3]

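You seem to want a one-dimensional sequence of results: a.ravel() produces a 1-D array ordered like this (here a[j] stands for row j of ariel and b[i] for row i of barhead):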

[(a[0]:b[0]),(a[1]:b[0]),(a[2]:b[0]),...,(a[0]:b[1]),(a[1]:b[1]),...,(a[0]:b[2]),...]
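A small sketch of that ordering, using a made-up 2x3 array so the positions are easy to read: ravel() walks the array row by row, and np.unravel_index maps a flat position back to its (row, column) pair.

```python
import numpy as np

# Hypothetical values: rows play the role of barhead indices, columns of ariel indices
a = np.array([[10, 11, 12],
              [20, 21, 22]])

flat = a.ravel()                        # row-major order: all of row 0, then row 1
i, j = np.unravel_index(4, a.shape)     # which (row, column) does flat[4] come from?
print(flat, (int(i), int(j)))
```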

Do the same for the other four columns of full.

lb = np.sum(np.square(lbhd-barhead[:,None,:]),-1)
lf = np.sum(np.square(lfhd-barhead[:,None,:]),-1)
rf = np.sum(np.square(rfhd-barhead[:,None,:]),-1)
rb = np.sum(np.square(rbhd-barhead[:,None,:]),-1)

Again assuming I understand your process, the result will be 100 values (using the setup below).

     full          reduced
(rows * columns) *  (rows)

x = np.concatenate([a.ravel(),lb.ravel(),lf.ravel(),rf.ravel(),rb.ravel()])
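As a sanity check on the count, the sketch below repeats the same steps on random stand-in data with the same dimensions as the setup (5 rows and 5 columns in full, 4 rows in reduced). The names n_full, n_cols and n_reduced are only illustrative.

```python
import numpy as np

n_full, n_cols, n_reduced = 5, 5, 4
rng = np.random.default_rng(0)

# One (n_full, 3) array per column of full, plus the bar_head points (random stand-ins)
columns = [rng.random((n_full, 3)) for _ in range(n_cols)]
barhead = rng.random((n_reduced, 3))

parts = [np.sum(np.square(c - barhead[:, None, :]), -1).ravel() for c in columns]
x = np.concatenate(parts)

print(x.shape)  # length is n_full * n_cols * n_reduced
```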

Setup

import numpy as np
import pandas as pd

lis = [[[-205.981, 1638.787, 1145.274],[-264.941, 1482.371, 1168.693],[-263.454, 1579.437, 1016.279],[-134.313, 1479.1429, 1167.109],[-148.062, 1592.005, 1016.75]],
       [[-206.203, 1638.649, 1145.734],[-264.854, 1482.069, 1168.776],[-263.587, 1579.6129, 1016.627],[-134.286, 1479.0839, 1167.076],[-148.21, 1592.331, 1017.083]],
       [[-206.3759, 1638.531, 1146.135],[-264.803, 1481.821, 1168.85199],[-263.695, 1579.711, 1016.922],[-134.265, 1478.981, 1167.104],[-148.338, 1592.5729, 1017.3839]],
       [[-206.493, 1638.405, 1146.519],[-264.703, 1481.5439, 1168.95],[-263.742, 1579.8139, 1017.207],[-134.152, 1478.922, 1167.112],[-148.421, 1592.802, 1017.473]],
       [[-206.569, 1638.33, 1146.828],[-264.606, 1481.271, 1169.033],[-263.788, 1579.934, 1017.467],[-134.036, 1478.888, 1167.289],[-148.5079, 1593.051, 1017.666]]]


barhd = [[[-203.3502, 1554.3486, 1102.821]],
        [[-203.428, 1554.3492, 1103.0592]],
        [[-203.4954, 1554.3234, 1103.2794]],
        [[-203.5022, 1554.2974, 1103.4522]]]

full = pd.DataFrame(lis, columns=['ARIEL', 'LBHD', 'LFHD', 'RFHD', 'RBHD'])
reduced = pd.DataFrame(barhd,columns=['bar_head'])


【Solution 2】:

I hope I understood you; is this what you want? v is lis and v2 is lis2.

An arithmetic operation between a 3D array and a 2D array.

import numpy as np
na = np.array  # short alias for np.array

v=na([[[1, 2, 3], [4, 5, 6]], [[7,  8,  9],[ 10,  11, 1]]])

v2=na([[1, 2, 3], [4, 5, 6], [7,  8,  9],[ 10,  11, 12]])

lst = []
for a in v:
    for b in a:
        for a2 in v2:
            lst.append(b+a2) # you can do any arithmetic functions
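
For addition and other element-wise operations, the triple loop can probably be replaced by broadcasting. The sketch below reuses the same v and v2 and checks that the broadcast result, flattened to rows of three values, matches the loop output in the same order. The names vec and as_rows are only illustrative.

```python
import numpy as np

v = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 1]]])   # shape (2, 2, 3)
v2 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])     # shape (4, 3)

# Loop version, as in the answer
lst = []
for a in v:
    for b in a:
        for a2 in v2:
            lst.append(b + a2)

# Broadcast version: add an axis to v so each point meets every row of v2
vec = v[:, :, None, :] + v2[None, None, :, :]   # shape (2, 2, 4, 3)
as_rows = vec.reshape(-1, 3)                    # same order as the loop

print(np.array_equal(np.array(lst), as_rows))
```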
    
