Pandas: Replace and append entire list in a bigger pandas dataframe

【Question title】: Pandas: Replace and append entire list in a bigger pandas dataframe
【Posted】: 2021-09-29 12:45:07
【Question description】:

I have a pandas dataframe that looks like this

URLs          | Feature1 | Feature2  | Feature3
yahoo.com     | nan      | nan       | nan 
economist.com | nan      | nan       | nan 
facebook.com  | nan      | nan       | nan

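I compute the features with a user-defined function that analyzes the URLs one at a time and returns the features as a list.

I iterate over the dataset to compute the features. The first iteration returns, say, [1,2,3] for yahoo; later iterations return [9,10,0] for economist and [0,8,10] for facebook.

My question is: inside the loop, how do I write each list into the dataframe so that after the first iteration it looks like this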

URLs          | Feature1 | Feature2  | Feature3
yahoo.com     | 1        | 2         | 3 
economist.com | nan      | nan       | nan 
facebook.com  | nan      | nan       | nan

then after the next iteration

URLs          | Feature1 | Feature2  | Feature3
yahoo.com     | 1        | 2         | 3 
economist.com | 9        | 10        | 0 
facebook.com  | nan      | nan       | nan

and finally,

URLs          | Feature1 | Feature2  | Feature3
yahoo.com     | 1        | 2         | 3 
economist.com | 9        | 10        | 0 
facebook.com  | 0        | 8         | 10

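This is the result I want to store.

Permutations of list.append and pandas.replace did not get me anywhere. I couldn't find questions where people do something like this, or maybe I don't know how to phrase my search. Any help is much appreciated.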

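【Question comments】:

You can append these values to a list, turn that list into a dataframe, and then assign it back
@AnuragDabas So there is no direct cut-and-paste kind of operation, right? I will have to go the long way?
You can assign it inside the for loop with the .loc accessor; that is the direct way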
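
The .loc suggestion from the last comment could look roughly like the sketch below. It is only an illustration: compute_features is a hypothetical stand-in for the asker's user-defined function (its lookup table just reproduces the numbers from the question), and feature_cols is a helper name introduced here.

```python
import numpy as np
import pandas as pd

feature_cols = ["Feature1", "Feature2", "Feature3"]

# Same shape as the dataframe in the question: URLs plus empty feature columns.
df = pd.DataFrame({
    "URLs": ["yahoo.com", "economist.com", "facebook.com"],
    "Feature1": np.nan,
    "Feature2": np.nan,
    "Feature3": np.nan,
})


def compute_features(url):
    # Hypothetical placeholder for the real feature function; returns one list per URL.
    lookup = {"yahoo.com": [1, 2, 3], "economist.com": [9, 10, 0]}
    return lookup.get(url, [0, 8, 10])


# Fill one row per iteration: the row label selects the row,
# feature_cols selects the three cells that receive the list.
for idx, url in df["URLs"].items():
    df.loc[idx, feature_cols] = compute_features(url)
```

Because the feature columns start out as NaN they are float columns, so the values are stored as floats; if integers are needed, df[feature_cols].astype(int) can be applied once every row is filled.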

Tags: python pandas list dataframe replace

【Solution 1】:

You can use .apply to run the function on each row and write the returned values into the relevant columns

In [1]: import pandas as pd

In [2]: df = pd.DataFrame(
   ...:     [["yahoo.com", None, None, None],
   ...:      ["economist.com", None, None, None],
   ...:      ["facebook.com", None, None, None]],
   ...:     columns=["URLs", "Feature1", "Feature2", "Feature3"])

In [3]: df
Out[3]:
            URLs Feature1 Feature2 Feature3
0      yahoo.com     None     None     None
1  economist.com     None     None     None
2   facebook.com     None     None     None

In [4]: def some_func(url):
   ...:     if url == "yahoo.com":
   ...:         return 1, 2, 3
   ...:     if url == "economist.com":
   ...:         return 9, 10, 0
   ...:     return 0, 8, 10
   ...:

In [5]: df[['Feature1', 'Feature2', 'Feature3']] = df.apply(lambda row: some_func(row['URLs']), axis=1, result_type='expand')

In [6]: df
Out[6]:
            URLs  Feature1  Feature2  Feature3
0      yahoo.com         1         2         3
1  economist.com         9        10         0
2   facebook.com         0         8        10
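
The first comment on the question describes another route: collect the returned lists, build a dataframe from them, and assign it back in one step. A minimal sketch of that idea, reusing some_func from the session above; the names feature_cols, rows, and features are introduced here for illustration only.

```python
import pandas as pd

feature_cols = ["Feature1", "Feature2", "Feature3"]

df = pd.DataFrame(
    [["yahoo.com", None, None, None],
     ["economist.com", None, None, None],
     ["facebook.com", None, None, None]],
    columns=["URLs"] + feature_cols,
)


def some_func(url):
    # Same toy function as in the answer above.
    if url == "yahoo.com":
        return 1, 2, 3
    if url == "economist.com":
        return 9, 10, 0
    return 0, 8, 10


# Collect one result per URL, in the same order as the rows of df.
rows = [some_func(url) for url in df["URLs"]]

# Build a dataframe that shares df's index so the rows line up on assignment.
features = pd.DataFrame(rows, columns=feature_cols, index=df.index)
df[feature_cols] = features
```

This touches the dataframe only once, after all features are computed, which may be preferable when the intermediate, partially filled states do not need to be inspected.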

【Comments】: