Python Pandas Update / Replace

【Question title】: Python Pandas Update / Replace
【Posted】: 2015-11-10 17:29:56
【Question description】:

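I'm struggling to get to grips with update/merge/join and so on to achieve what I thought would be a simple merge or update. I have df (recorded values) and df2 (a time series of 0s), and I want to update/replace the values in df2 with the corresponding recorded values from df, matching on Date_Time.

Example: df =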

Date_Time   Perc
03/03/2010 00:05    1.0
03/03/2010 00:15    2.0

df2 =

  Date_Time Perc
03/03/2010 00:00    0.0
03/03/2010 00:05    0.0
03/03/2010 00:10    0.0
03/03/2010 00:15    0.0
03/03/2010 00:20    0.0

The result I want returned is:

Date_Time   Perc
03/03/2010 00:00    0.0
03/03/2010 00:05    1.0
03/03/2010 00:10    0.0
03/03/2010 00:15    2.0
03/03/2010 00:20    0.0

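I find this frustrating, because there is good information and examples at http://pandas.pydata.org/pandas-docs/stable/merging.html, and there is a good SO question, How to update values in a specific row in a Python Pandas DataFrame?, with several solutions, yet I have tried a number of approaches and none has worked.

Tried so far: re-indexing on Date_Time with variants of df2.update(df), multiple merge/join/concat variants, an adapted function using apply (below)... and I am now wondering whether I need to use iterrows (see below?). Any pointers in the right direction would be much appreciated... perhaps I am missing something fundamental in my approach...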

def update_vals(row, data=data):
    if row.Date_Time == df.Date_Time:
        row.Perc = df.Perc
    return row

for index, row in df2['Date_Time'].iterrows():
    x = df2['Date_Time']
    for index, row in df['Date_Time'].iterrows():
        x2 = df['Date_Time']
        if x2 ==x:
            df2['Perc'] = df['Perc']

I thought this would also work, but it raises (ValueError: cannot reindex from a duplicate axis):

df.set_index('Date_Time', inplace=True)
df2.set_index('Date_Time', inplace=True)
df2.update(df[['Perc']])

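【Comments】:

df2.update(df) works for me.
@hellpanderrr That's odd. When I run it, my output loses part of df2 (see below, the 00:00 timestamp is gone). I want to keep the full df2 and only change the "Perc" values at matching times:

   Date_Time         Perc
0  03/03/2010 00:05  1.0
1  03/03/2010 00:15  2.0
2  03/03/2010 00:10  0.0
3  03/03/2010 00:15  0.0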

Tags: python pandas

【Solution 1】:

df2

                  Perc
Date_Time   
03/03/2010 00:00    0
03/03/2010 00:05    0
03/03/2010 00:10    0
03/03/2010 00:15    0
03/03/2010 00:20    0

df

                 Perc
Date_Time   
03/03/2010 00:05    1
03/03/2010 00:15    2

df2.update(df)

                 Perc
Date_Time   
03/03/2010 00:00    0
03/03/2010 00:05    1
03/03/2010 00:10    0
03/03/2010 00:15    2
03/03/2010 00:20    0
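
For anyone who wants to reproduce the steps above end to end, here is a minimal sketch. It assumes both frames start with Date_Time as an ordinary column of strings (as in the question) and that Date_Time is unique in each frame; the variable name result is only for illustration.

```python
import pandas as pd

# Recorded values (df) and the zero-filled time series (df2) from the question.
df = pd.DataFrame({
    "Date_Time": ["03/03/2010 00:05", "03/03/2010 00:15"],
    "Perc": [1.0, 2.0],
})
df2 = pd.DataFrame({
    "Date_Time": [
        "03/03/2010 00:00",
        "03/03/2010 00:05",
        "03/03/2010 00:10",
        "03/03/2010 00:15",
        "03/03/2010 00:20",
    ],
    "Perc": [0.0, 0.0, 0.0, 0.0, 0.0],
})

# update() aligns on the index, so Date_Time has to be the index of both frames.
df = df.set_index("Date_Time")
df2 = df2.set_index("Date_Time")

# update() modifies df2 in place and returns None; rows of df2 with no
# matching Date_Time in df keep their original value.
df2.update(df)

# Optional: turn Date_Time back into a regular column.
result = df2.reset_index()
print(result)
```

Note that df2.update(df) does not return the updated frame, so the table shown above is what df2 looks like after the call.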

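【Comments】:

Thank you very much for your help... I have to admit you were right: in my actual dataset of over 500k rows there was a duplicate Date_Time, which threw off the update process... Still, I learned a lot, and perhaps this will help someone else... many thanks.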
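
Since the cause turned out to be a duplicate Date_Time, it may be worth checking for duplicates before calling update. The following is only a sketch on made-up data: the thread does not say which frame held the duplicate or which row should be kept, so the duplicated row here and the keep="last" choice are assumptions for illustration.

```python
import pandas as pd

# Hypothetical data with one duplicated timestamp (00:15 appears twice).
df = pd.DataFrame({
    "Date_Time": ["03/03/2010 00:05", "03/03/2010 00:15", "03/03/2010 00:15"],
    "Perc": [1.0, 2.0, 3.0],
})

# List every row whose Date_Time occurs more than once, for inspection.
dupes = df[df["Date_Time"].duplicated(keep=False)]
print(dupes)

# Assumption: keep the last occurrence of each timestamp. Keeping the first,
# or aggregating the duplicates, may suit other data better.
deduped = df.drop_duplicates(subset="Date_Time", keep="last")

# With unique timestamps, the frame can safely be indexed and passed to update().
print(deduped["Date_Time"].is_unique)
```

The same check can be run on df2; once both frames have unique Date_Time values, setting the index and calling df2.update(df) should no longer hit the duplicate-axis error.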