【Posted】: 2021-05-28 10:57:40
【Question】:
Goal:
I want to merge two data frames by matching on a unique number and on dates that fall within +/-7 days of each other.
Data:
df1
Number  Report         DateDone
1       some words     13/1/2021
1       more stuff     21/8/2021
44      balbla         11/4/2020
2       gobbledy bla   01/03/2019
44      rara rasputin  13/10/2021
44      tree frogs     11/10/2010
df2
Number  Report           DateDone
1       hocum poklum     11/1/2021
1       jimmeny cricket  21/8/2021
44      it wasnt me      11/2/2020
2       its not really   6/03/2019
44      im innocent      12/10/2021
44      bullfrogs        11/01/2010
Expected result
Number.df1  Report.df1     DateDone.df1  Number.df2  Report.df2       DateDone.df2
1           some words     13/1/2021     1           hocum poklum     11/1/2021
1           more stuff     21/8/2021     1           jimmeny cricket  21/8/2021
2           gobbledy bla   01/03/2019    2           its not really   6/03/2019
44          rara rasputin  13/10/2021    44          im innocent      12/10/2021
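For reference, the expected result above can probably be reproduced in plain pandas, without SQL, by merging on Number and then filtering on the date gap. This is only a sketch: it assumes the dates are day-first (13/1/2021 can only be 13 January) and that the tables are small enough for the intermediate merge, which pairs every df1 row with every df2 row sharing a Number, to fit in memory. The names pairs, gap and result are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Number": [1, 1, 44, 2, 44, 44],
    "Report": ["some words", "more stuff", "balbla",
               "gobbledy bla", "rara rasputin", "tree frogs"],
    "DateDone": ["13/1/2021", "21/8/2021", "11/4/2020",
                 "01/03/2019", "13/10/2021", "11/10/2010"],
})
df2 = pd.DataFrame({
    "Number": [1, 1, 44, 2, 44, 44],
    "Report": ["hocum poklum", "jimmeny cricket", "it wasnt me",
               "its not really", "im innocent", "bullfrogs"],
    "DateDone": ["11/1/2021", "21/8/2021", "11/2/2020",
                 "6/03/2019", "12/10/2021", "11/01/2010"],
})

# Assumption: all dates are day/month/year.
for frame in (df1, df2):
    frame["DateDone"] = pd.to_datetime(frame["DateDone"], format="%d/%m/%Y")

# Step 1: pair rows that share the same Number.
pairs = df1.merge(df2, on="Number", suffixes=(".df1", ".df2"))

# Step 2: keep only the pairs whose dates are at most 7 days apart.
gap = (pairs["DateDone.df1"] - pairs["DateDone.df2"]).abs()
result = pairs[gap <= pd.Timedelta(days=7)].reset_index(drop=True)

print(result)
```

Because the merge key is shared, the output has a single Number column rather than Number.df1 and Number.df2, and the rows come out in df1 order rather than sorted by Number.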
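I was planning to use a SQL-style merge similar to the one I found here, but I am struggling to work out how to combine the number match with the date range. Do I need to compute the dates 7 days before and after DateDone in df1? Surely there is a more efficient way than having to compute two new columns first?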
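import pandas as pd

# Assumes conn is an open database connection holding tables df1 and df2,
# that df1 already has DateDone_start / DateDone_end columns (DateDone -/+ 7 days),
# and that all dates are stored in a sortable format such as YYYY-MM-DD.
qry = '''
select
    df1.DateDone_start as TermStart,
    df1.DateDone_end as TermEnd,
    df2.DateDone as df2Start,
    df1.Number as Number_df1,
    df2.Number as Number_df2
from
    df1 join df2 on
    df1.Number = df2.Number
    and df2.DateDone between df1.DateDone_start and df1.DateDone_end
'''
df = pd.read_sql_query(qry, conn)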
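A join on both conditions does not strictly need precomputed start and end columns. The sketch below is one possible way to do it, assuming an in-memory SQLite database (the question does not say which database conn points to) and day-first dates. It stores the dates as ISO text and compares them with SQLite's julianday() function inside the join condition. The output column names and the variable matched are illustrative.

```python
import sqlite3

import pandas as pd

cols = ["Number", "Report", "DateDone"]
df1 = pd.DataFrame([
    (1, "some words", "13/1/2021"), (1, "more stuff", "21/8/2021"),
    (44, "balbla", "11/4/2020"), (2, "gobbledy bla", "01/03/2019"),
    (44, "rara rasputin", "13/10/2021"), (44, "tree frogs", "11/10/2010"),
], columns=cols)
df2 = pd.DataFrame([
    (1, "hocum poklum", "11/1/2021"), (1, "jimmeny cricket", "21/8/2021"),
    (44, "it wasnt me", "11/2/2020"), (2, "its not really", "6/03/2019"),
    (44, "im innocent", "12/10/2021"), (44, "bullfrogs", "11/01/2010"),
], columns=cols)

# Assumption: an in-memory SQLite database stands in for the real connection.
conn = sqlite3.connect(":memory:")
for name, frame in (("df1", df1), ("df2", df2)):
    out = frame.copy()
    # Store dates as ISO text (YYYY-MM-DD) so that julianday() can parse them.
    parsed = pd.to_datetime(out["DateDone"], format="%d/%m/%Y")
    out["DateDone"] = parsed.dt.strftime("%Y-%m-%d")
    out.to_sql(name, conn, index=False)

qry = '''
select
    df1.Number   as Number_df1,
    df1.Report   as Report_df1,
    df1.DateDone as DateDone_df1,
    df2.Number   as Number_df2,
    df2.Report   as Report_df2,
    df2.DateDone as DateDone_df2
from df1
join df2
  on df1.Number = df2.Number
 and abs(julianday(df1.DateDone) - julianday(df2.DateDone)) <= 7
order by df1.Number
'''
matched = pd.read_sql_query(qry, conn)
conn.close()

print(matched)
```

The date comparison is evaluated per candidate pair inside the join, so no DateDone_start or DateDone_end columns are required.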
【Comments】: