【Posted】:2023-04-11 10:51:01
【Question】:
I have two files.
input.csv
11/13/2020 07:41:09 TREE count1: id1 green001
11/13/2020 07:43:09 TREE count1: id1 black001
11/13/2020 07:45:09 TREE count1: id2 black001
11/13/2020 07:45:09 PLAN count1: id3 green002
11/13/2020 07:45:09 PLAN count1: id4 green004
lookup.csv
ID,item,message
id1,item1,message 1
id2,item2,message 2
id3,item3,message 3
I am trying to merge these two files and expect the output below. Expected output:
Time,Type,counts,id,item,message,colour
11/13/2020 07:41:09,TREE,count1,id1,item1,message 1,green001
11/13/2020 07:43:09,TREE,count1,id1,item1,message 1,black001
11/13/2020 07:45:09,TREE,count1,id2,item2,message 2,black001
11/13/2020 07:45:09,PLAN,count1,id3,item3,message 3,green002
11/13/2020 07:45:09,PLAN,count1,id4, , ,green004
I am able to do the merge when the ID value exists in the lookup file. Code:
import pandas as pd
# read input and remove spurious : at end of count
input = pd.read_csv("input.csv", sep=' ',
                    names=["date", "time", "tree", "count", "ID", "info"])
input["count"] = input["count"].apply(lambda s:s[:-1])
# read lookup and merge
lookup = pd.read_csv("lookup.csv")
merged = input.merge(lookup, on="ID")
# collapse time and date to single column
merged["time"] = merged["date"] + " " + merged["time"]
del merged["date"]
# output
print(merged)
merged.to_csv("testme.csv", index=False)
The code works fine if every ID value in input.csv also exists in lookup.csv, but it fails when an ID value is missing from lookup.csv (id4 in the sample above).
Any suggestions would be helpful.
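One direction that may help, offered as a minimal sketch rather than a definitive fix: `DataFrame.merge` defaults to `how="inner"`, which drops input rows whose ID has no match in the lookup. Passing `how="left"` keeps every input row and leaves `item` and `message` as NaN where the lookup has no entry. The sketch below makes several assumptions that are not in the code above: it targets Python 3, it reads the sample data from in-memory strings via `io.StringIO` instead of the real files, the name `records` is illustrative, the column names are taken from the expected output, and the single-space filler is only a guess at what the blank fields in the expected output mean.

```python
import io

import pandas as pd

# Sample data copied from the question; in-memory stand-ins for
# input.csv and lookup.csv (an assumption made to keep this self-contained).
INPUT_TEXT = """\
11/13/2020 07:41:09 TREE count1: id1 green001
11/13/2020 07:43:09 TREE count1: id1 black001
11/13/2020 07:45:09 TREE count1: id2 black001
11/13/2020 07:45:09 PLAN count1: id3 green002
11/13/2020 07:45:09 PLAN count1: id4 green004
"""
LOOKUP_TEXT = """\
ID,item,message
id1,item1,message 1
id2,item2,message 2
id3,item3,message 3
"""

# Read the space-separated input and strip the trailing ':' from the count field.
records = pd.read_csv(io.StringIO(INPUT_TEXT), sep=" ",
                      names=["date", "time", "Type", "counts", "ID", "colour"])
records["counts"] = records["counts"].str.rstrip(":")

lookup = pd.read_csv(io.StringIO(LOOKUP_TEXT))

# how="left" keeps every input row; IDs missing from the lookup (id4 here)
# get NaN in the item and message columns instead of being dropped.
merged = records.merge(lookup, on="ID", how="left")

# Collapse date and time into one column and order columns as in the expected output.
merged["Time"] = merged["date"] + " " + merged["time"]
merged = merged.rename(columns={"ID": "id"})
merged = merged[["Time", "Type", "counts", "id", "item", "message", "colour"]]

# Assumed filler: a single space for missing lookup values.
merged[["item", "message"]] = merged[["item", "message"]].fillna(" ")

print(merged.to_csv(index=False))
```

With the real files, the two `io.StringIO(...)` arguments would be replaced by `"input.csv"` and `"lookup.csv"`, and the final `print` by `merged.to_csv("testme.csv", index=False)`. If the missing fields should be truly empty rather than a space, the `fillna` line can be dropped, since `to_csv` writes NaN as an empty field by default.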
【Discussion】:
Tags: python pandas python-2.7 dataframe merge