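【Posted】: 2015-08-07 08:40:35
【Question】:
I have a CSV file in which each row holds one record (the columns are ["Location"], ["MovieDate"], ["Formatted_Address"], ["Lat"], ["Lng"]). I want to group by Location and combine the MovieDate values of all rows that share the same Location value. I was told to use OrderedDict.
Data before: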
Location,MovieDate,Formatted_Address,Lat,Lng
"Edgebrook Park, Chicago ",Jun-7 A League of Their Own,"Edgebrook Park, 6525 North Hiawatha Avenue, Chicago, IL 60646, USA",41.9998876,-87.7627672
"Edgebrook Park, Chicago ","Jun-9 It's a Mad, Mad, Mad, Mad World","Edgebrook Park, 6525 North Hiawatha Avenue, Chicago, IL 60646, USA",41.9998876,-87.7627672
For every set of rows with the same location (as in the example above), I want output like this, so that no location is repeated:
"Edgebrook Park, Chicago ","Jun-7 A League of Their Own Jun-9 It's a Mad, Mad, Mad, Mad World","Edgebrook Park, 6525 North Hiawatha Avenue, Chicago, IL 60646, USA",41.9998876,-87.7627672
What is wrong with my code that uses OrderedDict to do this?
from collections import OrderedDict
import csv

od = OrderedDict()
with open("MovieDictFormatted.csv") as f, open("MoviesCombined.csv", "w") as out:
    r = csv.reader(f)
    wr = csv.writer(out)
    header = next(r)
    for row in r:
        loc, rest = row[0], row[1]
        od.setdefault(loc, []).append(rest)
    wr.writerow(header)
    for loc, vals in od.items():
        wr.writerow([loc] + vals)
What I end up with is this:
['Edgebrook Park, Chicago ', 'Jun-7 A League of Their Own']
['Gage Park, Chicago ', "Jun-9 It's a Mad, Mad, Mad, Mad World"]
['Jefferson Memorial Park, Chicago ', 'Jun-12 Monsters University ', 'Jul-11 Frozen ', 'Aug-8 The Blues Brothers ']
['Commercial Club Playground, Chicago ', 'Jun-12 Despicable Me 2']
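The problem is that the other columns do not show up in this case. What is the best way to include them? I would also like the MovieDate values to be one long string, like this: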
'Jun-12 Monsters University Jul-11 Frozen Aug-8 The Blues Brothers '
rather than:
'Jun-12 Monsters University ', 'Jul-11 Frozen ', 'Aug-8 The Blues Brothers '
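To make the difference concrete, here is a minimal sketch of turning the list form into the single-string form with str.join. The variable names vals and joined are only illustrative, and stripping each value first is an assumption: it avoids double spaces but also drops the trailing space shown above.

```python
# The list of MovieDate values collected for one location (copied from the output above).
vals = ['Jun-12 Monsters University ', 'Jul-11 Frozen ', 'Aug-8 The Blues Brothers ']

# Strip the stray whitespace on each value, then join with a single space.
joined = " ".join(v.strip() for v in vals)
print(joined)
```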
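Thanks, everyone, I appreciate it. I am a Python beginner.
Unfortunately, changing row[0], row[1] to row[0], row[1:] does not do what I need. I only want to accumulate the values from the second column (MovieDate), not copy all the other columns like this: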
['Jefferson Memorial Park, Chicago ', ['Jun-12 Monsters University ', 'Jefferson Memorial Park, 4822 North Long Avenue, Chicago, IL 60630, USA', '41.76083920000001', '-87.6294353'], ['Jul-11 Frozen ', 'Jefferson Memorial Park, 4822 North Long Avenue, Chicago, IL 60630, USA', '41.76083920000001', '-87.6294353'], ['Aug-8 The Blues Brothers ', 'Jefferson Memorial Park, 4822 North Long Avenue, Chicago, IL 60630, USA', '41.76083920000001', '-87.6294353']]
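【Discussion】:
-
What exactly is going wrong? Are you getting incorrect output? Are you getting an error message? We need more details.
-
Hey @user2357112, I have updated it. Sorry the question was incomplete.
-
Is rest supposed to be the rest of the whole row? Because row[1] is only the second column.
-
Yes, that is a misleading name; I will change it. row[1] is correct, and it is the only thing we want to append.
-
If you only stored row[0] and row[1], why would you expect data from any of the other columns to show up in the output?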
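Following on from that point, one possible approach is to remember the remaining columns once per location (from the first row seen) and accumulate only the MovieDate values. The sketch below is not a definitive answer: the function name combine_by_location and the in-memory sample are hypothetical, the sample rows are copied from the question instead of reading MovieDictFormatted.csv, and it assumes the address and coordinates are identical for every row of a given location.

```python
import csv
import io
from collections import OrderedDict


def combine_by_location(rows):
    """Group rows by column 0, join column 1 values, keep columns 2+ from the first row seen."""
    grouped = OrderedDict()
    for row in rows:
        loc, movie = row[0], row[1]
        if loc not in grouped:
            # First time this location appears: store the other columns only once.
            grouped[loc] = {"movies": [], "rest": row[2:]}
        grouped[loc]["movies"].append(movie.strip())
    # One output row per location: location, joined movie dates, then the other columns.
    return [[loc, " ".join(g["movies"])] + g["rest"] for loc, g in grouped.items()]


# In-memory stand-in for the input file, using the two sample rows from the question.
sample = '''Location,MovieDate,Formatted_Address,Lat,Lng
"Edgebrook Park, Chicago ",Jun-7 A League of Their Own,"Edgebrook Park, 6525 North Hiawatha Avenue, Chicago, IL 60646, USA",41.9998876,-87.7627672
"Edgebrook Park, Chicago ","Jun-9 It's a Mad, Mad, Mad, Mad World","Edgebrook Park, 6525 North Hiawatha Avenue, Chicago, IL 60646, USA",41.9998876,-87.7627672
'''

reader = csv.reader(io.StringIO(sample))
header = next(reader)
combined = combine_by_location(reader)

# Write the header and the combined rows as CSV (to a string here, to a file in practice).
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(header)
writer.writerows(combined)
print(out.getvalue())
```

With real files, the two io.StringIO objects would be replaced by the open() calls from the question. If rows for the same location can carry different addresses or coordinates, keeping only the first row's values silently discards the others, so that assumption is worth checking against the data.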
Tags: python dictionary ordereddictionary