【Question title】: Elements getting double added to dictionary
【Posted】: 2017-07-14 14:24:21
【Question】:

I am trying to write code that takes data in this format

Example data:

[['12319825', '39274', {'pH': 8.1}], ['12319825', '39610', {'pH': 7.27}],
['12319825', '39638', {'pH': 7.87, 'Escherichia coli': 25.0}],
['12319825', '39770', {'pH': 7.47, 'Escherichia coli': 27.0}],
['12319825', '39967', {'pH': 8.36}], ['12319825', '39972', {'pH': 8.42}],
['12319825', '39987', {'pH': 8.12, 'Escherichia coli': 8.0}],
['12319825', '40014', {'pH': 8.12}], ['12319825', '40329',{'pH': 8.45}], 
['12319825', '40658', {'pH': 8.35, 'Escherichia coli': 6.3}],
['12319825', '40686', {'pH': 8.17}], 
['12319825', '40714', {'pH': 8.13}], ['12319825', '40732', {'pH': 8.4}],
['12319825', '40809', {'pH': 8.42}], 
['12319825', '40827', {'pH': 8.46}], 
['12319825', '41043', {'pH': 8.42, 'Escherichia coli': 170.0}],
['12319825', '41071', {'pH': 8.24, 'Escherichia coli': 92.0}],
['12319825', '41080', {'pH': 8.4}], 
['12319825', '41101', {'pH': 8.36, 'Escherichia coli': 560.0}], ['12319825', '41134', {'pH': 8.67}]]

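and returns a dictionary where the keys are the pollutants (in this case pH or Escherichia coli) and each value is what I call a DateList. A DateList is a list of tuples, one per data point, in the form (date, T/F). The boolean is True if the value falls outside a given range or exceeds a given value (depending on the type of criterion).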

rangeCriteria={'pH':(5.0,9.0)}
convCriteria={'Escherichia coli':320}

This is the code I am running right now:

def testLocationForConv(DataFromLocation):
    # Checks whether a pollutant is outside of acceptable values.
    # A dictionary is created where each pollutant has a corresponding list of tuples
    # with the date and a boolean saying whether it is inside or outside
    # the criteria (True if out, False if in).
    # It handles criteria that are a minimum or a range rather than a
    # maximum.

    dateList=[]
    impairedList=[]
    overDict=dict()
    for date in DataFromLocation:
        for pollutant in date[2]:
            if pollutant in conventionalCriteriaList:
                dateList.append((date[1],date[2][pollutant]>convCriteria[pollutant]))
                overDict[pollutant]=dateList
            if pollutant in rangeCriteria:
                overDict[pollutant]=dateList
                dateList.append((date[1], (not (float(date[2][pollutant])>rangeCriteria[pollutant][0] and float(date[2][pollutant])<rangeCriteria[pollutant][1])) ))
            #if pollutant in minCriteriaList:
            #    overDict[pollutant]=dateList
            #    dateList.append((date[1],date[2][pollutant]<minCriteria[pollutant])

            else:
                pass
    print(overDict)

As it stands, the data points for both pollutants are added to the dictionary together, which gives the following result.

{'pH': [('39274', False), ('39610', False), ('39638', False), 
('39638', False), ('39770', False), ('39770', False), ('39967', False),
('39972', False), ('39987', False), ('39987', False), ('40014', False),
('40329', False), ('40658', False), ('40658', False), ('40686', False),
('40714', False), ('40732', False), ('40809', False), ('40827', False),
('41043', False), ('41043', False), ('41071', False), ('41071', False),
('41080', False), ('41101', False), ('41101', True), ('41134', False)], 
'Escherichia coli': [('39274', False), ('39610', False), ('39638', False), 
('39638', False), ('39770', False), ('39770', False), ('39967', False),
('39972', False), ('39987', False), ('39987', False), ('40014', False),
('40329', False), ('40658', False), ('40658', False), ('40686', False),
('40714', False), ('40732', False), ('40809', False), ('40827', False),
('41043', False), ('41043', False), ('41071', False), ('41071', False),
('41080', False), ('41101', False), ('41101', True), ('41134', False)]}

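Now that I have typed out this question, I realize the problem is that I am iterating over the dates and then over the pollutants, whereas what I want is a list of dates that is kept separately for each pollutant. How would I build such a list and add it to the dictionary?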

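【Comments】:

  • After rereading your post twice I roughly understand what you are asking, but it would be much simpler, and much less of a strain on me, if you just posted an example of the output you want. You also did not post your complete code: for example, what is conventionalCriteriaList?
  • So the first item in each list is always discarded?
  • Also, doing overDict[pollutant]=dateList every time makes no sense... it is exactly the same list. That is why the values in your dictionary are identical...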
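
The point made in the last comment can be reproduced in a few lines. This is only a minimal sketch, not the asker's code: the names `shared` and `separate` are made up for illustration. It shows that assigning the same list object to two keys makes every append visible under both keys, and that giving each key its own list avoids this.

```python
# One list object stored under two different keys.
shared = []
over_dict = {}
over_dict['pH'] = shared
over_dict['Escherichia coli'] = shared

# Appending once changes what both keys show, because assignment
# stores a reference to the list, not a copy of it.
shared.append(('39638', False))
same_object = over_dict['pH'] is over_dict['Escherichia coli']
print(same_object)
print(over_dict)

# Giving each key its own list keeps the entries apart.
separate = {'pH': [], 'Escherichia coli': []}
separate['pH'].append(('39638', False))
print(separate)
```
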

Tags: python list loops dictionary boolean


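【Solution 1】:

I would take a step back and reconsider your approach. You are making things harder for yourself than they need to be. First, the data: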

In [3]: data = [['12319825', '39274', {'pH': 8.1}], ['12319825', '39610', {'pH': 7.27}],
   ...: ['12319825', '39638', {'pH': 7.87, 'Escherichia coli': 25.0}],
   ...: ['12319825', '39770', {'pH': 7.47, 'Escherichia coli': 27.0}],
   ...: ['12319825', '39967', {'pH': 8.36}], ['12319825', '39972', {'pH': 8.42}],
   ...: ['12319825', '39987', {'pH': 8.12, 'Escherichia coli': 8.0}],
   ...: ['12319825', '40014', {'pH': 8.12}], ['12319825', '40329', {'pH': 8.45}],
   ...: ['12319825', '40658', {'pH': 8.35, 'Escherichia coli': 6.3}],
   ...: ['12319825', '40686', {'pH': 8.17}],
   ...: ['12319825', '40714', {'pH': 8.13}], ['12319825', '40732', {'pH': 8.4}],
   ...: ['12319825', '40809', {'pH': 8.42}],
   ...: ['12319825', '40827', {'pH': 8.46}],
   ...: ['12319825', '41043', {'pH': 8.42, 'Escherichia coli': 170.0}],
   ...: ['12319825', '41071', {'pH': 8.24, 'Escherichia coli': 92.0}],
   ...: ['12319825', '41080', {'pH': 8.4}],
   ...: ['12319825', '41101', {'pH': 8.36, 'Escherichia coli': 560.0}],
   ...: ['12319825', '41134', {'pH': 8.67}]]

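When your boolean conditions get a little complex, you should give them their own functions, if only for readability. Here I would go one step further and also put them in a dictionary keyed by the corresponding pollutant, which will make your life much easier!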

In [4]: def ecoli_threshold(value): return value > 320

In [5]: def ph_range(value): return not (5 < value < 9)

In [6]: test = {'Escherichia coli': ecoli_threshold, 'pH':ph_range}

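The key issue that tripped you up is that you used a single list when you really need two. Initialize your dictionary with two empty lists, since you know you will be appending to them.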

In [7]: over_dict = {'Escherichia coli':[], 'pH':[]}

Then, iterate over the data:

In [8]: for entry in data:
    ...:     for pollutant, value in entry[2].items():
    ...:         over_dict[pollutant].append((entry[1], test[pollutant](value)))
    ...:

And finally, the output:

In [9]: over_dict
Out[9]:
{'Escherichia coli': [('39638', False),
  ('39770', False),
  ('39987', False),
  ('40658', False),
  ('41043', False),
  ('41071', False),
  ('41101', True)],
 'pH': [('39274', False),
  ('39610', False),
  ('39638', False),
  ('39770', False),
  ('39967', False),
  ('39972', False),
  ('39987', False),
  ('40014', False),
  ('40329', False),
  ('40658', False),
  ('40686', False),
  ('40714', False),
  ('40732', False),
  ('40809', False),
  ('40827', False),
  ('41043', False),
  ('41071', False),
  ('41080', False),
  ('41101', False),
  ('41134', False)]}

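【Comments】:

  • Thank you so much for the feedback! The complication is that this code is going to look at many more pollutants, and not every location has every pollutant, so it is hard to add the lists by hand, but I think I can work out an approach using these comments! Thanks!
  • @AmeliaMcClure Then your best option is to use defaultdict, and extending the rest of the approach above should be relatively straightforward.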
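
One way the defaultdict suggestion might look is sketched below. This is an illustration under assumptions, not the answerer's code: the helper names (`make_range_test`, `make_max_test`, `flag_exceedances`) and the `max_criteria` dictionary are hypothetical, and the criteria values are the ones quoted in the question. `defaultdict(list)` creates an empty list the first time a pollutant is seen, so no list has to be added by hand, and pollutants without a criterion are skipped.

```python
from collections import defaultdict

# Criteria copied from the question; all helper names are hypothetical.
range_criteria = {'pH': (5.0, 9.0)}
max_criteria = {'Escherichia coli': 320}


def make_range_test(low, high):
    """Return a test that is True when a value lies outside (low, high)."""
    return lambda value: not (low < value < high)


def make_max_test(limit):
    """Return a test that is True when a value exceeds limit."""
    return lambda value: value > limit


# Build one test function per pollutant from the criteria dictionaries.
tests = {}
for name, (low, high) in range_criteria.items():
    tests[name] = make_range_test(low, high)
for name, limit in max_criteria.items():
    tests[name] = make_max_test(limit)


def flag_exceedances(data, tests):
    """Map each pollutant to its own list of (date, out_of_criteria) tuples."""
    over_dict = defaultdict(list)  # a fresh list per new pollutant key
    for _location, date, readings in data:
        for pollutant, value in readings.items():
            if pollutant in tests:  # ignore pollutants with no criterion
                over_dict[pollutant].append((date, tests[pollutant](value)))
    return dict(over_dict)


sample = [
    ['12319825', '39274', {'pH': 8.1}],
    ['12319825', '39638', {'pH': 7.87, 'Escherichia coli': 25.0}],
    ['12319825', '41101', {'pH': 8.36, 'Escherichia coli': 560.0}],
]
result = flag_exceedances(sample, tests)
print(result)
```

Supporting another pollutant would then only mean adding an entry to one of the criteria dictionaries; the loop itself stays unchanged.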