Multi-line dictionaries: Replace the key as per a word in value

【Question title】: Multi-line dictionaries: Replace the key as per a word in value
【Posted】: 2013-11-29 16:32:44
【Question】:

I have a set of dictionaries, one per line, and I have to replace every key according to a word in its value. My dictionaries are:

  { 23: {'score': -8.639, 'char': False, 'word': 'positive'} }
  { 56: {'score': -5.6, 'char': False, 'word': 'neutral'} }
  { 89: {'score': -8.9, 'char': False, 'word': 'positive'} }
  { 34: {'score': -2.3, 'char': True, 'word': 'negative'} }

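If the word in the value part of a dictionary is 'positive', key 23 should be replaced with +1; for 'neutral', key 56 should be replaced with 0; and for 'negative', key 34 should be replaced with -1. The output would look like this: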

  { +1: {'score': -8.639, 'char': False, 'word': 'positive'} }
  { 0: {'score': -5.6, 'char': False, 'word': 'neutral'} }
  { +1: {'score': -8.9, 'char': False, 'word': 'positive'} }
  { -1: {'score': -2.3, 'char': True, 'word': 'negative'} }

Here is my code:

for n, line in enumerate(sys.stdin,1):
    d = ast.literal_eval(line)
    items = d.values()[0].items()
    if re.match("positive",d.get('sentimentoftweet')):
       n = str.replace(str(n),"+1")
    else:
       n = str.replace(str(n),"0")

It doesn't work and gives me this error:

Traceback (most recent call last):
File "./linear.py", line 33, in <module>
for thing in d:
File "./linear.py", line 22, in gen_with_appropriate_name
if re.match("positive",d.get('sentimentoftweet')):
File "/usr/lib/python2.7/re.py", line 137, in match
return _compile(pattern, flags).match(string)
TypeError: expected string or buffer

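【Comments】:

sentimentoftweet does not exist anywhere in your input data.
Does the input have to be passed via standard input? Wouldn't it be easier to store the data as an actual dictionary object and pass that to a function?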

Tags: python regex string dictionary replace

【Solution 1】:

You are passing the wrong key to re.match. dict.get returns None for a missing key; use d.get('word') instead.

>>> import re
>>> re.match('foo', None)
Traceback (most recent call last):
  File "<ipython-input-43-c75223170494>", line 1, in <module>
    re.match('foo', None)
  File "/usr/lib/python3.3/re.py", line 156, in match
    return _compile(pattern, flags).match(string)
TypeError: expected string or buffer

You can simply use == to compare strings or values:

if d.get('word') == 'positive':
    pass  # do something
elif d.get('word') == 'negative':
    pass  # do something else

Code:

import sys, ast
for line in sys.stdin:
    d = ast.literal_eval(line)
    key = list(d)[0]            #dictionary with just one key.
    if d[key]['word'] == 'positive':
        print {'+1': d[key]}
    elif d[key]['word'] == 'negative':
        print {'-1': d[key]}
    elif d[key]['word'] == 'neutral':
        print {'0': d[key]}

Output:

{'+1': {'char': False, 'score': -8.639, 'word': 'positive'}}
{'0': {'char': False, 'score': -5.6, 'word': 'neutral'}}
{'+1': {'char': False, 'score': -8.9, 'word': 'positive'}}
{'-1': {'char': True, 'score': -2.3, 'word': 'negative'}}
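As a variation on the code above, the if/elif chain can be sketched as a lookup table. This is only an illustration in Python 3 syntax: the names SENTIMENT_KEYS and relabel are hypothetical and do not appear in the original answer, and the sample lines stand in for what would normally come from sys.stdin.

```python
import ast

# Hypothetical lookup table: sentiment word -> replacement key.
SENTIMENT_KEYS = {'positive': '+1', 'neutral': '0', 'negative': '-1'}

def relabel(line):
    """Parse one single-key dict literal and swap its key for the sentiment label."""
    record = ast.literal_eval(line)
    # Each line is assumed to hold exactly one key; unpack its nested value.
    (inner,) = record.values()
    # An unknown word raises KeyError instead of being silently skipped.
    return {SENTIMENT_KEYS[inner['word']]: inner}

# Sample lines standing in for sys.stdin.
lines = [
    "{ 23: {'score': -8.639, 'char': False, 'word': 'positive'} }",
    "{ 34: {'score': -2.3, 'char': True, 'word': 'negative'} }",
]
for line in lines:
    print(relabel(line))
```

Unlike the if/elif version, an unexpected word fails loudly here, which may or may not be what you want for a 10,000-line file.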

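【Comments】:

Most of this particular OP's questions show this kind of unusual input. Do you really think it is okay to handle it like this?
@thefourtheye I'm doing research that analyzes this data, so it isn't in my hands; the data just is weird.
@kulkarni.ankita09 Why not format the data in a way that is easy to process? Then you can program efficiently.
@thefourtheye Good point, but do you have any tips on the data format? All I got is one text file with 10,000 lines that I have to analyze. The whole text file looks like what you see at the top, but with more fields.
@hcwhsa This is great. But how does print {'-1': d[key]} print the values?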

【Solution 2】:

Perhaps something like this:

#!/usr/local/cpython-3.3/bin/python

import pprint
import collections

dict_ = { 23: {'score': -8.639, 'char': False, 'word': 'positive'},
    56: {'score': -5.6, 'char': False, 'word': 'neutral'},
    89: {'score': -8.9, 'char': False, 'word': 'positive'},
    34: {'score': -2.3, 'char': True, 'word': 'negative'},
    }

new_dict = collections.defaultdict(list)
for key, value in dict_.items():
    if value['word'] == 'positive':
        key = '+1'
    elif value['word'] == 'neutral':
        key = '0'
    elif value['word'] == 'negative':
        key = '-1'
    else:
        raise ValueError('word not positive, neutral or negative')
    new_dict[key].append(value)

pprint.pprint(new_dict)
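A note on why the code above collects values into lists: once the keys become '+1', '0' and '-1', several records share the same key, and a plain dict can hold only one value per key. The following minimal sketch, using a trimmed-down version of the sample data (the variable names plain and grouped are made up for illustration), contrasts the two behaviours.

```python
from collections import defaultdict

records = {
    23: {'score': -8.639, 'word': 'positive'},
    89: {'score': -8.9, 'word': 'positive'},
}

# Plain dict: the second 'positive' record overwrites the first.
plain = {}
for value in records.values():
    plain['+1'] = value

# defaultdict(list): both records are kept under the shared key.
grouped = defaultdict(list)
for value in records.values():
    grouped['+1'].append(value)

print(len(plain))          # number of entries that survive in the plain dict
print(len(grouped['+1']))  # number of records kept under '+1'
```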

【Comments】:

【Solution 3】:

The problem is that d.get('word') does not look inside the nested dictionary.
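To illustrate the point, here is a small sketch using one record from the question. The variable names outer_lookup, inner and inner_lookup are only for illustration.

```python
d = {23: {'score': -8.639, 'char': False, 'word': 'positive'}}

# The outer dict has only the numeric key 23, so this lookup misses
# and dict.get falls back to its default of None.
outer_lookup = d.get('word')

# Step into the single nested dict first, then read 'word' from it.
inner = next(iter(d.values()))
inner_lookup = inner.get('word')

print(outer_lookup)
print(inner_lookup)
```

Passing the first result to re.match is what triggers the "expected string or buffer" TypeError shown in the question.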

【Comments】: