【Question title】: How to change json encoding behaviour for serializable python object?
【Posted】: 2013-04-30 15:14:10
【Question description】:

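It is easy to change the format of an object that is not JSON serializable, e.g. datetime.datetime.

For debugging purposes, my requirement is to alter the way some custom objects that extend base types such as dict and list get serialized to JSON. Code: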

import datetime
import json

def json_debug_handler(obj):
    print("object received:")
    print(type(obj))
    print("\n\n")
    if  isinstance(obj, datetime.datetime):
        return obj.isoformat()
    elif isinstance(obj,mDict):
        return {'orig':obj , 'attrs': vars(obj)}
    elif isinstance(obj,mList):
        return {'orig':obj, 'attrs': vars(obj)}
    else:
        return None


class mDict(dict):
    pass


class mList(list):
    pass


def test_debug_json():
    games = mList(['mario','contra','tetris'])
    games.src = 'console'
    scores = mDict({'dp':10,'pk':45})
    scores.processed = "unprocessed"
    test_json = { 'games' : games , 'scores' : scores , 'date': datetime.datetime.now() }
    print(json.dumps(test_json,default=json_debug_handler))

if __name__ == '__main__':
    test_debug_json()

Demo: http://ideone.com/hQJnLy

Output:

{"date": "2013-05-07T01:03:13.098727", "games": ["mario", "contra", "tetris"], "scores": {"pk": 45, "dp": 10}}

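Desired output:

{"date": "2013-05-07T01:03:13.098727", "games": {"orig": ["mario", "contra", "tetris"], "attrs": {"src": "console"}}, "scores": {"orig": {"pk": 45, "dp": 10}, "attrs": {"processed": "unprocessed"}}}

Does the default handler not work for serializable objects? If not, how can I override this without adding toJSON methods to the classes I am extending?

Also, there is this version of the JSON encoder, which does not work either: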

class JsonDebugEncoder(json.JSONEncoder):
    def default(self,obj):
        if  isinstance(obj, datetime.datetime):
            return obj.isoformat()
        elif isinstance(obj,mDict):
            return {'orig':obj , 'attrs': vars(obj)}
        elif isinstance(obj,mList):
            return {'orig':obj, 'attrs': vars(obj)}
        else:
            return json.JSONEncoder.default(self, obj)

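If there is a hack using pickle, __getstate__ and __setstate__, and then calling json.dumps on the pickle.loads object, I am open to that as well. I tried it, but it did not work.

【Comments】:

  • Using a proper class with a __getstate__() method should work. More: stackoverflow.com/q/12627949/139010
  • @MartijnPieters That amounts to writing a complex custom encoder. I was hoping there would be a simpler way to return a different representation of an object to encoders like json or pickle?
  • @DhruvPathak: pickle supports state hooks (__getstate__ and companions), but json does not support any such helpful methods.
  • @DhruvPathak It sounds like you want to manipulate the data like a JavaScript object before serializing it. Maybe something like jsobject might work for you. It looks like it bundles a JSON encoder and decoder.

Tags: python json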


【Solution 1】:

A simplification for Python 3 (tested only on 3.9):

from json.encoder import (_make_iterencode, JSONEncoder,
                          encode_basestring_ascii, INFINITY,
                          encode_basestring)

class CustomObjectEncoder(JSONEncoder):

    def iterencode(self, o, _one_shot=False):
        """Encode the given object and yield each string
        representation as available.

        For example::

            for chunk in JSONEncoder().iterencode(bigobject):
                mysocket.write(chunk)
                
        Change from json.encoder.JSONEncoder.iterencode is setting
        _one_shot=False and isinstance=self.isinstance
        in call to `_make_iterencode`.
        And not using `c_make_encoder`.

        """
        if self.check_circular:
            markers = {}
        else:
            markers = None
        if self.ensure_ascii:
            _encoder = encode_basestring_ascii
        else:
            _encoder = encode_basestring

        def floatstr(o, allow_nan=self.allow_nan,
                _repr=float.__repr__, _inf=INFINITY, _neginf=-INFINITY):
            # Check for specials.  Note that this type of test is processor
            # and/or platform-specific, so do tests which don't depend on the
            # internals.

            if o != o:
                text = 'NaN'
            elif o == _inf:
                text = 'Infinity'
            elif o == _neginf:
                text = '-Infinity'
            else:
                return _repr(o)

            if not allow_nan:
                raise ValueError(
                    "Out of range float values are not JSON compliant: " +
                    repr(o))

            return text

        _iterencode = _make_iterencode(
                markers, self.default, _encoder, self.indent, floatstr,
                self.key_separator, self.item_separator, self.sort_keys,
                self.skipkeys, _one_shot=False, isinstance=self.isinstance)
        return _iterencode(o, 0)

Example subclass:

import datetime

from rdflib.term import Literal, BNode

class RDFTermEncoder(CustomObjectEncoder):
    def isinstance(self, o, cls):
        if isinstance(o, (Literal, BNode)):
            return False
        return isinstance(o, cls)
    def default(self, o):
        if isinstance(o, Literal):
            rv = {"value": o.value}
            if o.datatype is not None:
                rv["datatype"] = o.datatype
            if o.language is not None:
                rv["lang"] = o.language
            return rv
        if isinstance(o, BNode):
            return "http://localhost/bnode/" + str(o)
        if isinstance(o, datetime.datetime):
            return o.isoformat()
        if isinstance(o, datetime.date):
            return str(o)
        # Let the base class default method raise the TypeError
        return super().default(o)

I have just used it successfully for my own work:

db_json = json.loads(json.dumps(db_custom, cls=RDFTermEncoder))

Thanks, everyone!

【Discussion】:

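    【Solution 2】:

    I tried changing the default resolver priority and the default iterator output to achieve what you are after.

    1. Change the default resolver priority so that it runs before all of the standard type checks:

      Inherit from json.JSONEncoder and override the iterencode() method.

      Every value should be wrapped in the ValueWrapper type, so that the values are not resolved by the default standard resolvers.

    2. Change the default iterator output:

      Implement three custom wrapper classes: ValueWrapper, ListWrapper and DictWrapper. ListWrapper implements __iter__(), and DictWrapper implements __iter__(), items() and iteritems().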

    import datetime
    import json
    
    class DebugJsonEncoder(json.JSONEncoder):
        def iterencode(self, o, _one_shot=False):
            default_resolver = self.default
            # Rewrites the default resolve, self.default(), with the custom resolver.
            # It will process the Wrapper classes
            def _resolve(o):
                if isinstance(o, ValueWrapper):
                    # Call the custom resolver before the others. _make_iterencode()
                    # only calls the custom resolver after all of the standard type
                    # checks have failed, but we want the custom resolver to run
                    # before the standard checks.
                    # see https://github.com/python/cpython/blob/2.7/Lib/json/encoder.py#L442
                    result = default_resolver(o.data)
                    if (o.data is not None) and (result is not None):
                        return result
                    elif isinstance(o.data, (list, tuple)):
                        return ListWrapper(o.data)
                    elif isinstance(o.data, dict):
                        return DictWrapper(o.data)
                    else:
                        return o.data
                else:
                    return default_resolver(o)
    
            # re-assign the default resolver self.default with custom resolver.
            # see https://github.com/python/cpython/blob/2.7/Lib/json/encoder.py#L161
            self.default = _resolve
            # The input value must be wrapped in ValueWrapper so that it is not
            # resolved by the standard resolvers.
            # The last argument, _one_shot, must be False because we want to
            # encode with _make_iterencode().
            # see https://github.com/python/cpython/blob/2.7/Lib/json/encoder.py#L259
            return json.JSONEncoder.iterencode(self, _resolve(ValueWrapper(o)), False)
    
    
    class ValueWrapper():
        """
        a wrapper wrapped the given object
        """
    
        def __init__(self, o):
            self.data = o
    
    class ListWrapper(ValueWrapper, list):
        """
        a wrapper wrapped the given list
        """
    
        def __init__(self, o):
            ValueWrapper.__init__(self, o)
    
        # see https://github.com/python/cpython/blob/2.7/Lib/json/encoder.py#L307
        def __iter__(self):
            for chunk in self.data:
                yield ValueWrapper(chunk)
    
    class DictWrapper(ValueWrapper, dict):
        """
        a wrapper wrapped the given dict
        """
    
        def __init__(self, d):
            dict.__init__(self, d)
    
        def __iter__(self):
            for key, value in dict.items(self):
                yield key, ValueWrapper(value)
    
        # see https://github.com/python/cpython/blob/2.7/Lib/json/encoder.py#L361
        def items(self):
            for key, value in dict.items(self):
                yield key, ValueWrapper(value)
    
        # see https://github.com/python/cpython/blob/2.7/Lib/json/encoder.py#L363
        def iteritems(self):
            for key, value in dict.iteritems(self):
                yield key, ValueWrapper(value)
    
    
    def json_debug_handler(obj):
        print("object received:")
        print(type(obj))
        print("\n\n")
        if  isinstance(obj, datetime.datetime):
            return obj.isoformat()
        elif isinstance(obj,mDict):
            return {'orig':obj , 'attrs': vars(obj)}
        elif isinstance(obj,mList):
            return {'orig':obj, 'attrs': vars(obj)}
        else:
            return None
    
    
    class mDict(dict):
        pass
    
    class mList(list):
        pass
    
    
    def test_debug_json():
        games = mList(['mario','contra','tetris'])
        games.src = 'console'
        scores = mDict({'dp':10,'pk':45})
        scores.processed = "unprocessed"
        test_json = { 'games' : games , 'scores' : scores , 'date': datetime.datetime.now(), 'default': None}
        print(json.dumps(test_json,cls=DebugJsonEncoder,default=json_debug_handler))
    
    if __name__ == '__main__':
        test_debug_json()
    

    【Discussion】:

      【Solution 3】:

      Could we just preprocess test_json so that it suits your requirement? Manipulating a Python dict is easier than writing an otherwise useless encoder.

      import datetime
      import json
      class mDict(dict):
          pass
      
      class mList(list):
          pass
      
      def prepare(obj):
          if  isinstance(obj, datetime.datetime):
              return obj.isoformat()
          elif isinstance(obj, mDict):
              return {'orig':obj , 'attrs': vars(obj)}
          elif isinstance(obj, mList):
              return {'orig':obj, 'attrs': vars(obj)}
          else:
              return obj
      def preprocessor(toJson):
          ret ={}
          for key, value in toJson.items():
              ret[key] = prepare(value)
          return ret
      if __name__ == '__main__':
          def test_debug_json():
              games = mList(['mario','contra','tetris'])
              games.src = 'console'
              scores = mDict({'dp':10,'pk':45})
              scores.processed = "unprocessed"
              test_json = { 'games' : games, 'scores' : scores , 'date': datetime.datetime.now() }
              print(json.dumps(preprocessor(test_json)))
          test_debug_json()
      

      【Discussion】:

        【Solution 4】:

        If you only need serialization and not deserialization, you can process the object before sending it to json.dumps. See the example below.

        import datetime
        import json
        
        
        def is_inherited_from(obj, objtype):
            return isinstance(obj, objtype) and not type(obj).__mro__[0] == objtype
        
        
        def process_object(data):
            if isinstance(data, list):
                if is_inherited_from(data, list):
                    return process_object({"orig": list(data), "attrs": vars(data)})
                new_data = []
                for d in data:
                    new_data.append(process_object(d))
            elif isinstance(data, tuple):
                if is_inherited_from(data, tuple):
                    return process_object({"orig": tuple(data), "attrs": vars(data)})
                new_data = []
                for d in data:
                    new_data.append(process_object(d))
                return tuple(new_data)
            elif isinstance(data, dict):
                if is_inherited_from(data, dict):
                    return process_object({"orig": list(data), "attrs": vars(data)})
                new_data = {}
                for k, v in data.items():
                    new_data[k] = process_object(v)
            else:
                return data
            return new_data
        
        
        def json_debug_handler(obj):
            print("object received:")
            print("\n\n")
            if isinstance(obj, datetime.datetime):
                return obj.isoformat()
        
        
        class mDict(dict):
            pass
        
        
        class mList(list):
            pass
        
        
        def test_debug_json():
            games = mList(['mario', 'contra', 'tetris'])
            games.src = 'console'
            scores = mDict({'dp': 10, 'pk': 45})
            scores.processed = "unprocessed"
            test_json = {'games': games, 'scores': scores, 'date': datetime.datetime.now()}
            new_object = process_object(test_json)
            print(json.dumps(new_object, default=json_debug_handler))
        
        
        if __name__ == '__main__':
            test_debug_json()
        

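        The output of this is:

        {"games": {"orig": ["mario", "contra", "tetris"], "attrs": {"src": "console"}}, "scores": {"orig": ["dp", "pk"], "attrs": {"processed": "unprocessed"}}, "date": "2018-01-24T12:59:36.581689"}

        It is also possible to override JSONEncoder, but because it uses nested functions this gets complex and requires the technique discussed in the question below:

        Can you patch *just* a nested function with closure, or must the whole outer function be repeated?

        Since you want to keep things simple, I would not suggest going down that route.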
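For comparison, here is a minimal Python 3 sketch of the same preprocessing idea that keeps the full mapping under "orig". The process_object code above passes a dict subclass through list(), which keeps only its keys. The helper name wrap_subclasses is hypothetical, and the sketch assumes any extra attributes live in the instance __dict__.

```python
import json


class mDict(dict):
    pass


class mList(list):
    pass


def wrap_subclasses(data):
    """Recursively replace list/tuple/dict subclass instances with
    {'orig': ..., 'attrs': ...} so json.dumps only sees plain containers."""
    if isinstance(data, dict):
        plain = {key: wrap_subclasses(value) for key, value in data.items()}
        if type(data) is not dict:
            # Subclass: keep the mapping itself plus any instance attributes.
            attrs = wrap_subclasses(dict(getattr(data, '__dict__', {})))
            return {'orig': plain, 'attrs': attrs}
        return plain
    if isinstance(data, (list, tuple)):
        plain = [wrap_subclasses(value) for value in data]
        if type(data) not in (list, tuple):
            attrs = wrap_subclasses(dict(getattr(data, '__dict__', {})))
            return {'orig': plain, 'attrs': attrs}
        return plain
    return data


games = mList(['mario', 'contra', 'tetris'])
games.src = 'console'
scores = mDict({'dp': 10, 'pk': 45})
scores.processed = 'unprocessed'

result = json.dumps(wrap_subclasses({'games': games, 'scores': scores}))
print(result)
```

Values that json cannot encode on its own (such as datetime) are left untouched here, so they would still need a default handler.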

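        【Discussion】:

          【Solution 5】:

          The default function is only called when the node being dumped is not natively serializable, and your mDict classes serialize as they are. Here is a small demo that shows when default is called and when it is not: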

          import json
          
          def serializer(obj):
              print 'serializer called'
              return str(obj)
          
          class mDict(dict):
              pass
          
          class mSet(set):
              pass
          
          d = mDict(dict(a=1))
          print json.dumps(d, default=serializer)
          
          s = mSet({1, 2, 3,})
          print json.dumps(s, default=serializer)
          

          And the output:

          {"a": 1}
          serializer called
          "mSet([1, 2, 3])"
          

          Note that sets are not natively serializable, but dicts are.

          Since your m___ classes are serializable, your handler is never called.
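The demo above is written for Python 2. A rough Python 3 equivalent, given as a sketch, records which objects actually reach the default hook. Returning a sorted list for sets is an arbitrary choice made only for this illustration.

```python
import json

calls = []  # names of the types that were handed to the default hook


def serializer(obj):
    calls.append(type(obj).__name__)
    return sorted(obj)  # illustrative choice: emit a set as a sorted list


class mDict(dict):
    pass


class mSet(set):
    pass


dict_json = json.dumps(mDict(a=1), default=serializer)
set_json = json.dumps(mSet({3, 1, 2}), default=serializer)

print(dict_json)  # {"a": 1}
print(set_json)   # [1, 2, 3]
print(calls)      # ['mSet']
```

Only the set subclass reaches serializer; the dict subclass is encoded by the built-in dict path without the hook being consulted.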

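          Update #1 -----

          You could change the JSON encoder code. The details of how to do this depend on which JSON implementation you use. For example, in simplejson the relevant code is the following, in encoder.py: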

          def _iterencode(o, _current_indent_level):
              ...
                  for_json = _for_json and getattr(o, 'for_json', None)
                  if for_json and callable(for_json):
                      ...
                  elif isinstance(o, list):
                      ...
                  else:
                      _asdict = _namedtuple_as_object and getattr(o, '_asdict', None)
                      if _asdict and callable(_asdict):
                          for chunk in _iterencode_dict(_asdict(),
                                  _current_indent_level):
                              yield chunk
                      elif (_tuple_as_array and isinstance(o, tuple)):
                          ...
                      elif isinstance(o, dict):
                          ...
                      elif _use_decimal and isinstance(o, Decimal):
                          ...
                      else:
                          ...
                          o = _default(o)
                          for chunk in _iterencode(o, _current_indent_level):
                              yield chunk
                          ...
          

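          In other words, the default behavior is only invoked when the node being encoded is not one of the recognized base types. You can override this in one of several ways:

          1 -- Subclass JSONEncoder as you have done above, but add a parameter to its initializer that specifies the function to use in place of the standard _make_iterencode. In that replacement, add a test that calls default for classes that meet your criteria. This is a clean approach since you are not changing the JSON module, but you would be repeating a lot of code from the original _make_iterencode. (Other variations on this approach include monkeypatching _make_iterencode or its sub-function _iterencode_dict.)

          2 -- Alter the JSON module source, and use the __debug__ constant to change behavior: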

          def _iterencode(o, _current_indent_level):
              ...
                  for_json = _for_json and getattr(o, 'for_json', None)
                  if for_json and callable(for_json):
                      ...
                  elif isinstance(o, list):
                      ...
                  ## added code below
                  elif __debug__:
                      o = _default(o)
                      for chunk in _iterencode(o, _current_indent_level):
                          yield chunk
                  ## added code above
                  else:
                      ...
          

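          Ideally the JSONEncoder class would provide a parameter to specify "use default for all types", but it does not. The above is a simple one-time change that does what you are looking for.

          【Discussion】:

          • Sir, your answer is my question. I have already mentioned this; what I am asking is how to change the behaviour for serializable objects. The code is linked in the question: ideone.com/hQJnLy
          【Solution 6】:

          FastTurtle's answer might be a much cleaner solution.

          Here is something close to what you want, based on the technique explained in my own question/answer: Overriding nested JSON encoding of inherited default supported objects like dict, list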

          import json
          import datetime
          
          
          class mDict(dict):
              pass
          
          
          class mList(list):
              pass
          
          
          class JsonDebugEncoder(json.JSONEncoder):
              def _iterencode(self, o, markers=None):
                  if isinstance(o, mDict):
                      yield '{"__mDict__": '
                      # Encode dictionary
                      yield '{"orig": '
                      for chunk in super(JsonDebugEncoder, self)._iterencode(o, markers):
                          yield chunk
                      yield ', '
                      # / End of Encode dictionary
                      # Encode attributes
                      yield '"attr": '
                      for key, value in o.__dict__.iteritems():
                          yield '{"' + key + '": '
                          for chunk in super(JsonDebugEncoder, self)._iterencode(value, markers):
                              yield chunk
                          yield '}'
                      yield '}'
                      # / End of Encode attributes
                      yield '}'
                  elif isinstance(o, mList):
                      yield '{"__mList__": '
                      # Encode list
                      yield '{"orig": '
                      for chunk in super(JsonDebugEncoder, self)._iterencode(o, markers):
                          yield chunk
                      yield ', '
                      # / End of Encode list
                      # Encode attributes
                      yield '"attr": '
                      for key, value in o.__dict__.iteritems():
                          yield '{"' + key + '": '
                          for chunk in super(JsonDebugEncoder, self)._iterencode(value, markers):
                              yield chunk
                          yield '}'
                      yield '}'
                      # / End of Encode attributes
                      yield '}'
                  else:
                      for chunk in super(JsonDebugEncoder, self)._iterencode(o, markers=markers):
                          yield chunk
          
              def default(self, obj):
                  if isinstance(obj, datetime.datetime):
                      return obj.isoformat()
          
          
          class JsonDebugDecoder(json.JSONDecoder):
              def decode(self, s):
                  obj = super(JsonDebugDecoder, self).decode(s)
                  obj = self.recursiveObjectDecode(obj)
                  return obj
          
              def recursiveObjectDecode(self, obj):
                  if isinstance(obj, dict):
                      decoders = [("__mList__", self.mListDecode),
                                  ("__mDict__", self.mDictDecode)]
                      for placeholder, decoder in decoders:
                          if placeholder in obj:                  # We assume it's supposed to be converted
                              return decoder(obj[placeholder])
                          else:
                              for k in obj:
                                  obj[k] = self.recursiveObjectDecode(obj[k])
                  elif isinstance(obj, list):
                      for x in range(len(obj)):
                          obj[x] = self.recursiveObjectDecode(obj[x])
                  return obj
          
              def mDictDecode(self, o):
                  res = mDict()
                  for key, value in o['orig'].iteritems():
                      res[key] = self.recursiveObjectDecode(value)
                  for key, value in o['attr'].iteritems():
                      res.__dict__[key] = self.recursiveObjectDecode(value)
                  return res
          
              def mListDecode(self, o):
                  res = mList()
                  for value in o['orig']:
                      res.append(self.recursiveObjectDecode(value))
                  for key, value in o['attr'].iteritems():
                      res.__dict__[key] = self.recursiveObjectDecode(value)
                  return res
          
          
          def test_debug_json():
              games = mList(['mario','contra','tetris'])
              games.src = 'console'
              scores = mDict({'dp':10,'pk':45})
              scores.processed = "unprocessed"
              test_json = { 'games' : games, 'scores' : scores ,'date': datetime.datetime.now() }
              jsonDump = json.dumps(test_json, cls=JsonDebugEncoder)
              print jsonDump
              test_pyObject = json.loads(jsonDump, cls=JsonDebugDecoder)
              print test_pyObject
          
          if __name__ == '__main__':
              test_debug_json()
          

          This results in:

          {"date": "2013-05-06T22:28:08.967000", "games": {"__mList__": {"orig": ["mario", "contra", "tetris"], "attr": {"src": "console"}}}, "scores": {"__mDict__": {"orig": {"pk": 45, "dp": 10}, "attr": {"processed": "unprocessed"}}}}
          

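          This way you can encode it and decode it back to the Python object it came from.

          EDIT:

          Here is a version that actually encodes it to the output you wanted and can decode it as well. Whenever a dictionary contains both 'orig' and 'attr', it checks whether 'orig' holds a dictionary or a list, and if so converts the object back to an mDict or an mList respectively.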

          import json
          import datetime
          
          
          class mDict(dict):
              pass
          
          
          class mList(list):
              pass
          
          
          class JsonDebugEncoder(json.JSONEncoder):
              def _iterencode(self, o, markers=None):
                  if isinstance(o, mDict):    # Encode mDict
                      yield '{"orig": '
                      for chunk in super(JsonDebugEncoder, self)._iterencode(o, markers):
                          yield chunk
                      yield ', '
                      yield '"attr": '
                      for key, value in o.__dict__.iteritems():
                          yield '{"' + key + '": '
                          for chunk in super(JsonDebugEncoder, self)._iterencode(value, markers):
                              yield chunk
                          yield '}'
                      yield '}'
                      # / End of Encode attributes
                  elif isinstance(o, mList):    # Encode mList
                      yield '{"orig": '
                      for chunk in super(JsonDebugEncoder, self)._iterencode(o, markers):
                          yield chunk
                      yield ', '
                      yield '"attr": '
                      for key, value in o.__dict__.iteritems():
                          yield '{"' + key + '": '
                          for chunk in super(JsonDebugEncoder, self)._iterencode(value, markers):
                              yield chunk
                          yield '}'
                      yield '}'
                  else:
                      for chunk in super(JsonDebugEncoder, self)._iterencode(o, markers=markers):
                          yield chunk
          
              def default(self, obj):
                  if isinstance(obj, datetime.datetime):    # Encode datetime
                      return obj.isoformat()
          
          
          class JsonDebugDecoder(json.JSONDecoder):
              def decode(self, s):
                  obj = super(JsonDebugDecoder, self).decode(s)
                  obj = self.recursiveObjectDecode(obj)
                  return obj
          
              def recursiveObjectDecode(self, obj):
                  if isinstance(obj, dict):
                      if "orig" in obj and "attr" in obj and isinstance(obj["orig"], list):
                          return self.mListDecode(obj)
                      elif "orig" in obj and "attr" in obj and isinstance(obj['orig'], dict):
                          return self.mDictDecode(obj)
                      else:
                          for k in obj:
                              obj[k] = self.recursiveObjectDecode(obj[k])
                  elif isinstance(obj, list):
                      for x in range(len(obj)):
                          obj[x] = self.recursiveObjectDecode(obj[x])
                  return obj
          
              def mDictDecode(self, o):
                  res = mDict()
                  for key, value in o['orig'].iteritems():
                      res[key] = self.recursiveObjectDecode(value)
                  for key, value in o['attr'].iteritems():
                      res.__dict__[key] = self.recursiveObjectDecode(value)
                  return res
          
              def mListDecode(self, o):
                  res = mList()
                  for value in o['orig']:
                      res.append(self.recursiveObjectDecode(value))
                  for key, value in o['attr'].iteritems():
                      res.__dict__[key] = self.recursiveObjectDecode(value)
                  return res
          
          
          def test_debug_json():
              games = mList(['mario','contra','tetris'])
              games.src = 'console'
              scores = mDict({'dp':10,'pk':45})
              scores.processed = "unprocessed"
              test_json = { 'games' : games, 'scores' : scores ,'date': datetime.datetime.now() }
              jsonDump = json.dumps(test_json, cls=JsonDebugEncoder)
              print jsonDump
              test_pyObject = json.loads(jsonDump, cls=JsonDebugDecoder)
              print test_pyObject
              print test_pyObject['games'].src
          
          if __name__ == '__main__':
              test_debug_json()
          

          Here is some more information about the output:

          # Encoded
          {"date": "2013-05-06T22:41:35.498000", "games": {"orig": ["mario", "contra", "tetris"], "attr": {"src": "console"}}, "scores": {"orig": {"pk": 45, "dp": 10}, "attr": {"processed": "unprocessed"}}}
          
          # Decoded ('games' contains the mList with the src attribute and 'scores' contains the mDict processed attribute)
          # Note that printing the python objects doesn't directly show the processed and src attributes, as seen below.
          {u'date': u'2013-05-06T22:41:35.498000', u'games': [u'mario', u'contra', u'tetris'], u'scores': {u'pk': 45, u'dp': 10}}
          

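          Sorry for any bad naming conventions, this was a quick setup. ;)

          Note: the datetime does not get decoded back to its Python representation. That could be implemented by checking for any dict key called 'date' that contains a valid string representation of a datetime.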
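A minimal Python 3 sketch of that last idea, using the standard object_hook parameter of json.loads rather than the decoder class above. It assumes the key is literally named 'date' and that the value was produced by isoformat(). The function name decode_dates is hypothetical.

```python
import datetime
import json


def decode_dates(obj):
    """object_hook: turn the string stored under 'date' back into a datetime.

    Strings that do not parse as ISO 8601 are left untouched.
    """
    value = obj.get('date')
    if isinstance(value, str):
        try:
            obj['date'] = datetime.datetime.fromisoformat(value)
        except ValueError:
            pass
    return obj


encoded = '{"date": "2013-05-06T22:41:35.498000", "games": ["mario"]}'
decoded = json.loads(encoded, object_hook=decode_dates)
print(repr(decoded['date']))
```

datetime.datetime.fromisoformat requires Python 3.7 or later. On older versions datetime.datetime.strptime with an explicit format string would be needed instead.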

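          【Discussion】:

          • Thanks. But I still feel there should be some simpler trick that we are not aware of.
          【Solution 7】:

          As others have already pointed out, the default handler only gets called for values that are not one of the recognised types. My suggested solution to this problem is to preprocess the object you want to serialize, recursing over lists, tuples and dictionaries, but wrapping every other value in a custom class.

          Something like this: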

          def debug(obj):
              class Debug:
                  def __init__(self,obj):
                      self.originalObject = obj
              if obj.__class__ == list:
                  return [debug(item) for item in obj]
              elif obj.__class__ == tuple:
                  return tuple(debug(item) for item in obj)
              elif obj.__class__ == dict:
                  return dict((key,debug(obj[key])) for key in obj)
              else:
                  return Debug(obj)
          

          Before passing the object to json.dumps, you would call this function like so:

          test_json = debug(test_json)
          print(json.dumps(test_json,default=json_debug_handler))
          

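          Note that this code checks for objects whose class exactly matches list, tuple or dict, so any custom objects that extend those types will be wrapped rather than parsed. As a result, regular lists, tuples and dictionaries are serialized as usual, but all other values are passed on to the default handler.

          The end result of all this is that every value that reaches the default handler is guaranteed to be wrapped in one of these Debug instances. So the first thing you will want to do is extract the original object, like this: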

          obj = obj.originalObject
          

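          You can then check the type of the original object and handle whichever types need special processing. For everything else, you should just return the original object (so the last return in the handler should be return obj rather than return None).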

          def json_debug_handler(obj):
              obj = obj.originalObject      # Add this line
              print("object received:")
              print(type(obj))
              print("\n\n")
              if  isinstance(obj, datetime.datetime):
                  return obj.isoformat()
              elif isinstance(obj,mDict):
                  return {'orig':obj, 'attrs': vars(obj)}
              elif isinstance(obj,mList):
                  return {'orig':obj, 'attrs': vars(obj)}
              else:
                  return obj                # Change this line
          

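          Note that this code does not check for values that are not serializable. These fall through to the final return obj, are then rejected by the serializer and passed back to the default handler again, only this time without the Debug wrapper.

          If you need to deal with that scenario, you could add a check at the top of the handler like this: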

          if not hasattr(obj, 'originalObject'):
              return None
          

          Ideone demo: http://ideone.com/tOloNq
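Putting the pieces of this answer together, the following is a sketch of how the whole thing might look in Python 3. It is a consolidation for illustration, not the code behind the Ideone link: Debug is defined at module level instead of inside debug(), and a fixed datetime is used so the result is reproducible.

```python
import datetime
import json


class Debug:
    """Opaque wrapper: json does not recognise it, so it reaches default."""
    def __init__(self, obj):
        self.originalObject = obj


def debug(obj):
    # Only exact list/tuple/dict instances are traversed; everything else,
    # including subclasses such as mList and mDict, is wrapped.
    if obj.__class__ == list:
        return [debug(item) for item in obj]
    elif obj.__class__ == tuple:
        return tuple(debug(item) for item in obj)
    elif obj.__class__ == dict:
        return {key: debug(value) for key, value in obj.items()}
    return Debug(obj)


class mDict(dict):
    pass


class mList(list):
    pass


def json_debug_handler(obj):
    if not hasattr(obj, 'originalObject'):
        return None  # unwrapped and unserializable value: emitted as null
    obj = obj.originalObject
    if isinstance(obj, datetime.datetime):
        return obj.isoformat()
    elif isinstance(obj, (mDict, mList)):
        return {'orig': obj, 'attrs': vars(obj)}
    return obj


games = mList(['mario', 'contra', 'tetris'])
games.src = 'console'
scores = mDict({'dp': 10, 'pk': 45})
scores.processed = 'unprocessed'
test_json = {'games': games, 'scores': scores,
             'date': datetime.datetime(2013, 5, 7, 1, 3, 13)}

output = json.dumps(debug(test_json), default=json_debug_handler)
print(output)
```

The mList and mDict values returned under 'orig' are no longer wrapped, so the encoder serializes them as an ordinary list and dict and the handler is not called for them a second time.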

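          【Discussion】:

            【Solution 8】:

            Try the below. It produces the output you want and looks relatively simple. The only real difference from your encoder class is that we should override both the default and encode methods (since the latter is still called for types the encoder knows how to handle).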

            import json
            import datetime
            
            class JSONDebugEncoder(json.JSONEncoder):
                # transform objects known to JSONEncoder here
                def encode(self, o, *args, **kw):
                    for_json = o
                    if isinstance(o, mDict):
                        for_json = { 'orig' : o, 'attrs' : vars(o) }
                    elif isinstance(o, mList):
                        for_json = { 'orig' : o, 'attrs' : vars(o) }
                    return super(JSONDebugEncoder, self).encode(for_json, *args, **kw)
            
                # handle objects not known to JSONEncoder here
                def default(self, o, *args, **kw):
                    if isinstance(o, datetime.datetime):
                        return o.isoformat()
                    else:
                        return super(JSONDebugEncoder, self).default(o, *args, **kw)
            
            
            class mDict(dict):
                pass
            
            class mList(list):
                pass
            
            def test_debug_json():
                games = mList(['mario','contra','tetris'])
                games.src = 'console'
                scores = mDict({'dp':10,'pk':45})
                scores.processed = "unprocessed"
                test_json = { 'games' : games , 'scores' : scores , 'date': datetime.datetime.now() }
                print(json.dumps(test_json,cls=JSONDebugEncoder))
            
            if __name__ == '__main__':
                test_debug_json()
            

            【Discussion】:

              【Solution 9】:

              You should be able to override JSONEncoder.encode():

              from json import JSONEncoder

              class MyEncoder(JSONEncoder):
                def encode(self, o):
                  if isinstance(o, dict):
                    # directly call JSONEncoder rather than infinite-looping through self.encode()
                    return JSONEncoder.encode(self, {'orig': o, 'attrs': vars(o)})
                  elif isinstance(o, list):
                    return JSONEncoder.encode(self, {'orig': o, 'attrs': vars(o)})
                  else:
                    return JSONEncoder.encode(self, o)
              

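              Then, if you want to patch it into json.dumps, it looks from http://docs.buildbot.net/latest/reference/json-pysrc.html as though you will need to replace json._default_encoder with an instance of MyEncoder.

              【Discussion】:

              • This only works for the top-level object. encode is not called for nested values.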
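The limitation described in that comment can be seen with a small Python 3 sketch. The class name TopLevelOnlyEncoder is hypothetical, and mList is the list subclass from the question.

```python
import json


class mList(list):
    pass


class TopLevelOnlyEncoder(json.JSONEncoder):
    def encode(self, o):
        # encode() is called once, with the object passed to json.dumps.
        if isinstance(o, mList):
            o = {'orig': list(o), 'attrs': vars(o)}
        return super().encode(o)


games = mList(['mario'])
games.src = 'console'

top = json.dumps(games, cls=TopLevelOnlyEncoder)
nested = json.dumps({'games': games}, cls=TopLevelOnlyEncoder)

print(top)     # {"orig": ["mario"], "attrs": {"src": "console"}}
print(nested)  # {"games": ["mario"]}
```

When the mList is nested inside a plain dict, the override never sees it, so it is written out as an ordinary list and the src attribute is lost.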
              【Solution 10】:

              If you define these to override __instancecheck__:

              def strict_check(builtin):
                  '''creates a new class from the builtin whose instance check
                  method can be overridden to renounce particular types'''
                  class BuiltIn(type):
                      def __instancecheck__(self, other):
                          print 'instance', self, type(other), other
                          if type(other) in strict_check.blacklist:
                              return False
                          return builtin.__instancecheck__(other)
                  # construct a class, whose instance check method is known.
                  return BuiltIn('strict_%s' % builtin.__name__, (builtin,), dict())
              
              # for safety, define it here.
              strict_check.blacklist = ()
              

              Then patch json.encoder like this to override _make_iterencode.func_defaults:

              # modify json encoder to use some new list/dict attr.
              import json.encoder
              # save old stuff, never know when you need it.
              old_defaults = json.encoder._make_iterencode.func_defaults
              old_encoder = json.encoder.c_make_encoder
              encoder_defaults = list(json.encoder._make_iterencode.func_defaults)
              for index, default in enumerate(encoder_defaults):
                  if default in (list, dict):
                      encoder_defaults[index] = strict_check(default)
              
              # change the defaults for _make_iterencode.
              json.encoder._make_iterencode.func_defaults = tuple(encoder_defaults)
              # disable C extension.
              json.encoder.c_make_encoder = None
              

              ...your example then works almost verbatim:

              import datetime
              import json
              
              def json_debug_handler(obj):
                  print("object received:")
                  print type(obj)
                  print("\n\n")
                  if  isinstance(obj, datetime.datetime):
                      return obj.isoformat()
                  elif isinstance(obj,mDict):
                      # degrade obj to more primitive dict()
                      # to avoid cycles in the encoding.
                      return {'orig': dict(obj) , 'attrs': vars(obj)}
                  elif isinstance(obj,mList):
                      # degrade obj to more primitive list()
                      # to avoid cycles in the encoding.
                      return {'orig': list(obj), 'attrs': vars(obj)}
                  else:
                      return None
              
              
              class mDict(dict):
                  pass
              
              
              class mList(list):
                  pass
              
              # set the stuff we want to process differently.
              strict_check.blacklist = (mDict, mList)
              
              def test_debug_json():
                  global test_json
                  games = mList(['mario','contra','tetris'])
                  games.src = 'console'
                  scores = mDict({'dp':10,'pk':45})
                  scores.processed = "unprocessed"
                  test_json = { 'games' : games , 'scores' : scores , 'date': datetime.datetime.now() }
                  print(json.dumps(test_json,default=json_debug_handler))
              
              if __name__ == '__main__':
                  test_debug_json()
              

              The only thing I had to change was to make sure there are no cycles:

                  elif isinstance(obj,mDict):
                      # degrade obj to more primitive dict()
                      # to avoid cycles in the encoding.
                      return {'orig': dict(obj) , 'attrs': vars(obj)}
                  elif isinstance(obj,mList):
                      # degrade obj to more primitive list()
                      # to avoid cycles in the encoding.
                      return {'orig': list(obj), 'attrs': vars(obj)}
              

              and to add this before test_debug_json:

              # set the stuff we want to process differently.
              strict_check.blacklist = (mDict, mList)
              

              Here is my console output:

              >>> test_debug_json()
              instance <class '__main__.strict_list'> <type 'dict'> {'date': datetime.datetime(2013, 7, 17, 12, 4, 40, 950637), 'games': ['mario', 'contra', 'tetris'], 'scores': {'pk': 45, 'dp': 10}}
              instance <class '__main__.strict_dict'> <type 'dict'> {'date': datetime.datetime(2013, 7, 17, 12, 4, 40, 950637), 'games': ['mario', 'contra', 'tetris'], 'scores': {'pk': 45, 'dp': 10}}
              instance <class '__main__.strict_list'> <type 'datetime.datetime'> 2013-07-17 12:04:40.950637
              instance <class '__main__.strict_dict'> <type 'datetime.datetime'> 2013-07-17 12:04:40.950637
              instance <class '__main__.strict_list'> <type 'datetime.datetime'> 2013-07-17 12:04:40.950637
              instance <class '__main__.strict_dict'> <type 'datetime.datetime'> 2013-07-17 12:04:40.950637
              object received:
              <type 'datetime.datetime'>
              
              
              
              instance <class '__main__.strict_list'> <class '__main__.mList'> ['mario', 'contra', 'tetris']
              instance <class '__main__.strict_dict'> <class '__main__.mList'> ['mario', 'contra', 'tetris']
              instance <class '__main__.strict_list'> <class '__main__.mList'> ['mario', 'contra', 'tetris']
              instance <class '__main__.strict_dict'> <class '__main__.mList'> ['mario', 'contra', 'tetris']
              object received:
              <class '__main__.mList'>
              
              
              
              instance <class '__main__.strict_list'> <type 'dict'> {'attrs': {'src': 'console'}, 'orig': ['mario', 'contra', 'tetris']}
              instance <class '__main__.strict_dict'> <type 'dict'> {'attrs': {'src': 'console'}, 'orig': ['mario', 'contra', 'tetris']}
              instance <class '__main__.strict_list'> <type 'dict'> {'src': 'console'}
              instance <class '__main__.strict_dict'> <type 'dict'> {'src': 'console'}
              instance <class '__main__.strict_list'> <type 'list'> ['mario', 'contra', 'tetris']
              instance <class '__main__.strict_list'> <class '__main__.mDict'> {'pk': 45, 'dp': 10}
              instance <class '__main__.strict_dict'> <class '__main__.mDict'> {'pk': 45, 'dp': 10}
              instance <class '__main__.strict_list'> <class '__main__.mDict'> {'pk': 45, 'dp': 10}
              instance <class '__main__.strict_dict'> <class '__main__.mDict'> {'pk': 45, 'dp': 10}
              object received:
              <class '__main__.mDict'>
              
              
              
              instance <class '__main__.strict_list'> <type 'dict'> {'attrs': {'processed': 'unprocessed'}, 'orig': {'pk': 45, 'dp': 10}}
              instance <class '__main__.strict_dict'> <type 'dict'> {'attrs': {'processed': 'unprocessed'}, 'orig': {'pk': 45, 'dp': 10}}
              instance <class '__main__.strict_list'> <type 'dict'> {'processed': 'unprocessed'}
              instance <class '__main__.strict_dict'> <type 'dict'> {'processed': 'unprocessed'}
              instance <class '__main__.strict_list'> <type 'dict'> {'pk': 45, 'dp': 10}
              instance <class '__main__.strict_dict'> <type 'dict'> {'pk': 45, 'dp': 10}
              {"date": "2013-07-17T12:04:40.950637", "games": {"attrs": {"src": "console"}, "orig": ["mario", "contra", "tetris"]}, "scores": {"attrs": {"processed": "unprocessed"}, "orig": {"pk": 45, "dp": 10}}}
              

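              【Discussion】:

                【Solution 11】:

                If you are able to change how json.dumps is called, you can do all the required processing before the JSON encoder sees the data. This version does no copying and edits the structure in place. You can add copy() if needed.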

                import datetime
                import json
                import collections.abc
                
                
                def json_debug_handler(obj):
                    print("object received:")
                    print(type(obj))
                    print("\n\n")
                    # the abstract base classes live in collections.abc on Python 3
                    if isinstance(obj, collections.abc.Mapping):
                        for key, value in obj.items():
                            if isinstance(value, (collections.abc.Mapping, collections.abc.MutableSequence)):
                                value = json_debug_handler(value)
                
                            obj[key] = convert(value)
                    elif isinstance(obj, collections.abc.MutableSequence):
                        for index, value in enumerate(obj):
                            if isinstance(value, (collections.abc.Mapping, collections.abc.MutableSequence)):
                                value = json_debug_handler(value)
                
                            obj[index] = convert(value)
                    return obj
                
                def convert(obj):
                    if  isinstance(obj, datetime.datetime):
                        return obj.isoformat()
                    elif isinstance(obj,mDict):
                        return {'orig':obj , 'attrs': vars(obj)}
                    elif isinstance(obj,mList):
                        return {'orig':obj, 'attrs': vars(obj)}
                    else:
                        return obj
                
                
                class mDict(dict):
                    pass
                
                
                class mList(list):
                    pass
                
                
                def test_debug_json():
                    games = mList(['mario','contra','tetris'])
                    games.src = 'console'
                    scores = mDict({'dp':10,'pk':45})
                    scores.processed = "unprocessed"
                    test_json = { 'games' : games , 'scores' : scores , 'date': datetime.datetime.now() }
                    print(json.dumps(json_debug_handler(test_json)))
                
                if __name__ == '__main__':
                    test_debug_json()
                

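                Call json_debug_handler on the object you want to serialize, then pass the result to json.dumps. With this pattern you can also easily reverse the changes and/or add extra conversion rules.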
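If mutating the input is not acceptable, a copying variant is one option. The following is a minimal Python 3 sketch rather than the answer's own code: to_debug is a hypothetical helper name, and mDict and mList are assumed to be defined as in the question.

```python
import datetime
import json


class mDict(dict):
    pass


class mList(list):
    pass


def to_debug(obj):
    """Return a new JSON-ready structure; the input is left untouched."""
    if isinstance(obj, datetime.datetime):
        return obj.isoformat()
    if isinstance(obj, dict):
        converted = {key: to_debug(value) for key, value in obj.items()}
    elif isinstance(obj, (list, tuple)):
        converted = [to_debug(value) for value in obj]
    else:
        return obj
    # Wrap only the custom subclasses; plain containers pass through as copies.
    if isinstance(obj, (mDict, mList)):
        return {"orig": converted, "attrs": to_debug(vars(obj))}
    return converted


games = mList(["mario", "contra"])
games.src = "console"
data = {"games": games, "date": datetime.datetime(2013, 5, 7, 1, 3, 13)}
result = json.dumps(to_debug(data), sort_keys=True)
print(result)
```

Because every container is rebuilt, data still holds the original mList and datetime objects after the call.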

                Edit:

                If you cannot change how json.dumps is called, you can always monkeypatch it to do what you want. For example:

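                # keep a reference to the original; calling json.dumps inside the lambda would recurse forever
                _original_dumps = json.dumps
                json.dumps = lambda obj, *args, **kwargs: _original_dumps(json_debug_handler(obj), *args, **kwargs)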
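A monkeypatch is easier to undo when the original function is saved and restored afterwards. The sketch below wraps that in a context manager; patched_dumps is a hypothetical helper name, and the preprocess argument stands in for a function such as json_debug_handler.

```python
import contextlib
import json


@contextlib.contextmanager
def patched_dumps(preprocess):
    """Temporarily route every json.dumps call through preprocess()."""
    original = json.dumps

    def wrapper(obj, *args, **kwargs):
        # Call the saved original, not json.dumps, to avoid recursing.
        return original(preprocess(obj), *args, **kwargs)

    json.dumps = wrapper
    try:
        yield
    finally:
        # Always restore the real function, even if the body raised.
        json.dumps = original


with patched_dumps(lambda obj: {"wrapped": obj}):
    inside = json.dumps([1, 2])

outside = json.dumps([1, 2])
print(inside, outside)
```

Code that imported the function directly with `from json import dumps` before the patch keeps its own reference and is not affected.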
                

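                【Discussion】:

                  【Solution 12】:

                  It seems that to get the behaviour you want under the given constraints, you have to dig into the JSONEncoder class. Below is a custom JSONEncoder that overrides the iterencode method and passes a custom isinstance method to _make_iterencode. It isn't the cleanest thing in the world, but given the options it seems to be the best, and it keeps the customization to a minimum.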

                  # customencoder.py
                  from json.encoder import (_make_iterencode, JSONEncoder,
                                            encode_basestring_ascii, FLOAT_REPR, INFINITY,
                                            c_make_encoder, encode_basestring)
                  
                  
                  class CustomObjectEncoder(JSONEncoder):
                  
                      def iterencode(self, o, _one_shot=False):
                          """
                          Most of the original method has been left untouched.
                  
                          _one_shot is forced to False to prevent c_make_encoder from
                          being used. c_make_encoder is a function defined in C, so it's easier
                          to avoid using it than overriding/redefining it.
                  
                          The keyword argument isinstance for _make_iterencode has been set
                          to self.isinstance. This allows for a custom isinstance function
                          to be defined, which can be used to defer the serialization of custom
                          objects to the default method.
                          """
                          # Force the use of _make_iterencode instead of c_make_encoder
                          _one_shot = False
                  
                          if self.check_circular:
                              markers = {}
                          else:
                              markers = None
                          if self.ensure_ascii:
                              _encoder = encode_basestring_ascii
                          else:
                              _encoder = encode_basestring
                          if self.encoding != 'utf-8':
                              def _encoder(o, _orig_encoder=_encoder, _encoding=self.encoding):
                                  if isinstance(o, str):
                                      o = o.decode(_encoding)
                                  return _orig_encoder(o)
                  
                          def floatstr(o, allow_nan=self.allow_nan,
                                       _repr=FLOAT_REPR, _inf=INFINITY, _neginf=-INFINITY):
                              if o != o:
                                  text = 'NaN'
                              elif o == _inf:
                                  text = 'Infinity'
                              elif o == _neginf:
                                  text = '-Infinity'
                              else:
                                  return _repr(o)
                  
                              if not allow_nan:
                                  raise ValueError(
                                      "Out of range float values are not JSON compliant: " +
                                      repr(o))
                  
                              return text
                  
                          # Instead of forcing _one_shot to False, you can also just
                          # remove the first part of this conditional statement and only
                          # call _make_iterencode
                          if (_one_shot and c_make_encoder is not None
                                  and self.indent is None and not self.sort_keys):
                              _iterencode = c_make_encoder(
                                  markers, self.default, _encoder, self.indent,
                                  self.key_separator, self.item_separator, self.sort_keys,
                                  self.skipkeys, self.allow_nan)
                          else:
                              _iterencode = _make_iterencode(
                                  markers, self.default, _encoder, self.indent, floatstr,
                                  self.key_separator, self.item_separator, self.sort_keys,
                                  self.skipkeys, _one_shot, isinstance=self.isinstance)
                          return _iterencode(o, 0)
                  

                  You can now subclass CustomObjectEncoder so that your custom objects are serialized correctly. CustomObjectEncoder can also do cool things like handle nested objects.

                  # test.py
                  import json
                  import datetime
                  from customencoder import CustomObjectEncoder
                  
                  
                  class MyEncoder(CustomObjectEncoder):
                  
                      def isinstance(self, obj, cls):
                          if isinstance(obj, (mList, mDict)):
                              return False
                          return isinstance(obj, cls)
                  
                      def default(self, obj):
                          """
                          Defines custom serialization.
                  
                          To avoid circular references, any object that will always fail
                          self.isinstance must be converted to something that is
                          serializable here.
                          """
                          if isinstance(obj, datetime.datetime):
                              return obj.isoformat()
                          elif isinstance(obj, mDict):
                              return {"orig": dict(obj), "attrs": vars(obj)}
                          elif isinstance(obj, mList):
                              return {"orig": list(obj), "attrs": vars(obj)}
                          else:
                              return None
                  
                  
                  class mList(list):
                      pass
                  
                  
                  class mDict(dict):
                      pass
                  
                  
                  def main():
                      zelda = mList(['zelda'])
                      zelda.src = "oldschool"
                      games = mList(['mario', 'contra', 'tetris', zelda])
                      games.src = 'console'
                      scores = mDict({'dp': 10, 'pk': 45})
                      scores.processed = "unprocessed"
                      test_json = {'games': games, 'scores': scores,
                                   'date': datetime.datetime.now()}
                      print(json.dumps(test_json, cls=MyEncoder))
                  
                  if __name__ == '__main__':
                      main()
                  

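                  【Discussion】:

                  • I had the same problem and found a solution in Fast Turtle's answer. Since I need this in multiple projects, I created a Python package for it called jsonconversion, which is open source. You can use it directly instead of copying the source from here.
                  【Solution 13】:

                  Why can't you create a new object type to pass to the encoder? Try: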

                  import datetime

                  class MStuff(object):
                      def __init__(self, content):
                          self.content = content
                  
                  class mDict(MStuff):
                      pass
                  
                  class mList(MStuff):
                      pass
                  
                  def json_debug_handler(obj):
                      print("object received:")
                      print(type(obj))
                      print("\n\n")
                      if  isinstance(obj, datetime.datetime):
                          return obj.isoformat()
                      elif isinstance(obj,MStuff):
                          attrs = {}
                          for key in obj.__dict__:
                              if not ( key.startswith("_") or key == "content"):
                                  attrs[key] = obj.__dict__[key]
                  
                          return {'orig':obj.content , 'attrs': attrs}
                      else:
                          return None
                  

                  If needed, you can add validation to mDict and mList.
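One way the validation and a complete call might look is sketched below for Python 3. The expected_type class attribute is a hypothetical hook that is not part of the answer above, and the handler is trimmed to the wrapper case. Because MStuff is not a dict or list subclass, the encoder cannot serialize it natively and hands it to the default handler.

```python
import json


class MStuff:
    expected_type = object  # hypothetical hook: subclasses narrow this

    def __init__(self, content):
        if not isinstance(content, self.expected_type):
            raise TypeError("content must be %s" % self.expected_type.__name__)
        self.content = content


class mDict(MStuff):
    expected_type = dict


class mList(MStuff):
    expected_type = list


def json_debug_handler(obj):
    if isinstance(obj, MStuff):
        # Everything set on the wrapper except the payload is an attribute.
        attrs = {key: value for key, value in vars(obj).items()
                 if not key.startswith("_") and key != "content"}
        return {"orig": obj.content, "attrs": attrs}
    raise TypeError("%r is not JSON serializable" % (obj,))


games = mList(["mario", "contra", "tetris"])
games.src = "console"
result = json.dumps({"games": games}, default=json_debug_handler, sort_keys=True)
print(result)
```

The trade-off is that these wrappers no longer behave like a dict or list elsewhere in the program, so callers have to go through .content.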

                  【Discussion】:

                  • I want serialization to work on the existing objects shown in the code.