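【Question title】: Convert Python list to JSON document
【Posted】: 2020-04-01 13:52:28
【Question description】:

Context: I have a list structured as shown below. It can contain a variable number of items, always a multiple of 3. I am trying to convert each group of 3 into a separate JSON document.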

['SCAN1.txt', 'Lastmodified:20191125.121049', 'Filesize:7196', 'SCAN2.txt', 'Lastmodified:20191125.121017', 'Filesize:3949', 'SCAN3.txt', 'Lastmodified:20191125.121056', 'Filesize:2766']

Question: how can I convert this single list into a JSON document with the following format, while allowing the number of files it holds to vary?

{  
  {  
    "File": {  
      "File_Name":"SCAN1.txt",  
      "Last_Modified":"20191125.121049",
      "File_Size":"7196"  
    }
  }
  {
    "File": {  
      "File_Name":"SCAN2.txt",  
      "Last_Modified":"20191125.121017",
      "File_Size":"3949"  
    } 
  }
  {  
    "File": {  
      "File_Name":"SCAN3.txt",  
      "Last_Modified":"20191125.121056",
      "File_Size":"2766"  
    }
  } 
}

【Question comments】:

    Tags: python json list


    【Solution 1】:

    Use chunked from more-itertools:

        from more_itertools import chunked
        import json
    
        example = ['SCAN1.txt', 'Lastmodified:20191125.121049', 'Filesize:7196', 'SCAN2.txt', 'Lastmodified:20191125.121017', 'Filesize:3949', 'SCAN3.txt', 'Lastmodified:20191125.121056', 'Filesize:2766']
    
        def file_to_json(file):
            return {"File": {"File_Name": file[0], "Last_Modified": file[1], "File_Size": file[2]}} 
    
        # chunked() already yields lists of 3 items, so no extra list() is needed
        print(json.dumps([file_to_json(file) for file in chunked(example, 3)], indent=4))
    

    Output:

        [{
            "File": {
                "File_Name": "SCAN1.txt",
                "Last_Modified": "Lastmodified:20191125.121049",
                "File_Size": "Filesize:7196"
            }
        }, {
            "File": {
                "File_Name": "SCAN2.txt",
                "Last_Modified": "Lastmodified:20191125.121017",
                "File_Size": "Filesize:3949"
            }
        }, {
            "File": {
                "File_Name": "SCAN3.txt",
                "Last_Modified": "Lastmodified:20191125.121056",
                "File_Size": "Filesize:2766"
            }
        }]
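
    Note that the values above still carry the `Lastmodified:` and `Filesize:` prefixes, whereas the format requested in the question shows bare values. The following is a minimal standard-library sketch that strips them. It assumes every group is exactly a file name, a `Lastmodified:...` entry and a `Filesize:...` entry, in that order; the helper `chunks` is a hypothetical stand-in for `chunked`, not part of more-itertools.

```python
import json


def chunks(items, size):
    """Yield consecutive slices of `items`, each `size` long (the last may be shorter)."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def file_to_json(group):
    """Build one document from [name, 'Lastmodified:...', 'Filesize:...']."""
    # ASSUMPTION: the three items always arrive in this order.
    name, modified, size = group
    return {
        "File": {
            "File_Name": name,
            # partition(":") splits on the first colon; index 2 is the text after it
            "Last_Modified": modified.partition(":")[2],
            "File_Size": size.partition(":")[2],
        }
    }


example = [
    "SCAN1.txt", "Lastmodified:20191125.121049", "Filesize:7196",
    "SCAN2.txt", "Lastmodified:20191125.121017", "Filesize:3949",
    "SCAN3.txt", "Lastmodified:20191125.121056", "Filesize:2766",
]

documents = [file_to_json(group) for group in chunks(example, 3)]
print(json.dumps(documents, indent=2))
```

    The values stay strings here, as in the question's desired format; converting them to numbers is a separate decision.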
    

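    【Comments】:

    • While this is close, it doesn't accommodate variation in the number of files. Every 3 items in the list need to be converted into a separate JSON element, but the number of files can vary (always a multiple of 3). Is there a way to modify the return value of file_to_json so that it can handle any number of files?
    • I don't understand. Could you edit your example output to show an example of this "variability"?
    • Note that this solution handles any number of files. Do you mean that each file can have more than 3 attributes?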
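
    To illustrate the point made in the last comment, here is a small sketch showing that grouping by threes works for any list whose length is a multiple of 3, and that each group can be serialized as its own JSON document. The function name `to_documents` and the length check are illustrative assumptions, not part of the answer above; values keep their prefixes, as in Solution 1.

```python
import json

KEYS = ("File_Name", "Last_Modified", "File_Size")
GROUP_SIZE = len(KEYS)


def to_documents(items):
    """Return one JSON string per group of three consecutive items."""
    if len(items) % GROUP_SIZE != 0:
        raise ValueError(
            f"expected a multiple of {GROUP_SIZE} items, got {len(items)}"
        )
    return [
        json.dumps({"File": dict(zip(KEYS, items[i:i + GROUP_SIZE]))})
        for i in range(0, len(items), GROUP_SIZE)
    ]


one_file = ["SCAN1.txt", "Lastmodified:20191125.121049", "Filesize:7196"]
two_files = one_file + [
    "SCAN2.txt", "Lastmodified:20191125.121017", "Filesize:3949",
]

# The same function handles one file, two files, or any other multiple of 3.
for doc in to_documents(one_file):
    print(doc)
for doc in to_documents(two_files):
    print(doc)
```

    Raising on a length that is not a multiple of 3 is one possible design choice; silently dropping or padding the trailing items would be alternatives.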
    【Solution 2】:

    You could also change the result to use the file names as keys, since they are unique:

    {
        "SCAN1.txt": {
            "Filesize": 7196,
            "Lastmodified": 20191125.121049
        },
        "SCAN2.txt": {
            "Filesize": 3949,
            "Lastmodified": 20191125.121017
        },
        "SCAN3.txt": {
            "Filesize": 2766,
            "Lastmodified": 20191125.121056
        }
    }
    

    This can be implemented as follows (comments included):

    from collections import defaultdict
    from json import dumps
    from ast import literal_eval
    
    lst = [
        "SCAN1.txt",
        "Lastmodified:20191125.121049",
        "Filesize:7196",
        "SCAN2.txt",
        "Lastmodified:20191125.121017",
        "Filesize:3949",
        "SCAN3.txt",
        "Lastmodified:20191125.121056",
        "Filesize:2766",
    ]
    
    
    def group_file_documents(lst, prefix="SCAN"):
        # Use defaultdict of dicts to represent final JSON structure
        # Also can be serialized like normal dictionaries
        result = defaultdict(dict)
    
        current_file = None
        for item in lst:
    
            # Update current file name if starts with prefix
            if item.startswith(prefix):
                current_file = item
                continue
    
            # Ensure current file name is present
            if current_file:
    
                # Split "key:value" on the first colon and strip whitespace, just in case
                key, value = map(str.strip, item.split(":", 1))
    
                # Convert to actual int/float value
                result[current_file][key] = literal_eval(value)
    
        return result
    
    # Print serialized JSON file with sorted keys and indents of 4 spaces
    print(dumps(group_file_documents(lst), sort_keys=True, indent=4))
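
    One caveat with `literal_eval` here: it turns the timestamp into a float, so a value such as `20191125.121010` would be printed as `20191125.12101`, losing its trailing zero. The sketch below is a hedged variant that parses the timestamp instead. It assumes the value has the form `YYYYMMDD.HHMMSS`, which is inferred from the sample data and not stated in the question, and it groups by position (every 3 items) rather than by the `SCAN` prefix. The names `parse_entry` and `group_by_file` are hypothetical.

```python
import json
from datetime import datetime


def parse_entry(item):
    """Split 'key:value' on the first colon and convert known keys."""
    key, _, value = item.partition(":")
    key, value = key.strip(), value.strip()
    if key == "Lastmodified":
        # ASSUMPTION: timestamps look like YYYYMMDD.HHMMSS (inferred from the sample).
        return key, datetime.strptime(value, "%Y%m%d.%H%M%S").isoformat()
    if key == "Filesize":
        return key, int(value)
    return key, value  # unknown keys are kept as plain strings


def group_by_file(items, group_size=3):
    """Map each file name to a dict of its parsed attributes."""
    result = {}
    for i in range(0, len(items), group_size):
        name, *attributes = items[i:i + group_size]
        result[name] = dict(parse_entry(attr) for attr in attributes)
    return result


sample = ["SCAN1.txt", "Lastmodified:20191125.121010", "Filesize:7196"]
print(json.dumps(group_by_file(sample), sort_keys=True, indent=4))
```

    Keeping the timestamp as a string, as in the question's desired output, would avoid the problem as well.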
    

    【Comments】:
