Convert Custom String to Dict

【Question title】: Convert Custom String to Dict
【Posted】: 2020-11-21 15:01:57
【Question】:

Hello, I need to convert a string like this into the dict shown further down.

string = "OS: Windows 7 SP1, Windows 8.1, Windows 10 (64bit versions only)Processor: Intel Core i5 2400s @ 2.5 GHz, AMD FX 6120 @ 3.5 GHz or betterMemory: 6 GB RAMGraphics: NVIDIA GeForce GTX 660 with 2 GB VRAM or AMD Radeon HD 7870, with 2 GB VRAM or better - See supported List"

Desired dict:

requirements={
'Os':'Windows 7 SP1, Windows 8.1, Windows 10 (64bit versions only)',
'Processor':'Intel Core i5 2400s @ 2.5 GHz, AMD FX 6120 @ 3.5 GHz or better',
'Memory':'6 GB RAM',
'Graphics':'NVIDIA GeForce GTX 660 with 2 GB VRAM or AMD Radeon HD 7870, with 2 GB VRAM or better - See supported List',
}

I tried this:

string = string.split(':')

and then stored each list element in the dict like this:

requirements['Os'] = string[0]
requirements['Processor'] = string[1]

But that is not the right approach, and it only gave me more errors. Is there a function or module for this kind of thing?
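For illustration only, here is what `split(':')` returns for a string of this shape. The shortened `sample` string is a made-up stand-in, used only to keep the output readable. Each label stays glued to the end of the previous value, which is presumably why indexing the pieces directly went wrong.

```python
# Shortened, hypothetical sample with the same shape as the string above.
sample = "OS: Windows 10 (64bit versions only)Processor: Intel Core i5Memory: 6 GB RAM"

parts = sample.split(':')

# The first piece is the first label on its own; each middle piece is a
# value with the *next* label glued to its end; only the last piece is a
# clean value (apart from its leading space).
for index, part in enumerate(parts):
    print(index, repr(part))
```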

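【Comments】:

This will be hard to parse because parts of the string are run together, e.g. only)Processor:, unless the order of the labels never changes. Will the input always follow the sample you provided?
Yes, it always follows the sample input I gave for the demo. The format is always the same.

Tags: python python-3.x string dictionary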

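【Solution 1】:

Since the format of the input string never changes, I would use a regular expression to capture the text you want. This should give you what you need: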


import re

string = "OS: Windows 7 SP1, Windows 8.1, Windows 10 (64bit versions only)Processor: Intel Core i5 2400s @ 2.5 GHz, AMD FX 6120 @ 3.5 GHz or betterMemory: 6 GB RAMGraphics: NVIDIA GeForce GTX 660 with 2 GB VRAM or AMD Radeon HD 7870, with 2 GB VRAM or better - See supported List"

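# Each (.+) captures the text between two consecutive labels.
# re.match returns None if the string does not follow this layout.
matches = re.match(r'OS: (.+)Processor: (.+)Memory: (.+)Graphics: (.+)', string)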

requirements = {
    'Os': matches.group(1),
    'Processor': matches.group(2),
    'Memory': matches.group(3),
    'Graphics': matches.group(4),
}

print(requirements)

This regex is fairly rigid, though, so I suggest treating it only as a starting point.

See re.match

【Discussion】:
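As one possible way to loosen the fixed pattern, the labels can be kept in a list and the string cut at each label with `re.split`. This is only a sketch: the function name `parse_requirements` and its `keys` parameter are made up for illustration, and it assumes that no value ever contains the text of a label followed by a colon.

```python
import re

def parse_requirements(text, keys=("OS", "Processor", "Memory", "Graphics")):
    """Cut `text` at every 'Key:' label and return a {key: value} dict."""
    # Alternation of the escaped labels, e.g. OS|Processor|Memory|Graphics
    labels = "|".join(re.escape(key) for key in keys)
    # Because the labels sit in a capturing group, re.split keeps them
    # in the result list instead of discarding them.
    parts = re.split(rf"({labels}):\s*", text)
    # parts[0] is whatever precedes the first label; after that,
    # labels and values alternate.
    return {label: value.strip() for label, value in zip(parts[1::2], parts[2::2])}

string = "OS: Windows 7 SP1, Windows 8.1, Windows 10 (64bit versions only)Processor: Intel Core i5 2400s @ 2.5 GHz, AMD FX 6120 @ 3.5 GHz or betterMemory: 6 GB RAMGraphics: NVIDIA GeForce GTX 660 with 2 GB VRAM or AMD Radeon HD 7870, with 2 GB VRAM or better - See supported List"

requirements = parse_requirements(string)
print(requirements)
```

Note that the keys come back spelled exactly as in the label list ('OS', not 'Os'), and that adding or reordering labels only means changing `keys`.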

【Solution 2】:

Here is an alternative, non-regex solution, although a regex would in principle probably be more efficient and more concise:

input_string = "OS: Windows 7 SP1, Windows 8.1, Windows 10 (64bit versions only)Processor: Intel Core i5 2400s @ 2.5 GHz, AMD FX 6120 @ 3.5 GHz or betterMemory: 6 GB RAMGraphics: NVIDIA GeForce GTX 660 with 2 GB VRAM or AMD Radeon HD 7870, with 2 GB VRAM or better - See supported List"
# Split on whitespace into a list of tokens
input_string = input_string.split()

# Assumes the keys are exactly like listed - including uppercase letters
key_list = ["OS", "Processor", "Memory", "Graphics"]
key_ind = []

output = {}

# Collect indices corresponding to each key
for key in key_list:
    for idx, el in enumerate(input_string):
        if key in el:
            key_ind.append(idx)
            break

# Build the dictionary
for idx, key in enumerate(key_list):
    if idx + 1 >= len(key_list):
        # Last key: everything after its token belongs to it
        output[key] = ' '.join(input_string[key_ind[idx]+1:])
    else:
        # The token holding the next key also carries the tail of this value
        lp_idx = input_string[key_ind[idx+1]].find(key_list[idx+1])
        lp = input_string[key_ind[idx+1]][:lp_idx]
        output[key] = ' '.join(input_string[key_ind[idx]+1:key_ind[idx+1]]) + ' ' + lp

print(output)

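Here the string is first split on whitespace, and then the code finds the position of each chunk that contains one of the future dictionary keys. After storing the index for each key, the code builds the dictionary from those indices, with the last element handled as a special case.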
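To make the chunking step concrete, the sketch below shows what the whitespace split produces and how one fused chunk is taken apart. The shortened `sample` string and the variable names are assumptions made for illustration; they are not part of the answer's code.

```python
# Shortened, hypothetical sample with the same shape as the input above.
sample = "OS: Windows 10 (64bit versions only)Processor: Intel Core i5Memory: 6 GB RAM"

tokens = sample.split()

# Tokens ending in ':' are the ones that carry a key label.
label_tokens = [token for token in tokens if token.endswith(":")]
print(label_tokens)

# A fused token holds the tail of the previous value plus the next label.
fused = "only)Processor:"
cut = fused.find("Processor")   # index where the label starts
tail_of_previous_value = fused[:cut]
print(repr(tail_of_previous_value))
```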

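For every element except the last, the code also extracts the text that sits in front of the next key inside the same chunk. This assumes there is no space between the next key and the end of the text stored for the current key, i.e. it is always (64bit versions only)Processor: and never (64bit versions only) Processor:. If you cannot make that assumption, you will need to extend this code to cover the case with a space.

【Discussion】: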
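If the space cannot be ruled out, one option that stays regex-free is to search for each label in the unsplit string with `str.find` and slice between the positions. Tracing the loop above by hand suggests that, with a space present, it would still find the label but leave a stray trailing space on the value, which the `strip()` below avoids. The function name `split_by_labels` and its parameters are hypothetical, and the sketch assumes the labels appear in the given order, each followed by a colon.

```python
def split_by_labels(text, keys=("OS", "Processor", "Memory", "Graphics")):
    """Return {key: value}, with or without a space in front of each label."""
    # Locate every "Key:" label in order, searching from the previous hit.
    starts = []
    pos = 0
    for key in keys:
        pos = text.find(key + ":", pos)
        if pos == -1:  # str.find returns -1 when the label is missing
            raise ValueError(f"label {key!r} not found")
        starts.append(pos)
        pos += len(key) + 1  # continue searching after "Key:"

    result = {}
    for i, key in enumerate(keys):
        value_start = starts[i] + len(key) + 1
        # The value runs up to the next label, or to the end for the last key.
        value_end = starts[i + 1] if i + 1 < len(keys) else len(text)
        result[key] = text[value_start:value_end].strip()
    return result

fused = "OS: Windows 10 (64bit versions only)Processor: Intel Core i5"
spaced = "OS: Windows 10 (64bit versions only) Processor: Intel Core i5"
two_keys = ("OS", "Processor")

from_fused = split_by_labels(fused, two_keys)
from_spaced = split_by_labels(spaced, two_keys)
print(from_fused)
print(from_fused == from_spaced)
```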