【Question title】: How to convert a text file in a certain format to a dictionary? [duplicate]
【Posted】: 2021-09-20 14:49:26
【Question】:

I have a file like this, and I want to convert it into a Python dictionary.

Week1=>
    Monday=>
        Math=8:00
        English=9:00
    Tuesday=>
        Spanish=10:00
        Arts=3:00

The output should be:

{"Week1": {"Monday": {"Math": "8:00", "English": "9:00"}, "Tuesday": {"Spanish":"10:00", "Arts": "3:00"}}}

This is my actual code:

def ReadFile(filename, extension) -> dict:
        content = {} # Content will store the dictionary (from the file).

        prefsTXT = open(f"{filename}.{extension}", "r") # Open the file with read permissions.
        lines = prefsTXT.readlines() # Read lines.

        content = LinesToDict(lines) # Calls LinesToDict to get the dictionary.

        prefsTXT.close() # Closing file.

        return content # Return dictionary with file content.
        
def LinesToDict( textList: list) -> dict:
    result = {} # Result

    lines = [i.replace("\n", "") for i in textList] # Replace the line break with nothing.

    for e, line in enumerate(lines): # Iterate through each line (of the file).

        if line[0].strip() == "#": continue # If first character is # ignore the whole line.

        keyVal, lines = TextToDict(e, lines) # Interpret each line using TextToDict and store the result in keyVal (lines is returned too because TextToDict modifies it).
        keyVal = list( keyVal.items() )[0] # Convert the dictionary to a list of tuples and take the first (key, value) pair.
        
        result[keyVal[0]] = keyVal[1] # Add key:value to result

    return result # Returns result

def TextToDict(textIndx: int, textList: list) -> tuple: # Returns a (dict, list) pair.
    result = {} # Result

    text = textList[textIndx].strip() # Strip the passed line (because of the tabulations).

    if text[0].strip() == "#": return # If first character is # ignore the whole line.

    keyVal = text.split("=", 1) # Split line by = which is the separator between key:value.
    
    if keyVal[1] == ">": # If value == ">"
        indentVal, textList = TextToDict(textIndx + 1, textList) # Calls itself to see the content of the next line.
        textList.pop(textIndx + 1) # Pop the interpreted content.
    
    else: # If value isn't ">"
        indentVal = keyVal[1] # Set indentVal to the val of keyVal

    result[keyVal[0]] = indentVal # Setting keyVal[0] as key and indentVal as value of result dictionary

    return result, textList # Return result and modified lines list


file = ReadFile("trial", "txt")
print(file)

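ReadFile reads the file and passes the lines to LinesToDict. LinesToDict iterates over the lines and passes each one to TextToDict.
TextToDict splits the line on = and checks whether the value (split[1]) is ">"; if it is, it calls itself with the next line and stores the returned value in the dictionary it returns.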

But I get this:

{'Week1': {'Monday': {'Math': '8:00'}}, 'English': '9:00', 'Tuesday': {'Spanish': '10:00'}, 'Arts': '3:00'}

Instead of this:

{"Week1": {"Monday": {"Math": "8:00", "English": "9:00"}, "Tuesday": {"Spanish":"10:00", "Arts": "3:00"}}}

【Question comments】:

    Tags: python python-3.x dictionary formatting


    【Solution 1】:

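    The format looks close to YAML apart from the delimiters, so you can rewrite the delimiters with regular expressions and load the result with PyYAML.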
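
To make the delimiter rewrite concrete, here is a minimal standard-library sketch of just that step, using the sample text from the question. The helper name `to_yaml_text` and the `SAMPLE` constant are illustrative assumptions, not part of any library.

```python
import re

# Sample text copied from the question.
SAMPLE = (
    "Week1=>\n"
    "    Monday=>\n"
    "        Math=8:00\n"
    "        English=9:00\n"
    "    Tuesday=>\n"
    "        Spanish=10:00\n"
    "        Arts=3:00\n"
)


def to_yaml_text(text: str) -> str:
    """Rewrite 'Key=>' as 'Key:' and 'Key=value' as 'Key: "value"'."""
    # Section headers: a trailing "=>" becomes a YAML mapping colon.
    text = re.sub(r"=>\n", ":\n", text)
    # Leaf entries: quote the value so it is kept as a string.
    return re.sub(r"(\w+)=(.*)", r'\1: "\2"', text)


if __name__ == "__main__":
    print(to_yaml_text(SAMPLE))
```

The indentation is left untouched, so the nesting of the original file carries over to the YAML text. Each value is wrapped in double quotes so that a YAML loader keeps times such as 8:00 as strings.
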

    $ pip install pyyaml

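    import json, re, yaml

    with open("/tmp/txt.txt") as f:  # read your content here
        mytext = f.read()
    # "Key=>" becomes "Key:"; "Key=value" becomes 'Key: "value"' (quoted so 8:00 stays a string).
    as_yaml = re.sub(r"=>\n", ":\n", mytext)
    as_yaml = re.sub(r"(\w+)=(.*)", r'\1: "\2"', as_yaml)
    mydict = yaml.safe_load(as_yaml)
    print(json.dumps(mydict, indent=1))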
    

    Output of the PyYAML script:

    {
     "Week1": {
      "Monday": {
       "Math": "8:00",
       "English": "9:00"
      },
      "Tuesday": {
       "Spanish": "10:00",
       "Arts": "3:00"
      }
     }
    }
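
If adding a dependency is not an option, the same nesting can probably be recovered with a small indentation stack. This is a sketch under assumptions, not the method used in the answer: `parse_indented` and `SAMPLE` are hypothetical names, indentation is assumed to use consistent spaces, and lines starting with # are skipped as in the question's code.

```python
# Sample text copied from the question.
SAMPLE = """\
Week1=>
    Monday=>
        Math=8:00
        English=9:00
    Tuesday=>
        Spanish=10:00
        Arts=3:00
"""


def parse_indented(text: str) -> dict:
    """Parse 'Key=>' / 'Key=value' lines nested by indentation into a dict."""
    root: dict = {}
    # Stack of (indent, dict); the top entry is the innermost open section.
    stack = [(-1, root)]
    for raw in text.splitlines():
        stripped = raw.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        indent = len(raw) - len(raw.lstrip())
        # Close every section indented at least as deep as this line.
        while stack[-1][0] >= indent:
            stack.pop()
        key, _, value = stripped.partition("=")
        parent = stack[-1][1]
        if value == ">":
            # "Key=>" opens a nested section.
            parent[key] = {}
            stack.append((indent, parent[key]))
        else:
            # "Key=value" is a leaf; the value stays a string.
            parent[key] = value
    return root


if __name__ == "__main__":
    print(parse_indented(SAMPLE))
```

Unlike the recursive version in the question, this never removes lines from the list while iterating over it; the stack alone tracks which section the current line belongs to.
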
    

    【Comments】:

    • Could you explain this a little?
    • @n1c9 Added a bit of explanation.