Populating a dictionary from a text file? Answers

【Question title】: Populating a dictionary from a text file?
【Posted】: 2020-08-13 06:42:58
【Question description】:

So I have a text file that looks like this:

Monstera Deliciosa
2018-11-03 18:21:26
Tropical/sub-Tropical plant
Leathery leaves, mid to dark green
Moist and well-draining soil
Semi-shade/full shade light requirements
Water only when top 2 inches of soil is dry
Intolerant to root rot
Propagate by cuttings in water

Strelitzia Nicolai (White Birds of Paradise)
2018-11-05 10:12:15
Semi-shade, full sun
Dark green leathery leaves
Like lots of water,but soil cannot be water-logged
Like to be root bound in pot

Alocasia Macrorrhizos
2019-01-03 15:29:10
Tropical asia
Moist and well-draining soil
Leaves and stem toxic upon ingestion
Semi-shade, full sun
Like lots of water, less susceptible to root rot
Susceptible to spider mites

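I want to build a dictionary from this file, with each plant's name as the key and the rest of its information stored in a list as the value. So far I have managed to get each plant and its information as a single item in a list, but I am not sure how to turn that into a dictionary.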

    with open('myplants.txt', 'r') as f:
        contents = f.read()
        contents = contents.rstrip().split('\n\n')
        contents = [x.replace('\n', ', ') for x in contents]
    print(contents)

Expected output:

plants = {'Monstera Deliciosa':['2018-11-03 18:21:26', 'Tropical/sub-Tropical plant', 'Leathery leaves, mid to dark green', 'Moist and well-draining soil', 'Semi-shade/full shade light requirements', 'Water only when top 2 inches of soil is dry', 'Intolerant to root rot', 'Propagate by cuttings in water'], 'Strelitzia Nicolai (White Birds of Paradise)': ... }

I am open to a better format for the dictionary.

【Question discussion】:

Tags: python python-3.x dictionary file-handling

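【Solution 1】:

Here is a scalable solution that avoids reading the entire file into memory.

It takes advantage of the fact that a text file can be used as an iterator that yields one line at a time.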

import itertools as it

plants = {}
with open('myplants.txt') as f:
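    while True:
        try:
            # The first line of each block is the plant name.
            p = next(f).rstrip()
            # Collect the following lines up to (and consuming) the blank separator line.
            plants[p] = [l.rstrip() for l in it.takewhile(lambda line: line != '\n', f)]
        except StopIteration:
            # next(f) raises StopIteration once the file is exhausted.
            break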

print(plants)

which produces:

{
 'Monstera Deliciosa': ['2018-11-03 18:21:26', 'Tropical/sub-Tropical plant', 'Leathery leaves, mid to dark green', 'Moist and well-draining soil', 'Semi-shade/full shade light requirements', 'Water only when top 2 inches of soil is dry', 'Intolerant to root rot', 'Propagate by cuttings in water'],
 'Strelitzia Nicolai (White Birds of Paradise)': ['2018-11-05 10:12:15', 'Semi-shade, full sun', 'Dark green leathery leaves', 'Like lots of water,but soil cannot be water-logged', 'Like to be root bound in pot'],
 'Alocasia Macrorrhizos': ['2019-01-03 15:29:10', 'Tropical asia', 'Moist and well-draining soil', 'Leaves and stem toxic upon ingestion', 'Semi-shade, full sun', 'Like lots of water, less susceptible to root rot', 'Susceptible to spider mites']
}
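
One caveat with the loop above, as far as I can tell: it expects exactly one blank line between records. With two or more blank lines in a row, the extra blank line would be read as an empty plant name. Below is a minimal sketch of a variant that tolerates this by grouping lines with `itertools.groupby`. The function name `read_plants` and the `io.StringIO` stand-in for the file are assumptions made for illustration, not part of the original answer.

```python
import io
from itertools import groupby


def read_plants(lines):
    """Build {name: [details...]} from an iterable of lines, one block per plant."""
    plants = {}
    stripped = (line.strip() for line in lines)
    # groupby clusters consecutive lines by whether they are non-empty,
    # so any run of blank lines acts as a single separator.
    for has_text, block in groupby(stripped, key=bool):
        if has_text:
            name, *details = block
            plants[name] = details
    return plants


# Stand-in for open('myplants.txt'); note the two blank lines between records.
sample = io.StringIO(
    "Monstera Deliciosa\n"
    "2018-11-03 18:21:26\n"
    "Intolerant to root rot\n"
    "\n"
    "\n"
    "Alocasia Macrorrhizos\n"
    "2019-01-03 15:29:10\n"
)
plants = read_plants(sample)
print(plants)
```

Like the original loop, this still reads the file lazily, one line at a time; passing an open file object instead of `sample` should work the same way.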

【Discussion】:

【Solution 2】:

Here is an approach that parses the data by tracking state:

def parse(lines):
    items = []
    state = "name"

    for line in lines:
        line = line.rstrip("\n")

        if line == "":
            state = "name"
            continue

        if state == "name":
            item = {"name": line, "date": None, "data": []}
            items.append(item)
            state = "date"
            continue

        if state == "date":
            item["date"] = line
            state = "data"
            continue

        if state == "data":
            item["data"].append(line)
            continue

    return items

Result:

[{'name': 'Monstera Deliciosa',
  'date': '2018-11-03 18:21:26',
  'data': ['Tropical/sub-Tropical plant',
           'Leathery leaves, mid to dark green',
           'Moist and well-draining soil',
           'Semi-shade/full shade light requirements',
           'Water only when top 2 inches of soil is dry',
           'Intolerant to root rot',
           'Propagate by cuttings in water']},
 {'name': 'Strelitzia Nicolai (White Birds of Paradise)',
  'date': '2018-11-05 10:12:15',
  'data': ['Semi-shade, full sun',
           'Dark green leathery leaves',
           'Like lots of water,but soil cannot be water-logged',
           'Like to be root bound in pot']},
 {'name': 'Alocasia Macrorrhizos',
  'date': '2019-01-03 15:29:10',
  'data': ['Tropical asia',
           'Moist and well-draining soil',
           'Leaves and stem toxic upon ingestion',
           'Semi-shade, full sun',
           'Like lots of water, less susceptible to root rot',
           'Susceptible to spider mites']}]

I think this alternative representation is easier to work with.

【Discussion】:

【Solution 3】:

Using a dictionary comprehension:
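
To illustrate why a list of records can be convenient, here is a small sketch that indexes such records by name and sorts them by date. The two records are trimmed copies of the result above, and the `DATE_FORMAT` string is an assumption inferred from the sample timestamps.

```python
from datetime import datetime

# Two records in the shape returned by parse() above, trimmed for brevity.
items = [
    {"name": "Alocasia Macrorrhizos", "date": "2019-01-03 15:29:10",
     "data": ["Tropical asia", "Susceptible to spider mites"]},
    {"name": "Monstera Deliciosa", "date": "2018-11-03 18:21:26",
     "data": ["Tropical/sub-Tropical plant", "Intolerant to root rot"]},
]

DATE_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed from the sample file

# Index the records by name for direct lookup.
by_name = {item["name"]: item for item in items}

# Sort by the parsed timestamp, oldest first.
oldest_first = sorted(
    items, key=lambda item: datetime.strptime(item["date"], DATE_FORMAT)
)

print(by_name["Monstera Deliciosa"]["data"])
print([item["name"] for item in oldest_first])
```

Keeping the date in its own field means it can be parsed or compared without first picking it out of a mixed list of strings.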

text = """Monstera Deliciosa
2018-11-03 18:21:26
Tropical/sub-Tropical plant
Leathery leaves, mid to dark green
Moist and well-draining soil
Semi-shade/full shade light requirements
Water only when top 2 inches of soil is dry
Intolerant to root rot
Propagate by cuttings in water

Strelitzia Nicolai (White Birds of Paradise)
2018-11-05 10:12:15
Semi-shade, full sun
Dark green leathery leaves
Like lots of water,but soil cannot be water-logged
Like to be root bound in pot

Alocasia Macrorrhizos
2019-01-03 15:29:10
Tropical asia
Moist and well-draining soil
Leaves and stem toxic upon ingestion
Semi-shade, full sun
Like lots of water, less susceptible to root rot
Susceptible to spider mites
"""

contents = text.rstrip().split('\n\n')
contents = [x.replace('\n', ', ') for x in contents]

plants = {c.split(',')[0]: c.split(',')[1:]
          for c in contents}

print(plants)

{'Monstera Deliciosa': [' 2018-11-03 18:21:26', ' Tropical/sub-Tropical plant', ' Leathery leaves', ' mid to dark green', ' Moist and well-draining soil', ' Semi-shade/full shade light requirements', ' Water only when top 2 inches of soil is dry', ' Intolerant to root rot', ' Propagate by cuttings in water'], 'Strelitzia Nicolai (White Birds of Paradise)': [' 2018-11-05 10:12:15', ' Semi-shade', ' full sun', ' Dark green leathery leaves', ' Like lots of water', 'but soil cannot be water-logged', ' Like to be root bound in pot'], 'Alocasia Macrorrhizos': [' 2019-01-03 15:29:10', ' Tropical asia', ' Moist and well-draining soil', ' Leaves and stem toxic upon ingestion', ' Semi-shade', ' full sun', ' Like lots of water', ' less susceptible to root rot', ' Susceptible to spider mites']}
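
Note that in the output above, splitting on `','` also breaks apart lines that themselves contain commas (for example `'Leathery leaves'` and `' mid to dark green'` become two items) and leaves a leading space on most values. A possible adjustment, sketched here on a shortened copy of the text, is to split each block on newlines directly instead of joining with `', '` first:

```python
text = """Monstera Deliciosa
2018-11-03 18:21:26
Leathery leaves, mid to dark green

Strelitzia Nicolai (White Birds of Paradise)
2018-11-05 10:12:15
Semi-shade, full sun
"""

plants = {}
for block in text.rstrip().split('\n\n'):
    # The first line is the key; the remaining lines stay intact, commas included.
    name, *details = block.split('\n')
    plants[name] = details

print(plants)
```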

【Discussion】:

【Solution 4】:

Would something like this work?

plants = {}
with open('myplants.txt', 'r') as f:
    contents = f.read()
    contents = contents.rstrip().split('\n\n')
    for content in contents:
        parts = content.split('\n')  # Convert the lines to a list of strings
        plants[parts[0]] = parts[1:]  # First line becomes the key, the rest become the values
print(plants)

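【Discussion】:

This works for files that fit in the system's available memory. Reading the whole file into memory is strongly discouraged and unnecessary here.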