【Question title】: Python - appending to a pickled list
【Posted】: 2015-03-20 14:00:30
【Question】:

I'm struggling to append to a list in a pickled file. Here is the code:

#saving high scores to a pickled file

import pickle

first_name = input("Please enter your name:")
score = input("Please enter your score:")

scores = []
high_scores = first_name, score
scores.append(high_scores)

file = open("high_scores.dat", "ab")
pickle.dump(scores, file)
file.close()

file = open("high_scores.dat", "rb")
scores = pickle.load(file)
print(scores)
file.close()

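The first time I run the code, it prints the name and score.

The second time I run the code, it prints 2 names and 2 scores.

The third time I run the code, it prints the first name and score, but it overwrites the second name and score with the third name and score I entered. I just want it to keep adding names and scores. I don't understand why it keeps the first name and overwrites the second.

【Comments】:

    Tags: python append pickle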


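    【Solution 1】:

    If you want to write to and read from a pickled file, you can call dump multiple times, once for each entry in the list. Each dump appends a score to the pickled file, and each load reads the next score.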

    >>> import pickle as dill
    >>> 
    >>> scores = [('joe', 1), ('bill', 2), ('betty', 100)]
    >>> nscores = len(scores)
    >>> 
    >>> with open('high.pkl', 'ab') as f:
    ...   _ = [dill.dump(score, f) for score in scores]
    ... 
    >>> 
    >>> with open('high.pkl', 'ab') as f:
    ...   dill.dump(('mary', 1000), f)
    ... 
    >>> # we added a score on the fly, so load nscores+1
    >>> with open('high.pkl', 'rb') as f:
    ...     _scores = [dill.load(f) for i in range(nscores + 1)]
    ... 
    >>> _scores
    [('joe', 1), ('bill', 2), ('betty', 100), ('mary', 1000)]
    >>>
    
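The session above has to know how many records to load (nscores + 1). A minimal sketch of one way around that, assuming every record was written with its own pickle.dump call: keep calling pickle.load until it raises EOFError. The helper name load_all and the temporary file path are made up for illustration and are not part of the original answer.

```python
import os
import pickle
import tempfile


def load_all(path):
    """Return every object that was pickled back-to-back into the file at `path`."""
    records = []
    with open(path, "rb") as f:
        while True:
            try:
                # Each call reads exactly one pickled object and advances the file position.
                records.append(pickle.load(f))
            except EOFError:
                # Raised when there is nothing left to read.
                break
    return records


# Hypothetical throwaway location, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "high.pkl")

# Append one record per dump, reopening the file each time as separate runs would.
for score in [("joe", 1), ("bill", 2), ("betty", 100)]:
    with open(path, "ab") as f:
        pickle.dump(score, f)

all_scores = load_all(path)
print(all_scores)  # [('joe', 1), ('bill', 2), ('betty', 100)]
```

With a reader like this, the writer never needs to record how many scores it has appended.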

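    The most likely reason your code fails is that you replace the original scores with the unpickled list of scores. So if any new scores were added in the meantime, you blow them away in memory.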

    >>> scores
    [('joe', 1), ('bill', 2), ('betty', 100)]
    >>> f = open('high.pkl', 'wb')
    >>> dill.dump(scores, f)
    >>> f.close()
    >>> 
    >>> scores.append(('mary',1000))
    >>> scores
    [('joe', 1), ('bill', 2), ('betty', 100), ('mary', 1000)]
    >>> 
    >>> f = open('high.pkl', 'rb')
    >>> _scores = dill.load(f)
    >>> f.close()
    >>> _scores
    [('joe', 1), ('bill', 2), ('betty', 100)]
    >>> # blow away the old scores list, by pointing to _scores
    >>> scores = _scores
    >>> scores
    [('joe', 1), ('bill', 2), ('betty', 100)]
    

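    So this is more of a Python name-reference issue for scores than a pickle issue. Pickle just instantiates a new list, which then gets bound to the name scores (in your case), and whatever scores pointed to before is garbage collected once nothing else references it.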

    >>> scores = 1
    >>> f = open('high.pkl', 'rb')
    >>> scores = dill.load(f)
    >>> f.close()
    >>> scores
    [('joe', 1), ('bill', 2), ('betty', 100)]
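
As a side note, the behaviour described at the top of this answer (each load reads the next dumped object) also applies to the question's script, which opens the file in "ab" mode and dumps a one-item list on every run. The sketch below simulates three runs inside a single process; the names, scores and temporary path are invented for illustration. It shows that a single pickle.load returns only the first list that was dumped, and that the later lists are still in the file behind it.

```python
import os
import pickle
import tempfile

# Hypothetical throwaway location, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "high_scores.dat")

# Simulate three separate runs of the question's script: each run starts
# with a fresh list and appends its pickle to the end of the file.
for name, score in [("ann", "10"), ("bob", "20"), ("cat", "30")]:
    scores = [(name, score)]
    with open(path, "ab") as f:
        pickle.dump(scores, f)

with open(path, "rb") as f:
    first = pickle.load(f)   # only the list from the first run
    second = pickle.load(f)  # a second call is needed to reach the next one

print(first)   # [('ann', '10')]
print(second)  # [('bob', '20')]
```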
    

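    【Comments】:

    • Thanks for your input as well. dogwynn's solution worked well, but I'll take on board what you said.
    • Why is pickle imported as dill here?
    • @DheerajMPai: Because I use dill instead of pickle, that's what I used for the solution. pickle also works, so instead of editing my code I just changed the import. It's all pickling... dill is just better at it.
    【Solution 2】:

    You need to pull the list out of your database (i.e. your pickle file) first, before appending to it.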

    import pickle
    import os
    
    high_scores_filename = 'high_scores.dat'
    
    scores = []
    
    # first time you run this, "high_scores.dat" won't exist
    #   so we need to check for its existence before we load 
    #   our "database"
    if os.path.exists(high_scores_filename):
        # "with" statements are very handy for opening files. 
        with open(high_scores_filename,'rb') as rfp: 
            scores = pickle.load(rfp)
        # Notice that there's no "rfp.close()"
        #   ... the "with" clause calls close() automatically! 
    
    first_name = input("Please enter your name:")
    score = input("Please enter your score:")
    
    high_scores = first_name, score
    scores.append(high_scores)
    
    # Now we "sync" our database
    with open(high_scores_filename,'wb') as wfp:
        pickle.dump(scores, wfp)
    
    # Re-load our database
    with open(high_scores_filename,'rb') as rfp:
        scores = pickle.load(rfp)
    
    print(scores)
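
The load, append, rewrite cycle above could also be wrapped in a small function. This is only a sketch under a few assumptions: the function name add_score and the temporary path are hypothetical, and the os.path.exists check is swapped for catching FileNotFoundError, which is a stylistic alternative rather than a correction.

```python
import os
import pickle
import tempfile


def add_score(path, name, score):
    """Load the saved list (if any), append one entry, and rewrite the file."""
    try:
        with open(path, "rb") as rfp:
            scores = pickle.load(rfp)
    except FileNotFoundError:
        # First run: no "database" yet, so start with an empty list.
        scores = []
    scores.append((name, score))
    # "wb" replaces the old contents, so the file always holds exactly one pickled list.
    with open(path, "wb") as wfp:
        pickle.dump(scores, wfp)
    return scores


# Hypothetical throwaway location, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "high_scores.dat")

add_score(path, "joe", 1)
result = add_score(path, "mary", 1000)
print(result)  # [('joe', 1), ('mary', 1000)]
```

Because the whole list is rewritten each time, a single pickle.load is always enough to read everything back.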
    

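    【Comments】:

    • Thanks also for the extra explanation you added to the code.
    • @dogwynn: If you're just loading into a new variable, why check whether the file exists and sidestep the real issue?
    • @MikeMcKerns: I guess I don't know how to call pickle.load on a file that doesn't exist. I know you can do this with the DBv2 API modules (e.g. shelve), but I don't know of a file-like object (which pickle.load requires) that can be opened in a "create" mode (without creating it). Otherwise, my goal was to have a script that can be run repeatedly from the command line, adding a new (name, score) tuple each time.
    • @Charlie: My pleasure. Good to see another Python hacker in the making. :-)
    • @dogwynn: mode ab, as opposed to wb, creates the file if it doesn't exist and appends if it does. Also, I wasn't suggesting you try to load from a non-existent file. I was just pointing out that the OP's issue is, more generally, unpickling an object to the same name as an existing list, which destroys any edits made to the existing list after the dump.
    【Solution 3】:

    This doesn't actually answer the question, but if anyone wants to add single items to a pickle one at a time, you can do it by...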

    import pickle
    import os
    
    high_scores_filename = '/home/ubuntu-dev/Desktop/delete/high_scores.dat'
    
    scores = []
    
    # first time you run this, "high_scores.dat" won't exist
    #   so we need to check for its existence before we load
    #   our "database"
    if os.path.exists(high_scores_filename):
        # "with" statements are very handy for opening files.
        with open(high_scores_filename,'rb') as rfp:
            scores = pickle.load(rfp)
        # Notice that there's no "rfp.close()"
        #   ... the "with" clause calls close() automatically!
    
    names = ["mike", "bob", "joe"]
    
    for name in names:
        high_score = name
        print(name)
        scores.append(high_score)
    
    # Now we "sync" our database
    with open(high_scores_filename,'wb') as wfp:
        pickle.dump(scores, wfp)
    
    # Re-load our database
    with open(high_scores_filename,'rb') as rfp:
        scores = pickle.load(rfp)
    
    print(scores)
    

    【Comments】:

    • As indicated by the first five words of the post.
    【Solution 4】:

    Instead of pickle, use h5py, which can also serve your purpose.

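    import h5py

    # Assumes X_train_data, X_test_data, Y_train_data and Y_test_data are NumPy
    # arrays, and that each dataset already exists in the file and was created
    # with an unlimited first axis (e.g. maxshape=(None, ...)) so it can grow.
    # The raw string keeps the backslash in the Windows-style path literal.
    with h5py.File(r'.\PreprocessedData.h5', 'a') as hf:
        hf["X_train"].resize((hf["X_train"].shape[0] + X_train_data.shape[0]), axis=0)
        hf["X_train"][-X_train_data.shape[0]:] = X_train_data
    
        hf["X_test"].resize((hf["X_test"].shape[0] + X_test_data.shape[0]), axis=0)
        hf["X_test"][-X_test_data.shape[0]:] = X_test_data
    
        hf["Y_train"].resize((hf["Y_train"].shape[0] + Y_train_data.shape[0]), axis=0)
        hf["Y_train"][-Y_train_data.shape[0]:] = Y_train_data
    
        hf["Y_test"].resize((hf["Y_test"].shape[0] + Y_test_data.shape[0]), axis=0)
        hf["Y_test"][-Y_test_data.shape[0]:] = Y_test_data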
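
The snippet above only works on datasets that are allowed to grow. As a hedged sketch of the missing setup step, the example below creates a dataset with an unlimited first axis and then appends to it in the same way. It assumes h5py and NumPy are installed; the file path, array contents and variable names are invented for illustration.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical throwaway location, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "scores.h5")

first_batch = np.arange(6, dtype="f8").reshape(3, 2)
second_batch = np.ones((2, 2), dtype="f8")

with h5py.File(path, "w") as hf:
    # maxshape=(None, 2) leaves the first axis unlimited so it can be resized later.
    hf.create_dataset("X_train", data=first_batch, maxshape=(None, 2))

with h5py.File(path, "a") as hf:
    dset = hf["X_train"]
    # Grow the first axis, then write the new rows into the freshly added tail.
    dset.resize(dset.shape[0] + second_batch.shape[0], axis=0)
    dset[-second_batch.shape[0]:] = second_batch

with h5py.File(path, "r") as hf:
    final_shape = hf["X_train"].shape
    last_rows = hf["X_train"][-2:]

print(final_shape)  # (5, 2)
```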
    


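    【Comments】:

    • Why not use hickle or klepto instead? They are both designed to give you a simple, pickle-equivalent dump and load syntax for HDF5. If you replace dill in my answer with hickle, I believe it should work, and store to HDF5.
    • Yes, it works. Thanks for the info. I wasn't aware of hickle.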