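Database recommendation for dictionary storage

【Question title】: Database recommendation for dictionary storage
【Posted】: 2011-11-02 03:31:16
【Question description】:

Using Python, I managed to build myself a kind of dictionary of terms and their meanings, and it is rather large: x00,000 items (I can't estimate the number right now, because they are stored in multiple files by first letter).
The files are pickled dictionary objects with this structure: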

dict{word, (attribute,
            kind,
            [meanings],
            [examples],
            [connections]
            )
    }

In case it matters: it is a Python dict object whose keys are strings and whose values are tuples, and each tuple consists of strings or list objects.
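
For concreteness, here is a minimal sketch of one entry in that shape and a pickle round trip through a per-letter file. The sample word, its field values, and the file name `c.pkl` are made up for illustration; only the key/tuple layout comes from the description above.

```python
import os
import pickle
import tempfile

# Hypothetical entry following the layout above:
# word -> (attribute, kind, [meanings], [examples], [connections])
terms = {
    'critter': (
        'informal',                       # attribute (made-up value)
        'noun',                           # kind (made-up value)
        ['a living creature'],            # meanings
        ['A small critter ran past.'],    # examples
        ['creature', 'animal'],           # connections
    ),
}

# Assumed file naming: one pickle file per first letter.
path = os.path.join(tempfile.mkdtemp(), 'c.pkl')
with open(path, 'wb') as f:
    pickle.dump(terms, f)

# Load the file back and unpack one entry into its five fields.
with open(path, 'rb') as f:
    loaded = pickle.load(f)

attribute, kind, meanings, examples, connections = loaded['critter']
```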

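Now I plan to put them all into a sqlite3 database, since that is easy to do with Python. Before I do, I would like some advice on whether sqlite3 is a good choice, because I have never done any real database work before.

I know the answer depends on what I want to do with this data (besides its structure), but let's say I just want it stored locally in one place (a file), reasonably easy to access (query), and possibly to transform.

【Question comments】:

ZODB is an object database that is well proven in the Zope framework.

Tags: python database recommendation-engine

【Solution 1】:

Yes, I have used sqlite3 for this kind of thing. The dictionary values have to be pickled first: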

import sqlite3
import pickle
from collections.abc import MutableMapping

class DBDict(MutableMapping):
    'Database driven dictlike object (with non-persistent in-memory option).'

    def __init__(self, db_filename=':memory:', **kwds):
        self.db = sqlite3.connect(db_filename)
        # PRIMARY KEY already indexes "key"; "value" holds the pickled bytes.
        self.db.execute('CREATE TABLE IF NOT EXISTS dict '
                        '(key TEXT PRIMARY KEY, value BLOB)')
        self.db.commit()
        self.update(kwds)

    def __setitem__(self, key, value):
        # Overwrites any existing row for this key.
        value = pickle.dumps(value)
        self.db.execute('INSERT OR REPLACE INTO dict VALUES (?, ?)', (key, value))
        self.db.commit()

    def __getitem__(self, key):
        cursor = self.db.execute('SELECT value FROM dict WHERE key = ?', (key,))
        result = cursor.fetchone()
        if result is None:
            raise KeyError(key)
        return pickle.loads(result[0])

    def __delitem__(self, key):
        if key not in self:
            raise KeyError(key)
        self.db.execute('DELETE FROM dict WHERE key = ?', (key,))
        self.db.commit()

    def __iter__(self):
        # rowid order is insertion order; the list is built up front so the
        # mapping can be modified while iterating.
        rows = self.db.execute('SELECT key FROM dict ORDER BY rowid')
        return iter([row[0] for row in rows])

    def __repr__(self):
        list_of_str = ['%r: %r' % pair for pair in self.items()]
        return '{' + ', '.join(list_of_str) + '}'

    def __len__(self):
        return self.db.execute('SELECT COUNT(*) FROM dict').fetchone()[0]



>>> d = DBDict(raymond='red', rachel='blue')
>>> d
{'raymond': 'red', 'rachel': 'blue'}
>>> d['critter'] = ('xyz', [1,2,3])
>>> d['critter']
('xyz', [1, 2, 3])
>>> len(d)
3
>>> list(d)
['raymond', 'rachel', 'critter']
>>> list(d.keys())
['raymond', 'rachel', 'critter']
>>> list(d.items())
[('raymond', 'red'), ('rachel', 'blue'), ('critter', ('xyz', [1, 2, 3]))]
>>> list(d.values())
['red', 'blue', ('xyz', [1, 2, 3])]

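The above keeps your database in a single file, and you can navigate the object like a regular Python dictionary. Because each value is pickled into a single field, sqlite cannot offer you any additional query options. Other flat-file stores have similar limitations. If you need to write queries that traverse a hierarchical structure, consider using a NoSQL database instead.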
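
To illustrate that limitation, the sketch below stores pickled tuples in a plain sqlite3 table and then looks for every word of a given kind. Because SQL cannot see inside the pickled value, the filter has to run in Python over every row. The table layout mirrors the class above; the sample entries and the `words_of_kind` helper are hypothetical.

```python
import pickle
import sqlite3

db = sqlite3.connect(':memory:')
db.execute('CREATE TABLE dict (key TEXT PRIMARY KEY, value BLOB)')

# Made-up entries: word -> (attribute, kind, meanings, examples, connections)
entries = {
    'critter': ('informal', 'noun', ['a creature'], [], ['animal']),
    'scurry': ('', 'verb', ['to move quickly'], [], ['run']),
    'burrow': ('', 'noun', ['a hole dug by an animal'], [], ['den']),
}
db.executemany(
    'INSERT INTO dict VALUES (?, ?)',
    [(word, pickle.dumps(value)) for word, value in entries.items()],
)
db.commit()

def words_of_kind(db, kind):
    """Full scan: unpickle every value, because SQL cannot filter on it."""
    matches = []
    for word, blob in db.execute('SELECT key, value FROM dict'):
        if pickle.loads(blob)[1] == kind:   # index 1 is the "kind" field
            matches.append(word)
    return sorted(matches)

nouns = words_of_kind(db, 'noun')
```

Lookups by key stay cheap thanks to the primary key, but any other question costs one pass over the whole table. If such queries matter, one option is to store the relevant fields in their own columns so that SQL can filter on them.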

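【Comments】:

Thanks for your answer. I don't know why I questioned using sqlite3; it is probably because, as mentioned, I had never done any database work. I didn't even know about the shelve module in the Python standard library. Still, sqlite3 can be accessed from a CLI or a GUI, and with a simple snippet it can be used as the Python object of choice, so I have just made this db file. Thanks, Raymond, for the extra snippet.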
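
For reference, the shelve module mentioned in this comment offers a similar persistent, dict-like interface from the standard library, and it also pickles the values. A minimal sketch, with a made-up file location and entry:

```python
import os
import shelve
import tempfile

# Hypothetical location for the shelf; shelve may add its own file extension.
path = os.path.join(tempfile.mkdtemp(), 'terms')

with shelve.open(path) as shelf:
    # Keys must be strings; values are pickled automatically.
    shelf['critter'] = ('xyz', [1, 2, 3])

# Reopen the shelf to confirm the entry was written to disk.
with shelve.open(path) as shelf:
    stored = shelf['critter']
    keys = list(shelf)
```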

【Solution 2】:

This smells like a job for a document-store database to me. Take a look at CouchDB: http://couchdb.apache.org/
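
In a document store, each entry would typically be kept as one JSON document rather than as a pickled tuple. The sketch below only shows what such a document might look like, using the standard json module; the field names and sample values are assumptions, and no CouchDB API is called.

```python
import json

# Made-up entry in the question's layout:
# (attribute, kind, [meanings], [examples], [connections])
word = 'critter'
attribute, kind, meanings, examples, connections = (
    'informal', 'noun', ['a living creature'], [], ['creature', 'animal'],
)

# One self-describing document per word; "_id" is the field CouchDB uses
# for a document's identifier. The other field names are illustrative.
document = {
    '_id': word,
    'attribute': attribute,
    'kind': kind,
    'meanings': meanings,
    'examples': examples,
    'connections': connections,
}

# Round trip through JSON text, the format a document store would receive.
encoded = json.dumps(document, sort_keys=True)
decoded = json.loads(encoded)
```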

【Comments】: