SQLite vs file-based data storing? Answers

【Question title】: Sqlite vs File based data storing?
【Posted】: 2011-05-03 21:30:11
【Question】:

Suppose I have a class like this:

class User
  attr_accessor :name, :age
  def initialize(name, age)
    @name, @age = name, age
  end
end

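Now, would it be faster to save the users as marshalled instances of the User class in a single file, or to use a SQLite database with an ORM? What are the drawbacks of file-based data storage?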

【Comments】:

Benchmark it! SQLite is file-based storage; it is just structured in a particular way.

Tags: ruby

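【Answer 1】:

Below are the results of a benchmark run on an SSD. Interpret them as you will. For very simple queries and data, marshalling the whole data set and loading it into memory appears to be faster: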

Rehearsal ---------------------------------------------------------------
Storing in DB                 0.080000   0.000000   0.080000 (  0.085909)
Marshalling to Disk           0.010000   0.000000   0.010000 (  0.004340)
Fetching marshal              0.000000   0.000000   0.000000 (  0.002288)
Fetching records from DB      5.530000   0.130000   5.660000 (  5.657053)
Fetching records from Array   0.350000   0.000000   0.350000 (  0.347798)
Find one record from DB       0.320000   0.020000   0.340000 (  0.336068)
Find one record from Array    0.260000   0.000000   0.260000 (  0.258766)
------------------------------------------------------ total: 6.700000sec

                                  user     system      total        real
Storing in DB                 0.080000   0.000000   0.080000 (  0.079717)
Marshalling to Disk           0.000000   0.000000   0.000000 (  0.002595)
Fetching marshal              0.000000   0.000000   0.000000 (  0.001466)
Fetching records from DB     10.830000   0.230000  11.060000 ( 11.041669)
Fetching records from Array   0.340000   0.000000   0.340000 (  0.335473)
Find one record from DB       0.320000   0.010000   0.330000 (  0.336917)
Find one record from Array    0.260000   0.000000   0.260000 (  0.255746)

Here is the benchmark:

require 'benchmark'
require 'sequel'
# Plain Ruby object used by both storage strategies.
class User
  attr_reader :name, :age
  def initialize(name, age)
    @name, @age = name, age
  end
  def to_hash; {name:@name, age:@age}; end
end
# 1000 random users plus one known record to look up later.
db_array = 1000.times.map{ User.new "name#{rand 1000}", rand(1000) }
db_array << User.new( "unique", 42 )
DBFILE  = 'users.db'; MARSHAL = 'users.marshal'
File.delete(DBFILE) if File.exist?(DBFILE)  # File.exists? was removed in Ruby 3.2
DB = Sequel.sqlite(DBFILE)
DB.create_table(:users){ column(:name,:string); column(:age,:int) }
db_users = DB[:users]
Benchmark.bmbm do |x|
  x.report('Storing in DB'){ db_users.multi_insert db_array.map(&:to_hash) }
  # Marshal output is binary data, so the file is written and read in binary mode.
  x.report('Marshalling to Disk'){ File.open(MARSHAL, 'wb'){ |f| f << Marshal.dump(db_array) } }
  x.report('Fetching marshal'){ db_array = Marshal.load(File.open(MARSHAL,'rb'){|f| f.read }) }
  # NOTE: Sequel's Dataset#select chooses columns; filtering rows would use #where.
  # The line is kept as written because the timings above were measured with it.
  query = db_users.select{ name > "name500" }
  x.report('Fetching records from DB'){ 1000.times{ query.all } }
  x.report('Fetching records from Array'){ 1000.times{ db_array.select{ |u| u.name > "name500" } } }
  x.report('Find one record from DB'){ 1000.times{ db_users[name:'unique'] } }
  x.report('Find one record from Array'){ 1000.times{ db_array.find{ |u| u.name == "unique" } } }
end

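【Comments】:

The testing technique above is flawed. Because it uses bmbm, every test runs twice; as a result, by the second run the database has received a second set of entries, making it twice the size of the array. You should compare only the numbers from the first "Rehearsal" phase, not those listed in the second phase.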
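
The double run is easy to reproduce without a database. This is a minimal sketch that uses only the standard benchmark library; the `rows` array is a hypothetical stand-in for the users table, not part of the original benchmark:

```ruby
require 'benchmark'

# Hypothetical stand-in for the users table: any state that outlives a report block.
rows = []

Benchmark.bmbm do |x|
  # bmbm executes each report block twice: once in the rehearsal, once in the real run.
  x.report('insert 1000 rows') { 1000.times { |i| rows << i } }
end

puts rows.size  # prints 2000, twice what a single pass would add
```

Clearing the shared state at the start of the block, or using Benchmark.bm (which runs each block once), would probably keep the two sides of the comparison the same size.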

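【Answer 2】:

The drawback of storing marshalled objects is that your marshalled data may not be compatible with future changes to your Ruby classes. So you will probably end up persisting basic structures such as Hash or Array to the file instead. Once you are at that point, SQLite is the better choice.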
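
To illustrate what persisting basic structures might look like, here is a rough sketch that writes plain hashes as JSON instead of marshalled User instances. The `to_h` and `from_h` helpers and the file name are assumptions made for the example, not something from the question:

```ruby
require 'json'
require 'tmpdir'

class User
  attr_reader :name, :age
  def initialize(name, age)
    @name, @age = name, age
  end

  # Hypothetical helper: reduce the object to a plain Hash before saving.
  def to_h
    { 'name' => @name, 'age' => @age }
  end

  # Hypothetical helper: rebuild an instance from a saved Hash.
  def self.from_h(hash)
    new(hash.fetch('name'), hash.fetch('age'))
  end
end

users = [User.new('alice', 30), User.new('bob', 25)]
path  = File.join(Dir.tmpdir, 'users_example.json')

# Only strings and integers reach the disk, so the file does not
# depend on the internal layout of the User class.
File.write(path, JSON.generate(users.map(&:to_h)))

restored = JSON.parse(File.read(path)).map { |h| User.from_h(h) }
puts restored.map(&:name).inspect
```

If the class later gains or renames an attribute, only `from_h` needs to handle the old files, whereas Marshal.load expects the class to still exist under the same name.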

【Comments】:

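【Answer 3】:

I think it depends on what you want to do: if you only want to read everything from the file, without searching for or selecting individual instances, then a file is the better choice (you just read it and rebuild the instances).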

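If you want any kind of access other than a sequential, start-to-finish read, go for a database (databases are optimized to write and read files as fast as possible, and they also allow that kind of operation ;) ).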
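
The first case, reading the whole file and rebuilding the instances, can be sketched with Marshal from the standard library. The file name and the sample data are assumptions made for the example:

```ruby
require 'tmpdir'

class User
  attr_accessor :name, :age
  def initialize(name, age)
    @name, @age = name, age
  end
end

users = Array.new(3) { |i| User.new("name#{i}", 20 + i) }
path  = File.join(Dir.tmpdir, 'users_example.marshal')

# Marshal produces binary data, so use the binary read/write helpers.
File.binwrite(path, Marshal.dump(users))

# Loading always rebuilds every instance; there is no way to fetch just one.
loaded = Marshal.load(File.binread(path))

# Any "query" is an in-memory scan over the full array.
older = loaded.select { |u| u.age > 20 }
puts older.map(&:name).inspect
```

This works well while the whole data set fits comfortably in memory. Selecting or updating a single record still means loading, and for updates rewriting, the entire file.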

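One more small thing to consider: I don't know how Ruby executes and handles files (reading from a file may be slow because of the interpreter). You could ask about that on a Ruby forum, but I would guess that reading a file from start to finish won't be a problem.

【Comments】: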

【Answer 4】:

I would use SQLite with the DataMapper ORM (http://datamapper.org/).

I think storing your users in a single file would be hard to manage. Querying a SQLite database with DataMapper is very simple.

【Comments】: