【Question title】: ActiveRecord: Delete duplicate records
【Posted】: 2012-05-11 16:29:51
【Question】:

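If there are multiple records on a particular date, I want to delete every record for that day except the latest one. For example, in the table below, the records with ids 9, 10, and 12 fall on the same date. So 9 and 10 should be deleted, because the record with id 12 has the latest timestamp.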

id      date
1   2012-04-25 00:00:00.000000
2   2012-04-26 00:00:00.000000
3   2012-04-23 00:00:00.000000
4   2012-04-24 00:00:00.000000
5   2012-05-01 00:00:00.000000
6   2012-05-02 00:00:00.000000
7   2012-05-03 00:00:00.000000
8   2012-05-04 00:00:00.000000
9   2012-04-30 00:30:00.000000
10  2012-04-30 18:00:00.000000
11  2012-04-29 00:00:00.000000
12  2012-04-30 18:40:00.000000
13  2012-05-05 00:00:00.000000
14  2012-05-05 09:31:31.000000

Here is the (dirty) rake task that removes the duplicates:

task :remove_duplicate do
  Rake::Task["remove_duplicate"].invoke
end

task :remove_duplicate => :environment do
  weights = Weight.count(:group => "DATE(date)", :having => "COUNT(id) > 1")
  weights_to_delete = []
  weights.each do |weight|

    start_date = weight[0].to_date.beginning_of_day
    end_date = weight[0].to_date.end_of_day
    day_weights = Weight.where("date >= ? and date <= ?", start_date, end_date).order(:date)
    day_weights[0..-2].each do |weight|
      weights_to_delete.push weight.id
    end
  end
  Weight.delete(weights_to_delete)
end

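Although this deletes the records as described, I'm not happy with the approach I've taken. Please guide me toward a better way to delete duplicate records for a given date, keeping only the latest one, using the ActiveRecord API.

Thanks, Amit Patel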

【Comments】:

    Tags: mysql ruby-on-rails-3 activerecord sqlite rake-task


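    【Solution 1】:

    This approach can be slow, since it runs a separate query for every row, so I don't recommend it unless you only run it periodically.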

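    Weight.all.each do |weight|
      # Match the whole calendar day (not the exact timestamp) and keep the newest row.
      day = weight.date.beginning_of_day..weight.date.end_of_day
      Weight.where(date: day).order("date desc").all.drop(1).each { |w| w.delete }
    end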
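
To see which rows the "keep the newest row per calendar day" rule selects, here is a rough plain-Ruby sketch that works on in-memory structs rather than ActiveRecord models. `Row` and `ids_to_delete` are hypothetical names introduced for illustration, and the sample rows are taken from the question's table.

```ruby
require "time"

# Hypothetical stand-in for a Weight row; this is not an ActiveRecord model.
Row = Struct.new(:id, :date)

# Returns the ids of every row that is not the latest one on its calendar day.
def ids_to_delete(rows)
  rows.group_by { |r| r.date.to_date }                # bucket rows by day
      .values
      .flat_map { |day| day.sort_by(&:date)[0..-2] }  # all but the latest per day
      .map(&:id)
      .sort
end

rows = [
  Row.new(9,  Time.parse("2012-04-30 00:30:00")),
  Row.new(10, Time.parse("2012-04-30 18:00:00")),
  Row.new(11, Time.parse("2012-04-29 00:00:00")),
  Row.new(12, Time.parse("2012-04-30 18:40:00")),
  Row.new(13, Time.parse("2012-05-05 00:00:00")),
  Row.new(14, Time.parse("2012-05-05 09:31:31"))
]

p ids_to_delete(rows)
```

With the question's data, id 13 is selected along with 9 and 10, because ids 13 and 14 also share a date.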
    

    【Comments】:

    • Slow and steady, but it gets the job done. For a one-off operation, I prefer clear, readable code over speed.
    【Solution 2】:

    Try this:

    latest_daily_weights = (Weight.maximum :date, :group => 'DATE(date)').values
    weights_table = Arel::Table.new(:weights)
    earlier_daily_weights = Weight.where(weights_table[:date].not_in latest_daily_weights)
    earlier_daily_weights.delete_all
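
The snippet above collects the maximum timestamp for each day and then deletes every row whose timestamp is not in that set. A minimal plain-Ruby sketch of that selection logic, using in-memory structs instead of the database, might look like this. `Entry` and `earlier_entries` are hypothetical names, not part of ActiveRecord or Arel.

```ruby
require "time"

# Hypothetical stand-in for a row of the weights table.
Entry = Struct.new(:id, :date)

# Returns the entries whose timestamp is not the maximum for any day,
# mirroring "WHERE date NOT IN (per-day maxima)".
def earlier_entries(entries)
  latest_per_day = entries.group_by { |e| e.date.to_date }
                          .values
                          .map { |day| day.map(&:date).max }
  entries.reject { |e| latest_per_day.include?(e.date) }
end

entries = [
  Entry.new(9,  Time.parse("2012-04-30 00:30:00")),
  Entry.new(10, Time.parse("2012-04-30 18:00:00")),
  Entry.new(12, Time.parse("2012-04-30 18:40:00")),
  Entry.new(13, Time.parse("2012-05-05 00:00:00")),
  Entry.new(14, Time.parse("2012-05-05 09:31:31"))
]

p earlier_entries(entries).map(&:id)
```

One consequence of comparing timestamps rather than ids: if two rows share exactly the same latest timestamp on a day, neither is selected, so both would survive.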
    

    Credit:

    How to exclude an array of ids from query in Rails (using ActiveRecord)?

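    【Comments】:

      【Solution 3】:

      You can try this SQL query, which deletes every record on a given date except the latest one for that date. Note that it compares ids, so it keeps the row with the highest id on each date.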

      DELETE FROM weights USING weights weight WHERE (CAST(weights.date as Date) = CAST(weight.date as Date) AND weights.id < weight.id);
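
Because the query's WHERE clause compares ids rather than timestamps, it is only equivalent to "keep the latest record" when ids increase with time. The following plain-Ruby sketch mimics the self-join predicate on in-memory data to show the difference. `Record` and `deleted_by_self_join` are hypothetical names, and the out-of-order rows are made up for illustration.

```ruby
require "time"

# Hypothetical stand-in for a row of the weights table.
Record = Struct.new(:id, :date)

# Mirrors the WHERE clause: a row is deleted when another row on the same
# calendar day has a larger id.
def deleted_by_self_join(rows)
  rows.select do |w|
    rows.any? { |other| w.date.to_date == other.date.to_date && w.id < other.id }
  end.map(&:id)
end

# Ids follow insertion time, as in the question's table.
in_order = [
  Record.new(9,  Time.parse("2012-04-30 00:30:00")),
  Record.new(10, Time.parse("2012-04-30 18:00:00")),
  Record.new(12, Time.parse("2012-04-30 18:40:00"))
]

# Made-up case: the lower id carries the later timestamp.
out_of_order = [
  Record.new(1, Time.parse("2012-04-30 18:00:00")),
  Record.new(2, Time.parse("2012-04-30 09:00:00"))
]

p deleted_by_self_join(in_order)
p deleted_by_self_join(out_of_order)
```

In the second case the row with the later timestamp is the one removed, so the query fits best when ids are known to grow with the date column.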
      

      【Comments】:
