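【Question title】: Query's result set is too big
【Posted】: 2011-05-05 15:39:59
【Question】:

My query can be fast or slow depending on how many records I'm fetching. Here is a table showing the number in my LIMIT clause and the corresponding time it takes to execute the query and fetch the results: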

LIMIT | Seconds (Duration/Fetch)
------+-------------------------
   10 |  0.030/  0.0
  100 |  0.062/  0.0
 1000 |  1.700/  0.8
10000 | 25.000/100.0

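As you can see, everything is fine up to at least 1,000 rows, but 10,000 is really slow, mostly because of the long fetch time. I don't understand why the fetch time doesn't grow linearly, but I am pulling 200+ columns from 70+ tables, so the fact that the result set takes a long time to fetch isn't a surprise.

What I'm fetching, by the way, is the data for all the accounts at a certain bank. The bank I'm working with has about 160,000 accounts, so I ultimately need to fetch 160,000 rows from the database.

Trying to fetch 160,000 rows at once is obviously not feasible (at least not unless I can somehow optimize my query dramatically). It seems to me that the largest chunk I can reasonably grab is 1,000 rows, so I wrote a script that runs the query over and over with SELECT INTO OUTFILE, LIMIT, and OFFSET. Then, at the end, I take all the CSV files I dumped and cat them together. It works, but it's slow. It takes hours. The script is running right now, and it has dumped only 43,000 rows in about an hour.

Should I attack this problem at the query optimization level, or does the long fetch time suggest I should focus my attention elsewhere? What would you recommend I do?

If you'd like to see the query, you can see it here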

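【Comments】:

  • Please show the EXPLAIN SELECT ... output.
  • Dude, you really want to remove the bank names from the fields in your SQL query.
  • You have three excellent answers to choose from. You never answered Karelzarath's key question: "Does the query truly have to return every one of the 215 fields, using 29 joins?" Why not come up with a reduced query of 30 fields, so that you can at least measure how performance degrades with each join you add? Also, what performance numbers did you get with the STRAIGHT_JOIN query DRapp gave you?
  • I don't know what the right answer is, and I no longer work on this project. Sorry.

Tags: mysql sql into-outfile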


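【Answer 1】:

The answer depends heavily on what you're doing with the data. Querying 215 columns through 29 joins will never be fast for a non-trivial number of records.

If you're trying to display 160,000 records to a user, you should paginate the results and fetch only one page at a time. That keeps the result set small enough that even a relatively inefficient query returns quickly. In that case you should also examine how much data the user needs in order to select or act on a record. Chances are you can whittle it down to a handful of fields and a few aggregates (count, sum, etc.) that let the user make an informed decision about which records they want to work with. Use LIMIT with an offset to pull a single page of arbitrary size.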
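
As a rough illustration of the paging idea, here is a minimal Python sketch. It uses an in-memory SQLite database purely as a stand-in for MySQL so that it runs anywhere; the simplified `account` table, its two columns, and the `fetch_pages` helper are assumptions made for the example, not part of the original query.

```python
import sqlite3


def fetch_pages(conn, page_size):
    """Yield lists of rows, one page at a time, using LIMIT ... OFFSET."""
    offset = 0
    while True:
        # A stable ORDER BY matters: without it, pages may overlap or skip rows.
        rows = conn.execute(
            "SELECT id, account_number FROM account "
            "ORDER BY id LIMIT ? OFFSET ?",
            (page_size, offset),
        ).fetchall()
        if not rows:
            break
        yield rows
        offset += page_size


# Hypothetical stand-in data: 25 accounts in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, account_number TEXT)")
conn.executemany(
    "INSERT INTO account (id, account_number) VALUES (?, ?)",
    [(i, f"ACC{i:05d}") for i in range(1, 26)],
)

pages = list(fetch_pages(conn, page_size=10))
print([len(page) for page in pages])  # [10, 10, 5]
```

With a large offset the server still has to read and discard every skipped row, so later pages get slower. If that becomes a problem, filtering on the last key seen (`WHERE id > ?`) is a common alternative to OFFSET.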

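If you need to export the data for reporting purposes, make sure you pull only the exact data the report needs. Eliminate joins wherever possible, and use subqueries where you need to summarize child data. You'll want to tune or add indexes for the joins and conditions you use often. For the query you provided, that means ib.id and the myriad foreign keys you're joining on. You can skip the boolean columns, since they don't have enough distinct values to form a meaningful index.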
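
To make the "subquery instead of join" advice concrete, here is a small Python sketch, again with in-memory SQLite standing in for MySQL. The `account` and `card` tables, their columns, and the index name are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (id INTEGER PRIMARY KEY, account_number TEXT);
    CREATE TABLE card (id INTEGER PRIMARY KEY, account_id INTEGER, number TEXT);
    -- Index the foreign key that is used to look up child rows.
    CREATE INDEX idx_card_account_id ON card (account_id);
    INSERT INTO account VALUES (1, 'A-1'), (2, 'A-2'), (3, 'A-3');
    INSERT INTO card VALUES (1, 1, 'C-1'), (2, 1, 'C-2'), (3, 2, 'C-3');
""")

# One row per account: the child table is summarized by a correlated
# subquery instead of being joined in (a join would repeat account 1 twice).
report = conn.execute("""
    SELECT a.id,
           a.account_number,
           (SELECT COUNT(*) FROM card c WHERE c.account_id = a.id) AS card_count
    FROM account a
    ORDER BY a.id
""").fetchall()
print(report)  # [(1, 'A-1', 2), (2, 'A-2', 1), (3, 'A-3', 0)]
```

The report keeps exactly one row per account, and the index on the foreign key lets each subquery lookup avoid scanning the whole child table.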

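Whatever you're trying to accomplish, removing some of the joins and columns will speed up your processing. The sheer amount of heavy lifting MySQL has to do to populate that query is your main stumbling block.

【Comments】: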

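    【Answer 2】:

    I've restructured your query in the hope of improving performance significantly. STRAIGHT_JOIN tells MySQL to join the tables in the order you list them (or, here, the order I've rearranged them into). The innermost query, aliased "PreQuery", starts from your criteria on the import bundle and generic import, then goes through account import to account. By applying the WHERE clause up front there (and, as you'll be testing, adding your LIMIT clause there too), you pre-join those tables and filter rows out before wasting any time fetching customer, address, and other details. In the outer query I've arranged the JOINs and LEFT JOINs to better show how the underlying linked tables relate (mainly for the benefit of other readers).

    As someone else pointed out, what I've done in the PreQuery could also serve as the basis for a master list of Account.ID records to step through and serve up page by page. I'd be interested to know how this performs against your existing query, especially in the 10,000-limit range.

    The PreQuery returns the unique elements (including the account ID, bank, month, year, and category used downstream), so the rest of the join process doesn't have to join those tables again.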

    SELECT STRAIGHT_JOIN
          PreQuery.*,
          customer.customer_number,
          customer.name,
          customer.has_bad_address,
          address.line1,
          address.line2,
          address.city,
          state.name,
          address.zip,
          po_box.line1,
          po_box.line2,
          po_box.city,
          po_state.name,
          po_box.zip,
          customer.date_of_birth,
          northway_account.cffna,
          northway_account.cfinsc,
          customer.deceased,
          customer.social_security_number,
          customer.has_internet_banking,
          customer.safe_deposit_box,
          account.has_bill_pay,
          account.has_e_statement,
          branch.number,
          northway_product.code,
          macatawa_product.code,
          account.account_number,
          account.available_line,
          view_macatawa_atm_card.number,
          view_macatawa_debit_card.number,
          uc.code use_class,
          account.open_date,
          account.balance,
          account.affinion,
          northway_account.ytdsc,
          northway_account.ytdodf,
          northway_account.ytdnsf,
          northway_account.rtckcy,
          northway_account.rtckwy,
          northway_account.odwvey,
          northway_account.ytdscw,
          northway_account.feeytd,
          customer.do_not_mail,
          northway_account.aledq1,
          northway_account.aledq2,
          northway_account.aledq3,
          northway_account.aledq4,
          northway_account.acolq1,
          northway_account.acolq2,
          northway_account.acolq3,
          northway_account.acolq4,
          o.officer_number,
          northway_account.avg_bal_1,
          northway_account.avg_bal_2,
          northway_account.avg_bal_3,
          account.maturity_date,
          account.interest_rate,
          northway_account.asslc,
          northway_account.paidlc,
          northway_account.lnuchg,
          northway_account.ytdlc,
          northway_account.extfee,
          northway_account.penamt,
          northway_account.cdytdwaive,
          northway_account.cdterm,
          northway_account.cdtcod,
          account.date_of_last_statement,
          northway_account.statement_cycle,
          northway_account.cfna1,
          northway_account.cfna2,
          northway_account.cfna3,
          northway_account.cfna4,
          northway_account.cfcity,
          northway_account.cfstate,
          northway_account.cfzip,
          northway_account.actype,
          northway_account.sccode,
          macatawa_account.account_type_code,
          macatawa_account.account_type_code_description,
          macatawa_account.advance_code,
          macatawa_account.amount_last_advance,
          macatawa_account.amount_last_payment,
          macatawa_account.available_credit,
          macatawa_account.balance_last_statement,
          macatawa_account.billing_day,
          macatawa_account.birthday_3,
          macatawa_account.birthday_name_2,
          macatawa_account.ceiling_rate,
          macatawa_account.class_code,
          macatawa_account.classified_doubtful,
          macatawa_account.classified_loss,
          macatawa_account.classified_special,
          macatawa_account.classified_substandard,
          macatawa_account.closed_account_flag,
          macatawa_account.closing_balance,
          macatawa_account.compounding_code,
          macatawa_account.cost_center_full,
          macatawa_account.cytd_aggregate_balance,
          macatawa_account.cytd_amount_of_advances,
          macatawa_account.cytd_amount_of_payments,
          macatawa_account.cytd_average_balance,
          macatawa_account.cytd_average_principal_balance,
          macatawa_account.cytd_interest_paid,
          macatawa_account.cytd_number_items_nsf,
          macatawa_account.cytd_number_of_advanes,
          macatawa_account.cytd_number_of_payments,
          macatawa_account.cytd_number_times_od,
          macatawa_account.cytd_other_charges,
          macatawa_account.cytd_other_charges_waived,
          macatawa_account.cytd_reporting_points,
          macatawa_account.cytd_service_charge,
          macatawa_account.cytd_service_charge_waived,
          macatawa_account.date_closed,
          macatawa_account.date_last_activity,
          macatawa_account.date_last_advance,
          macatawa_account.date_last_payment,
          macatawa_account.date_paid_off,
          macatawa_account.ddl_code,
          macatawa_account.deposit_rate_index,
          macatawa_account.employee_officer_director_full_desc,
          macatawa_account.floor_rate,
          macatawa_account.handling_code,
          macatawa_account.how_paid_code,
          macatawa_account.interest_frequency,
          macatawa_account.ira_plan,
          macatawa_account.load_rate_code,
          macatawa_account.loan_rate_code,
          macatawa_account.loan_rating_code,
          macatawa_account.loan_rating_code_1_full_desc,
          macatawa_account.loan_rating_code_2_full_desc,
          macatawa_account.loan_rating_code_3_full_desc,
          macatawa_account.loan_to_value_ratio,
          macatawa_account.maximum_credit,
          macatawa_account.miscellaneous_code_full_desc,
          macatawa_account.months_to_maturity,
          macatawa_account.msa_code,
          macatawa_account.mtd_agg_available_balance,
          macatawa_account.naics_code,
          macatawa_account.name_2,
          macatawa_account.name_3,
          macatawa_account.name_line,
          macatawa_account.name_line_2,
          macatawa_account.name_line_3,
          macatawa_account.name_line_1,
          macatawa_account.net_payoff,
          macatawa_account.opened_by_responsibility_code_full,
          macatawa_account.original_issue_date,
          macatawa_account.original_maturity_date,
          macatawa_account.original_note_amount,
          macatawa_account.original_note_date,
          macatawa_account.original_prepaid_fees,
          macatawa_account.participation_placed_code,
          macatawa_account.participation_priority_code,
          macatawa_account.pay_to_account,
          macatawa_account.payment_code,
          macatawa_account.payoff_principal_balance,
          macatawa_account.percent_participated_code,
          macatawa_account.pmtd_number_deposit_type_1,
          macatawa_account.pmtd_number_deposit_type_2,
          macatawa_account.pmtd_number_deposit_type_3,
          macatawa_account.pmtd_number_type_1,
          macatawa_account.pmtd_number_type_2,
          macatawa_account.pmtd_number_type_6,
          macatawa_account.pmtd_number_type_8,
          macatawa_account.pmtd_number_type_9,
          macatawa_account.principal,
          macatawa_account.purpose_code,
          macatawa_account.purpose_code_full_desc,
          macatawa_account.pytd_number_of_items_nsf,
          macatawa_account.pytd_number_of_times_od,
          macatawa_account.rate_adjuster,
          macatawa_account.rate_over_split,
          macatawa_account.rate_under_split,
          macatawa_account.renewal_code,
          macatawa_account.renewal_date,
          macatawa_account.responsibility_code_full,
          macatawa_account.secured_unsecured_code,
          macatawa_account.short_first_name_1,
          macatawa_account.short_first_name_2,
          macatawa_account.short_first_name_3,
          macatawa_account.short_last_name_1,
          macatawa_account.short_last_name_2,
          macatawa_account.short_last_name_3,
          macatawa_account.statement_cycle,
          macatawa_account.statement_rate,
          macatawa_account.status_code,
          macatawa_account.tax_id_number_name_2,
          macatawa_account.tax_id_number_name_3,
          macatawa_account.teller_alert_1,
          macatawa_account.teller_alert_2,
          macatawa_account.teller_alert_3,
          macatawa_account.term,
          macatawa_account.term_code,
          macatawa_account.times_past_due_01_29,
          macatawa_account.times_past_due_01_to_29_days,
          macatawa_account.times_past_due_30_59,
          macatawa_account.times_past_due_30_to_59_days,
          macatawa_account.times_past_due_60_89,
          macatawa_account.times_past_due_60_to_89_days,
          macatawa_account.times_past_due_over_90,
          macatawa_account.times_past_due_over_90_days,
          macatawa_account.tin_code_name_1,
          macatawa_account.tin_code_name,
          macatawa_account.tin_code_name_2,
          macatawa_account.tin_code_name_3,
          macatawa_account.total_amount_past_due,
          macatawa_account.waiver_od_charge,
          macatawa_account.waiver_od_charge_description,
          macatawa_account.waiver_service_charge_code,
          macatawa_account.waiver_transfer_advance_fee,
          macatawa_account.short_first_name,
          macatawa_account.short_last_name          
    FROM
       ( SELECT STRAIGHT_JOIN DISTINCT
             b.name bank,
             ib.year,
             ib.month,
             ip.category,
             account.id
             FROM import_bundle ib
                JOIN generic_import gi ON ib.id = gi.import_bundle_id
                   JOIN account_import ai ON gi.id = ai.generic_import_id
                      JOIN account ON ai.id = account.account_import_id
                   JOIN import_profile ip ON gi.import_profile_id = ip.id
                JOIN bank b ON ib.bank_id = b.id
          WHERE
                ib.id = 95
             AND ib.active = 1
             AND gi.active = 1
          -- Test your LIMIT here; if you page through chunks, pair it with ORDER BY account.id
          LIMIT 1000 ) PreQuery
       -- Table names and aliases are written in one case throughout,
       -- since they can be case-sensitive depending on the server platform.
       JOIN account ON PreQuery.id = account.id
          JOIN customer ON account.customer_id = customer.id
          JOIN officer o ON account.officer_id = o.id
          LEFT JOIN branch ON account.branch_id = branch.id
          LEFT JOIN cd_type ON account.cd_type_id = cd_type.id
          LEFT JOIN use_class uc ON account.use_class_id = uc.id
          LEFT JOIN account_type at ON account.account_type_id = at.id
          LEFT JOIN northway_account ON account.id = northway_account.account_id
          LEFT JOIN macatawa_account ON account.id = macatawa_account.account_id
          LEFT JOIN view_macatawa_debit_card ON account.id = view_macatawa_debit_card.account_id
          LEFT JOIN view_macatawa_atm_card ON account.id = view_macatawa_atm_card.account_id
          LEFT JOIN original_address oa ON account.id = oa.account_id
    
          JOIN account_address aa ON account.id = aa.account_id
             JOIN address ON aa.address_id = address.id
                JOIN state ON address.state_id = state.id
    
          LEFT JOIN account_po_box apb ON account.id = apb.account_id
             LEFT JOIN address po_box ON apb.address_id = po_box.id
                LEFT JOIN state po_state ON po_box.state_id = po_state.id
    
          LEFT JOIN account_macatawa_product amp ON account.id = amp.account_id
             LEFT JOIN macatawa_product ON amp.macatawa_product_id = macatawa_product.id
                LEFT JOIN product_type pt ON macatawa_product.product_type_id = pt.id
                LEFT JOIN harte_hanks_service_category hhsc ON macatawa_product.harte_hanks_service_category_id = hhsc.id
                LEFT JOIN core_file_type cft ON macatawa_product.core_file_type_id = cft.id
    
          LEFT JOIN account_northway_product anp ON account.id = anp.account_id
             LEFT JOIN northway_product ON anp.northway_product_id = northway_product.id
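
The pattern in the query above (pick the limited set of account IDs first, then join the wide tables only for those IDs) can be tried on a toy schema. The following Python sketch uses in-memory SQLite as a stand-in for MySQL, so it has no STRAIGHT_JOIN hint. The three tiny tables, the `import_bundle_id` column on `account`, and the `fetch_chunk` helper are simplifying assumptions for illustration, not the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE branch (id INTEGER PRIMARY KEY, number TEXT);
    CREATE TABLE account (
        id INTEGER PRIMARY KEY,
        import_bundle_id INTEGER,
        customer_id INTEGER,
        branch_id INTEGER
    );
    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO branch VALUES (1, '001');
    INSERT INTO account VALUES
        (1, 95, 1, 1), (2, 96, 2, 1), (3, 95, 2, NULL),
        (4, 95, 3, 1), (5, 95, 1, NULL), (6, 96, 3, 1);
""")


def fetch_chunk(conn, bundle_id, limit):
    """Filter and limit the account IDs first, then join the detail tables."""
    return conn.execute("""
        SELECT pre.id, customer.name, branch.number
        FROM (SELECT id
              FROM account
              WHERE import_bundle_id = ?
              ORDER BY id
              LIMIT ?) AS pre
        JOIN account ON account.id = pre.id
        JOIN customer ON customer.id = account.customer_id
        LEFT JOIN branch ON branch.id = account.branch_id
        ORDER BY pre.id
    """, (bundle_id, limit)).fetchall()


chunk = fetch_chunk(conn, bundle_id=95, limit=3)
print(chunk)  # [(1, 'Ann', '001'), (3, 'Bob', None), (4, 'Cy', '001')]
```

Only the three selected IDs ever reach the detail joins, which is the effect the PreQuery is meant to have on the 29-join query.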
    

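    【Comments】:

      【Answer 3】:

      The non-linear increase in fetch time is probably the result of the key buffer filling up, and possibly of other memory-related issues. You should use EXPLAIN to optimize the query so that it makes maximum use of indexes, and tune your MySQL server settings.

      【Comments】:

      • OK, so maybe I should configure MySQL to use as much memory as possible, so that I can fetch a large number of rows in a reasonable time?
      • Tuning MySQL is a tricky subject. I would try adjusting the various relevant settings and record your findings as you go, until you reach a satisfactory level of improvement.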
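
As a loose illustration of checking index use with a query plan, here is a Python sketch. Note the swap: it runs SQLite's `EXPLAIN QUERY PLAN` on an in-memory database, whose output format differs from MySQL's `EXPLAIN`, so treat it only as a demonstration of the before/after workflow. The simplified `account` table, the index name, and the `plan` helper are assumptions made for the example.

```python
import sqlite3


def plan(conn, sql, params=()):
    """Return the human-readable detail text of each query plan row."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]  # the fourth column holds the detail


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, officer_id INTEGER, balance REAL)"
)
query = "SELECT balance FROM account WHERE officer_id = ?"

# Without an index on officer_id the plan is a full table scan.
before = plan(conn, query, (7,))

# After indexing the filtered column, the plan should name the index.
conn.execute("CREATE INDEX idx_account_officer_id ON account (officer_id)")
after = plan(conn, query, (7,))

print(before)
print(after)
```

The same loop applies to MySQL: run EXPLAIN, add or adjust an index for a join or filter column, and run EXPLAIN again to confirm the plan changed.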