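【Question title】: Optimise ORDER BY in MySQL query
【Posted】: 2018-03-06 04:25:47
【Question description】:

I have the following query, which uses ORDER BY with a LIMIT. Fetching about 16k rows takes 2 minutes 25 seconds. I have proper indexes in place, but it still runs slowly. It takes the same time even when only LIMIT 20 is applied. With ORDER BY removed, the query fetches the same data in 17 seconds. All tables use the latin1 character set. Please suggest any possible solution.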

SELECT 
a.customer,
a.division AS division,
a.noitaziraa_id AS noitaziraaId,
DATE_FORMAT(a.request_date, '%m/%d/%Y') AS RequestDate,
a.request_date AS RequestDateSort,
DATE_FORMAT(noita.date_of_birth, '%m/%d/%Y') AS dob,
noita.date_of_birth AS dobSort,
IF(
a.noita_type = 'Noita Stay',
a.length_of_stay,
NULL
) AS requestedDays,
IF(
a.noita_type = 'Noita Stay',
CONCAT_WS(
  ',',
  a.facility_provider_city,
  a.facility_provider_state
),
''
) AS facilityCityState,
IF(
a.noita_type = 'Noita Stay',
IFNULL(
  DATE_FORMAT(aips.admission_date, '%m/%d/%Y'),
  ''
),
''
) AS admitDate,
IF(
a.noita_type = 'Noita Stay',
aips.admission_date,
''
) AS admitDateSort,
IF(
a.noita_type = 'Noita Stay',
IFNULL(
  DATE_FORMAT(
    aipsd.discharge_date,
    '%m/%d/%Y'
  ),
  ''
),
''
 ) AS dischargeDate,
  IF(
a.noita_type = 'Noita Stay',
aipsd.discharge_date,
''
 ) AS dischargeDateSort,
  IF(
a.noita_type = 'Noita Stay',
IFNULL(dl1.`description`, ''),
''
 ) AS dischargeDisposition,
 a.gender,
 a.age,
  a.relationship AS relationship,
  noita.groupid,
  a.request_type AS requestType,
  a.prog_status AS programStatus,
 dl.description AS billingDetails,
 a.referred_to_npi AS NPI,
 a.program AS program,
 CASE
  WHEN a.status = 'OPEN' 
  THEN DATEDIFF(NOW(), a.auth_request_date) 
  ELSE 0 
  END AS 'daysSinceRequest',
  a.first_name AS firstName,
 a.last_name AS lastName,
dl2.description AS levelOfUrgency,
 a.member_id AS memberId,
a.created_full_name AS createdFullName,
CONCAT_WS(
',',
COALESCE(a.assigned_to, NULL),
COALESCE(
  a.auth_review_assigned_user_name,
  NULL
),
COALESCE(
  a.auth_con_review_assigned_user_name,
  NULL
),
COALESCE(a.assigned_queue, NULL),
COALESCE(
  a.auth_review_assigned_queue_name,
  NULL
 ),
 COALESCE(
  a.auth_con_review_assigned_queue_name,
  NULL
 )
 ) AS assignedTo,
a.status,
DATE_FORMAT(a.opened_date, '%m/%d/%Y') AS openDate,
 a.opened_date AS openDateSort,
 DATE_FORMAT(a.closed_date, '%m/%d/%Y') AS closedDate,
 a.closed_date AS closedDateSort,
 a.noita_type AS authType,
 a.facility_provider AS facilityProvider,
 a.length_of_stay AS lengthOfStay,
 DATE_FORMAT(a.requested_from, '%m/%d/%Y') AS authFromDate,
 a.requested_from AS authFromDateSort,
 DATE_FORMAT(a.requested_through, '%m/%d/%Y') AS authToDate,
 a.requested_through AS authToDateSort,
 a.pended,
  a.diagnosis AS diagnosis,
 a.diagnosis_desc AS diagDesc,
 a.auth,
a.denied,
a.excluded,
a.admit_type AS admitType,
a.service_type AS serviceType,
a.proc,
a.proc_desc AS procDesc,
a.plan 
FROM
main_table a 
INNER JOIN noitaciary noita 
ON noita.id = a.noitaciary_id 
INNER JOIN usermanagement.`user` usr 
ON a.created_by = usr.id 
AND 
CASE
  WHEN CONCAT(usr.firstname, ' ', usr.lastname) IN ('a', 'b *', 'c', 
   'd', 'd', 'f') 
  THEN 1 = 1 
  ELSE (
    COALESCE(usr.`employer`, '') NOT IN ('r', 's')
  ) 
  END 
    LEFT JOIN noitaziraa_ips AS aips 
    ON aips.noitaziraa_id = a.auth_id 
  LEFT JOIN db1.`noitaziraa_history` ah 
   ON ah.noitaziraa_id = a.noitaziraa_id 
 LEFT JOIN noitaziraa_ips_discharge AS aipsd 
  ON aipsd.noitaziraa_ips_id = aips.id 
 LEFT JOIN noitaziraa_phr AS aphr 
  ON aphr.noitaziraa_id = a.auth_id 
  LEFT JOIN noitaziraa_sp AS asp 
  ON asp.noitaziraa_id = a.auth_id 
  LEFT JOIN noitaziraa_decisions AS auth_dec 
 ON a.auth_id = auth_dec.noitaziraa_id 
 LEFT JOIN mytable AS aa 
 ON a.noitaziraa_id = aa.noitaziraa_id 
LEFT JOIN db1.dw_lookup dl 
 ON auth_dec.details = dl.code 
LEFT JOIN db1.`dw_lookup` dl1 
ON dl1.`code` = aipsd.`discharge_diposition` 
 AND dl1.`data_type` = 'dataTypeName' 
 LEFT JOIN db1.dw_lookup dl2 
 ON aa.level_of_urgency = dl2.code 
 AND dl2.data_type = 'dataTypeName1' 
LEFT JOIN 
    (SELECT 
     * 
   FROM
  (SELECT 
    hh.noitaziraa_id,
    hh.`status` 
  FROM
    db1.`noitaziraa_history` hh,
    main_table a 
  WHERE hh.noitaziraa_id = a.noitaziraa_id 
    AND hh.client = 'certainValue' 
    AND DATE(hh.last_updated) < '2017-12-01 00:00:00' 
  GROUP BY hh.`last_updated` 
  ORDER BY hh.last_updated DESC) tmp 
GROUP BY noitaziraa_id) AS tps 
ON tps.noitaziraa_id = a.noitaziraa_id 
  WHERE a.customer LIKE 'certainValue%' 
   AND a.status <> 'VOID' 

  AND DATE(auth_dec.requested_through) >= '2017-12-01 00:00:00' 
AND DATE(auth_dec.requested_through) <= '2017-12-05 00:00:00' 
AND DATE(a.opened_date) <= '2017-12-05 00:00:00' 
 AND (
 (
    DATE(ah.last_updated) BETWEEN '2017-12-01 00:00:00' 
  AND '2017-12-05 00:00:00' 
   AND ah.status IN (
    'OPEN',
    'CLOSED',
    'REOPENED',
    'CANCELED'
     )
    ) || (
    tps.noitaziraa_id = a.noitaziraa_id 
     AND tps.status IN (
    'OPEN',
    'CLOSED',
    'REOPENED',
    'CANCELED'
  )
   )
  ) 
  GROUP BY a.auth_id 
  ORDER BY groupid ASC 
  LIMIT 0, 20 

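The noitaziraa_history table contains a large number of rows, and the join to it has to stay to meet my requirements; that join is what takes most of the time.

Running EXPLAIN gives the following result:

【Comments】: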

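  • Just curious: what is COALESCE(x, NULL) supposed to do?
  • Also, you appear to have a GROUP BY clause with no associated aggregate functions. That won't hurt performance, but it calls the correctness of any result into question.
  • @Strawberry COALESCE(x, NULL) should return NULL if x is NULL, otherwise the value of x. You can try SELECT COALESCE(NULL, 21); and SELECT COALESCE(1, NULL);
  • @Strawberry I append a HAVING clause to the query in my Java code. :)
  • A bare x also returns NULL if x is NULL (and the value of x otherwise). In other words, your COALESCE does nothing.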
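
The point of the last comment can be checked directly. The sketch below uses literal values only (no tables from the question); it shows that COALESCE(x, NULL) hands back x unchanged, and that CONCAT_WS() already skips NULL arguments after the separator, so the assignedTo expression would presumably behave the same with the bare columns.

```sql
-- COALESCE(x, NULL) returns x unchanged, whether x is NULL or not.
SELECT  COALESCE(NULL, NULL)  AS null_in,    -- NULL
        COALESCE('abc', NULL) AS value_in;   -- 'abc'

-- CONCAT_WS() skips NULL arguments that follow the separator,
-- so no wrapper is needed around nullable columns.
SELECT  CONCAT_WS(',', 'u1', NULL, 'q1') AS assignedTo;   -- 'u1,q1'
```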

Tags: mysql select sql-order-by


【Solution 1】:

This needs to be tackled step by step.

        SELECT  *
            FROM  
            (
                SELECT  hh.noitaziraa_id, hh.`status`
                    FROM  db1.`noitaziraa_history` hh, main_table a
                    WHERE  hh.noitaziraa_id = a.noitaziraa_id
                      AND  hh.client = 'certainValue'
                      AND  DATE(hh.last_updated) <  '2017-12-01 00:00:00'
                    GROUP BY  hh.`last_updated`
                    ORDER BY  hh.last_updated DESC
            ) tmp
            GROUP BY  noitaziraa_id

The inner ORDER BY will be ignored; get rid of it. Then ask whether the two-level GROUP BY really makes sense.
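
If the intent of that derived table is "the most recent status per noitaziraa_id before the cutoff" (an assumption; the question does not say so), one way to get it without depending on an ORDER BY inside a subquery is a groupwise-max join. The following is a sketch under that assumption, not the poster's query; the alias `latest` is made up for illustration.

```sql
-- Assumption: the goal is the latest status per noitaziraa_id among
-- history rows for this client dated before 2017-12-01.
SELECT  h.noitaziraa_id, h.`status`
    FROM  db1.`noitaziraa_history` h
    JOIN
    (
        -- One row per noitaziraa_id: its newest qualifying timestamp.
        SELECT  noitaziraa_id, MAX(last_updated) AS last_updated
            FROM  db1.`noitaziraa_history`
            WHERE  client = 'certainValue'
              AND  last_updated < '2017-12-01'
            GROUP BY  noitaziraa_id
    ) latest
      ON  latest.noitaziraa_id = h.noitaziraa_id
     AND  latest.last_updated  = h.last_updated
    WHERE  h.client = 'certainValue';
```

Two rows sharing the same newest last_updated for one noitaziraa_id would both come back, so a tie-breaker may be needed. The join to main_table is left out here because the outer query already matches on tps.noitaziraa_id = a.noitaziraa_id.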

AND  DATE(hh.last_updated) <  '2017-12-01 00:00:00'

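Change it to

AND hh.last_updated < '2017-12-01'

Reason: hiding a potentially indexed column inside a function (DATE) makes the index unusable.

Then add this composite index to hh:

INDEX(client, noitaziraa_id, last_updated, status) 

Meanwhile, you may have a serious bug: why is main_table a specified both in this subquery and in the outer query? Is that a mistake?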
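
As a concrete statement, the index could be added as sketched below. The index name is hypothetical; the column list is the one given above. Since the subquery touches only these four columns of the history table, the index should also be able to serve it as a covering index.

```sql
-- Index name is made up for illustration; the column order follows the answer.
ALTER TABLE db1.`noitaziraa_history`
    ADD INDEX idx_client_auth_updated_status
        (client, noitaziraa_id, last_updated, status);
```

On a table with a large number of rows the ALTER may take a while, so it is probably best run outside peak hours.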

      AND DATE(auth_dec.requested_through) >= '2017-12-01 00:00:00'
      AND DATE(auth_dec.requested_through) <= '2017-12-05 00:00:00'

-->

      AND auth_dec.requested_through >= '2017-12-01'
      AND auth_dec.requested_through  < '2017-12-01' + INTERVAL 5 DAY
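
Whether the rewritten range helps can be checked with EXPLAIN. The sketch below assumes an index on noitaziraa_decisions.requested_through, which the question does not show; without one, neither form can use a range scan on that column.

```sql
-- Assumption: an index exists on noitaziraa_decisions(requested_through).
-- Half-open range on the bare column: eligible for an index range scan.
EXPLAIN
SELECT  COUNT(*)
    FROM  noitaziraa_decisions AS auth_dec
    WHERE  auth_dec.requested_through >= '2017-12-01'
      AND  auth_dec.requested_through  < '2017-12-01' + INTERVAL 5 DAY;

-- Original form for comparison: DATE() hides the column from the optimizer.
EXPLAIN
SELECT  COUNT(*)
    FROM  noitaziraa_decisions AS auth_dec
    WHERE  DATE(auth_dec.requested_through) >= '2017-12-01 00:00:00'
      AND  DATE(auth_dec.requested_through) <= '2017-12-05 00:00:00';
```

Comparing the `type`, `key` and `rows` columns of the two plans should show the difference.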

The following two joins are not used, so get rid of them. This may require more effort in the code that builds the query. (Or is it handwritten?)

    LEFT JOIN  noitaziraa_phr AS aphr  ON aphr.noitaziraa_id = a.auth_id
    LEFT JOIN  noitaziraa_sp AS asp  ON asp.noitaziraa_id = a.auth_id

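LEFT JOIN: don't use it unless you need it. You don't need some of them, as can be seen from the references to auth_dec in the WHERE clause, which already discard the rows where auth_dec is missing.

dl, dl1, dl2: these sit at the end of the chain of LEFT JOINs. Remove them, and remove the references to their columns. Then add an extra SELECT layer on the outside that reaches into them after the ORDER BY and LIMIT have been applied. That cuts the number of lookups into them from "lots" to just 20.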
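
A sketch of that deferred-lookup shape follows. It is abbreviated: only a few of the original columns, joins and filters are shown, and the aliases x, details_code, discharge_code and urgency_code are made up for illustration. The inner query picks the 20 rows and carries the lookup codes; the outer query resolves the codes against dw_lookup.

```sql
SELECT  x.*,
        dl.description  AS billingDetails,
        dl2.description AS levelOfUrgency,
        IF(x.authType = 'Noita Stay',
           IFNULL(dl1.`description`, ''), '') AS dischargeDisposition
    FROM
    (
        SELECT  a.auth_id,
                a.noita_type               AS authType,
                noita.groupid,
                -- lookup codes carried out for the outer joins
                auth_dec.details           AS details_code,
                aipsd.discharge_diposition AS discharge_code,
                aa.level_of_urgency        AS urgency_code
            FROM  main_table a
            INNER JOIN  noitaciary noita
                    ON  noita.id = a.noitaciary_id
            LEFT JOIN  noitaziraa_ips AS aips
                    ON  aips.noitaziraa_id = a.auth_id
            LEFT JOIN  noitaziraa_ips_discharge AS aipsd
                    ON  aipsd.noitaziraa_ips_id = aips.id
            LEFT JOIN  noitaziraa_decisions AS auth_dec
                    ON  a.auth_id = auth_dec.noitaziraa_id
            LEFT JOIN  mytable AS aa
                    ON  a.noitaziraa_id = aa.noitaziraa_id
            WHERE  a.customer LIKE 'certainValue%'
              AND  a.status <> 'VOID'
              -- the remaining joins, columns and filters of the original go here
            GROUP BY  a.auth_id
            ORDER BY  noita.groupid ASC
            LIMIT  0, 20
    ) x
    LEFT JOIN  db1.dw_lookup dl
            ON  dl.code = x.details_code
    LEFT JOIN  db1.dw_lookup dl1
            ON  dl1.code = x.discharge_code
           AND  dl1.data_type = 'dataTypeName'
    LEFT JOIN  db1.dw_lookup dl2
            ON  dl2.code = x.urgency_code
           AND  dl2.data_type = 'dataTypeName1'
    ORDER BY  x.groupid ASC;
```

The outer ORDER BY is repeated because the row order of a derived table is not guaranteed to survive the outer joins. Like the original, the inner query selects non-aggregated columns alongside GROUP BY a.auth_id, so it relies on ONLY_FULL_GROUP_BY being disabled.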

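EXPLAIN shows a table named caseload; the query has no such table. Please fix that.

And fix the AND AND typo.

I'll stop here for now.

【Comments】:

  • @BenHuman - I see that the AND AND has been fixed; have you tried the rest?
  • Yes, that was a typo left over from removing an AND condition in between, which could not be posted here because of PHI concerns. As for "caseload", I had changed the table names before posting the query, which is why it doesn't match. I also tried adding the composite index, and the result is still the same. :( Still, thank you very much for the effort you put into this. I really appreciate it.
【Solution 2】: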

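According to the question, you have already applied proper indexes; I'll assume that is right. Then, avoid using LEFT JOIN with the noitaziraa_history table, as @Rick James also mentioned. If possible, make sure data is loaded into this table so that every noitaziraa_id in the main table also exists in the history table. You can then use INNER JOIN instead of the LEFT JOIN you are using now and see the result. Also, as Rick James suggested, refactor everything that currently looks out of place or useless. If a row in the main table has several matching rows in the history table, I am sure INNER JOIN will cut the time the LEFT JOIN takes. One more thing: if you can, apply filter conditions in the JOIN with the noitaziraa_history table itself, as you did in the subquery, like this: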

    INNER JOIN db1.`noitaziraa_history` ah 
    ON ah.noitaziraa_id = a.noitaziraa_id AND  ah.client = 'certainValue' AND  DATE(ah.last_updated) <  '2017-12-01 00:00:00'
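
Before swapping the LEFT JOIN for an INNER JOIN, it may be worth confirming that no main_table row lacks a history row, since an INNER JOIN would silently drop such rows. A minimal check, using only the tables and join column from the question:

```sql
-- Counts main_table rows that have no matching history row.
-- A result of 0 suggests INNER JOIN returns the same rows as LEFT JOIN here.
SELECT  COUNT(*) AS rows_without_history
    FROM  main_table a
    LEFT JOIN  db1.`noitaziraa_history` ah
           ON  ah.noitaziraa_id = a.noitaziraa_id
    WHERE  ah.noitaziraa_id IS NULL;
```

A non-zero count points to the rows that would need history data backfilled first.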

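If this works for you, please post an update :) Thanks!

【Comments】:

  • Oh! I hadn't thought of that. I hope it works; I'll post an update if it does. :) @Rama
  • Thank you very much!! I patched the history table with the missing past data and applied the INNER JOIN. I also switched the join to mytable to an INNER JOIN, and the query now returns the 16k results in just 20 seconds, which is still slow but a big improvement. Thanks again. :) Thanks again to @Rick James as well.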