【Question title】: Get "rank" of badge on SO for my user - query is slow - speedup possible?
【Posted】: 2019-11-03 05:12:18
【Question】:

I was curious how many people earned a badge before I did. I was able to get this information:

python    2019-01-02 09:09:15   Gold    454

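using this (slow-running) query:

(I could not sign in to the Data Explorer with my main account via single/cross sign-on, so the query was run anonymously.)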

-- insert your user id here:
declare @uid int = 7505395

-- get all badges of all users
select Name, Date, [Gold/Silver/Else], [Row#] from ( 
  SELECT Name, 
         Date, 
         userId,
         case when class = 1 then 'Gold'
              when class = 2 then 'Silver'
              when class = 3 then 'Bronze'
              else convert(varchar(10), class)
              end as 'Gold/Silver/Else',
              ROW_NUMBER() OVER(PARTITION BY name, class ORDER BY date ASC) AS Row# 
  FROM badges
  WHERE 1 = 1
    -- you can restrict this further, f.e. for looking only by gold badges
    -- and Class = 1  -- gold == 1, silver == 2, bronze == 3
    -- -- or for certain named badges
    -- and name like 'python%' 
) as tmp
where userID = @uid 
ORDER by name asc, Date asc

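(As written, the query returns all of my badges and how many people earned each one before me, so it has to sort every possible badge.)

Question:

I tried CTEs (only errors, nothing worked) and my SQL skills are rusty. How can I speed up this query?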

【Comments】:

    Tags: sql performance badge dataexplorer


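    【Solution 1】:

    The problem is that the table does not seem to have an index that is useful for this. The execution plan shows index scans, which are not optimal; we want index seeks.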
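
    For illustration only: the Data Explorer is read-only, so you cannot add indexes there. On your own copy of the data, an index along these lines would give the rank lookup something to seek on. The index name and the choice of columns are assumptions, not part of the actual schema.

    ```sql
    -- Hypothetical index for a local copy of the Badges table (not possible on the Data Explorer).
    -- Key columns follow the rank predicate: equality on Name and Class, range on Id.
    CREATE NONCLUSTERED INDEX IX_Badges_Name_Class_Id
        ON Badges (Name, Class, Id)
        INCLUDE (UserId, Date);
    ```

    With Name and Class leading the key, each per-badge count can seek to one (Name, Class) range and stop at the user's Id rather than scanning the whole table.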

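    However, you can cut the time almost in half by:

    1. Preselecting the user's badges.
    2. Using a correlated subquery for the rank.
    3. Using Id as a proxy for Date. (Ids are unique, ever-increasing, and usually faster to sort on.)

    Also note:

    1. The use of the magic ##UserId:INT## parameter.
    2. The Class column only has 3 values.
    3. You can shave a few more seconds off the query by omitting the ORDER BY clause.

    Anyway, this query performs better: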

    WITH zUsersBadges AS (
        SELECT  b.Id
                , b.UserId
                , b.Name
                , b.Date
                , b.Class
                , [Badge Class] = (
                    CASE    WHEN b.Class = 1 THEN 'Gold'
                            WHEN b.Class = 2 THEN 'Silver'
                            WHEN b.Class = 3 THEN 'Bronze'
                    END
                )
                , [Is tag badge] = IIF (b.TagBased = 1, 'Yes', 'No')
        FROM    Badges b
        WHERE   b.UserId = ##UserId:INT##
    )
    SELECT      ub.Name                 AS [Badge Name]
                , ub.[Badge Class]
                , ub.[Is tag badge]
                , ub.Date               AS [Date Earned]
                , [In Top N of earners] = (
                    SELECT  COUNT (ob.ID)
                    FROM    Badges ob
                    WHERE   (ob.Name = ub.Name  AND  ob.Class = ub.Class  AND  ob.Id <= Ub.Id)  -- Faster but may give slightly higher rank
                    --WHERE   (ob.Name = ub.Name  AND  ob.Class = ub.Class  AND  ob.Date <= Ub.Date)  -- Slower, but gives exact rank.
                )
    FROM        zUsersBadges ub
    ORDER BY    ub.Name, ub.Date
    

    Update: This query performs even better, because it aggregates badges that were earned more than once:

    WITH zUsersBadges AS (
        SELECT      b.UserId
                    , b.Name
                    , minId = MIN (b.Id)
                    , [First Earned] = MIN (b.Date)
                    , [Earned N times] = COUNT (b.Date)
                    , b.Class
                    , [Badge Class] = (
                        CASE    WHEN b.Class = 1 THEN 'Gold'
                                WHEN b.Class = 2 THEN 'Silver'
                                WHEN b.Class = 3 THEN 'Bronze'
                        END
                    )
                    , [Is tag badge] = IIF (b.TagBased = 1, 'Yes', 'No')
        FROM        Badges b
        WHERE       b.UserId = ##UserId:INT##
        GROUP BY    b.UserId, b.Class, b.Name, b.TagBased
    )
    SELECT      ub.Name                 AS [Badge Name]
                , ub.[Badge Class]
                , ub.[Is tag badge]
                , ub.[First Earned]
                , ub.[Earned N times]
                , [In Top N of earners] = (
                    SELECT  COUNT (ob.ID)
                    FROM    Badges ob
                    WHERE   (ob.Class = ub.Class  AND  ob.Id <= Ub.minId  AND  ob.Name = ub.Name)  -- Faster but may give slightly higher rank
                    --WHERE   (ob.Class = ub.Class  AND  ob.Date <= Ub.[First Earned]  AND  ob.Name = ub.Name)  -- Slower, but gives exact rank.
                )
    FROM        zUsersBadges ub
    ORDER BY    ub.Name, ub.[First Earned]
    

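    【Comments】:

    • Wow, thank you for the answer. The magic parameter is new to me; this is my first "real" data query on SO. You even got the CTE working... mine did not.
    • @PatrickArtner, you're welcome. Don't get hung up on CTEs; they are just a tool, and one that is sometimes overused. There is always more than one way to solve a problem.
    • @PatrickArtner, see the updated answer. You can get even better performance by aggregating the repeated badges.
    【Solution 2】: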

    You can use aggregation with filtering:

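    -- @uid is declared as in the question; class: 1 = gold, 2 = silver, 3 = bronze
    select count(*)
    from badges b
    where b.name = 'python' and b.class = 2 and
          -- min() keeps the subquery scalar even if the badge was earned more than once
          b.date < (select min(b2.date)
                    from badges b2
                    where b2.name = 'python' and b2.class = 2 and
                          b2.userID = @uid 
                   );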
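
    To make the result line up with the Row# from the question, a minimal sketch is to add 1 to the count of strictly earlier earners. The variables @badge and @class are introduced here for illustration only, and nvarchar(50) is an assumed type for Badges.Name.

    ```sql
    -- Illustrative values; adjust them to the badge you care about.
    declare @uid   int          = 7505395;
    declare @badge nvarchar(50) = N'python';   -- assumed column type
    declare @class int          = 1;           -- 1 = gold, 2 = silver, 3 = bronze

    -- Earners strictly before the user's first award, plus one, gives a 1-based rank.
    select count(*) + 1 as [Rank]
    from badges b
    where b.name = @badge and b.class = @class and
          b.date < (select min(b2.date)
                    from badges b2
                    where b2.name = @badge and b2.class = @class and
                          b2.userId = @uid);
    ```

    If the user has not earned the badge, the subquery yields NULL, the comparison matches no rows, and the query reports 1, so check that case separately. When several users share the same Date, this behaves like RANK() rather than ROW_NUMBER().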
    

    【Comments】:
