【Question title】: Finding the most frequent value in SQL Server 2012
【Posted】: 2015-02-08 12:11:43
【Question】:

I want to find the product that each customer buys most often. My data set looks like this:

CustomerID     ProdID    FavouriteProduct
    1              A              ?
    1              A              ?
    1              A              ?
    1              B              ?
    1              A              ?
    1              A              ?
    1              A              ?
    1              B              ?
    2              A              ?
    2              AN             ?
    2              G              ?
    2              C              ?
    2              C              ?
    2              F              ?
    2              D              ?
    2              C              ?

There are too many products for me to put them in a pivot table.

The desired result looks like this:

CustomerID     ProdID    FavouriteProduct
    1              A              A
    1              A              A
    1              A              A
    1              B              A
    1              A              A
    1              A              A
    1              A              A
    1              B              A
    2              A              C
    2              AN             C
    2              G              C
    2              C              C
    2              C              C
    2              F              C
    2              D              C
    2              C              C

The query might look something like this:

Update table
set FavouriteProduct = (Select 
                            CustomerID, Product, Max(Count(Product)) 
                        From Table 
                        group  by CustomerID, Product) FP    

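【Comments】:

  • Pivot has nothing to do with this. First work out a query that returns each customer's favourite product. You're nearly there. Then we can help with the update.
  • @Nick.McDermaid - I know, I was only saying that if there were just three or four products, we could easily find the favourite product with a pivot table. But what now?
  • Go to the bottom of this page sql-server-performance.com/2006/find-frequent-values and see whether you can adapt the SQL to return a list of all customers and their favourite products.
  • Thank you! @Nick.McDermaid
  • Has your problem been solved? Do you need further help?

Tags: sql sql-server sql-server-2012 pivot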


【Solution 1】:

Another way to get the most frequent product is to use row_number():

select customerid, productid,
       max(case when seqnum = 1 then productid end) over (partition by customerid) as favoriteproductid
from (select customerid, productid, count(*) as cnt,
             row_number() over (partition by customerid order by count(*) desc) as seqnum
      from customer c
      group by customerid, productid
     ) cp;
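
If only one row per customer is needed, the same idea can be filtered on seqnum = 1. The following is a minimal sketch, not part of the original answer: the table variable @Purchases and the extra ProdID tie-breaker in the ORDER BY are assumptions added for illustration, and the rows are copied from the question.

```sql
-- Hypothetical table variable holding the sample rows from the question.
DECLARE @Purchases TABLE (CustomerID int, ProdID varchar(10));

INSERT INTO @Purchases (CustomerID, ProdID) VALUES
    (1, 'A'), (1, 'A'), (1, 'A'), (1, 'B'), (1, 'A'), (1, 'A'), (1, 'A'), (1, 'B'),
    (2, 'A'), (2, 'AN'), (2, 'G'), (2, 'C'), (2, 'C'), (2, 'F'), (2, 'D'), (2, 'C');

-- Count purchases per (customer, product), number the products of each
-- customer from most to least frequent, and keep only the first one.
-- Ordering by ProdID as well makes the choice repeatable when counts tie
-- (an assumption; the original answer leaves ties unspecified).
SELECT CustomerID, ProdID AS FavouriteProduct
FROM (SELECT CustomerID, ProdID,
             ROW_NUMBER() OVER (PARTITION BY CustomerID
                                ORDER BY COUNT(*) DESC, ProdID) AS seqnum
      FROM @Purchases
      GROUP BY CustomerID, ProdID
     ) AS cp
WHERE seqnum = 1;
```

With the sample rows above, customer 1 bought A six times and B twice, and customer 2 bought C three times and everything else once, so this should return A for customer 1 and C for customer 2.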


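【Solution 2】:

To return the rows exactly as described in your question, you can use a table expression (a CTE in my example) that first returns a popularity ranking, where a higher number means the product is more popular with that customer.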

WITH RankTable AS (
  SELECT
    CustomerID, ProductID, COUNT(*) AS Popularity
  FROM TableA
  GROUP BY CustomerID, ProductID
)
    

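The full result table can then be returned by first inner joining the original table (TableA) to the table expression (RankTable), and then using a window function to fill in the values of the FavoriteProduct column.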

SELECT 
    P.CustomerID
  , P.ProductID
  , FIRST_VALUE(P.ProductID) OVER(
      PARTITION BY R.CustomerID
      ORDER BY R.Popularity DESC, R.ProductID) AS FavoriteProduct
FROM TableA AS P
  INNER JOIN RankTable AS R
    ON P.CustomerID = R.CustomerID
    AND P.ProductID = R.ProductID;
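
A CTE only exists for the single statement that immediately follows it, so the two fragments above have to be run as one statement. Here is a sketch of the combined form, assuming a hypothetical table variable @TableA with a handful of made-up rows in place of the real TableA:

```sql
-- Hypothetical stand-in for TableA with a few illustrative rows.
DECLARE @TableA TABLE (CustomerID int, ProductID varchar(10));

INSERT INTO @TableA (CustomerID, ProductID) VALUES
    (1, 'A'), (1, 'A'), (1, 'B'),
    (2, 'C'), (2, 'C'), (2, 'A');

-- The CTE and the SELECT form one statement: the WITH clause must be
-- followed directly by the query that uses RankTable.
WITH RankTable AS (
    SELECT CustomerID, ProductID, COUNT(*) AS Popularity
    FROM @TableA
    GROUP BY CustomerID, ProductID
)
SELECT
    P.CustomerID
  , P.ProductID
  , FIRST_VALUE(P.ProductID) OVER (
        PARTITION BY R.CustomerID
        ORDER BY R.Popularity DESC, R.ProductID) AS FavoriteProduct
FROM @TableA AS P
  INNER JOIN RankTable AS R
    ON P.CustomerID = R.CustomerID
   AND P.ProductID = R.ProductID;
```

Every original row comes back once, with FavoriteProduct repeated on each row of the same customer. The statement before WITH has to end with a semicolon, which is why the INSERT is terminated explicitly.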
    


【Solution 3】:

Thanks to Nick, I found a way to get the most frequent value. Here is how it works:

-- TableA stands in for the placeholder "table" used in the original post.
Select A.CustomerID, A.ProductID, Count(*) as Number
from TableA A
group by A.CustomerID, A.ProductID
having Count(*) >= (Select Max(C.Number)
                    from (Select B.CustomerID, B.ProductID, Count(*) as Number
                          from TableA B where B.CustomerID = A.CustomerID
                          group by B.CustomerID, B.ProductID) C)
      


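【Solution 4】:

In case your SQL isn't running fast enough, and your customers are also stored in a smaller table of their own, this may perform better: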

-- TableA stands in for the placeholder "table" used in the original post.
select C.CustomerId, R.ProductID
from Customer C
outer apply (
  Select top 1 ProductID, Count(*) as Number
  from TableA A
  where A.CustomerId = C.CustomerId
  group by ProductId
  order by Number desc
) R
        


【Solution 5】:

This one, based on the example at the end of this page: http://www.sql-server-performance.com/2006/find-frequent-values/ may be faster:

SELECT CustomerID, ProdID, Cnt 
FROM 
(
    SELECT CustomerID, ProdID, COUNT(*) as Cnt, 
    RANK() OVER (
       PARTITION BY CustomerID
       ORDER BY COUNT(*) DESC
    ) AS Rnk 
    FROM YourTransactionTable
    GROUP BY CustomerID, ProdID
) x 
WHERE Rnk = 1
          

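This one uses the RANK() function. In this case you don't have to join back to the same table, which means far less work.

Now, to update your existing data, I like to wrap my data set in a WITH, which makes debugging easier and the final update simpler: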

;WITH SRC AS
(
  SELECT CustomerID, ProdID, Cnt 
  FROM 
  (
     SELECT CustomerID, ProdID, COUNT(*) as Cnt, 
     RANK() OVER (PARTITION BY CustomerID
     ORDER BY COUNT(*) DESC) AS Rnk 
     FROM TransactionTable
     GROUP BY CustomerID, ProdID
  ) x 
  WHERE Rnk = 1
)
UPDATE FavouriteTable
SET Favourite = SRC.ProdID
FROM SRC
WHERE SRC.CustomerID = FavouriteTable.CustomerID
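
The question stores FavouriteProduct in the same table as the purchases, so the update can also target that table directly. The following is a sketch under assumptions rather than the answerer's own code: @Purchases is a hypothetical table variable loaded with the question's rows, and ROW_NUMBER() with a ProdID tie-breaker is swapped in for RANK() so that each customer gets exactly one source row even when two products tie.

```sql
-- Hypothetical table variable shaped like the table in the question.
DECLARE @Purchases TABLE (
    CustomerID       int,
    ProdID           varchar(10),
    FavouriteProduct varchar(10) NULL
);

INSERT INTO @Purchases (CustomerID, ProdID) VALUES
    (1, 'A'), (1, 'A'), (1, 'A'), (1, 'B'), (1, 'A'), (1, 'A'), (1, 'A'), (1, 'B'),
    (2, 'A'), (2, 'AN'), (2, 'G'), (2, 'C'), (2, 'C'), (2, 'F'), (2, 'D'), (2, 'C');

-- Rank each customer's products by purchase count; Rn = 1 is the favourite.
-- ROW_NUMBER (not RANK) is an assumption made here to avoid duplicate
-- source rows on ties.
WITH Ranked AS (
    SELECT CustomerID, ProdID,
           ROW_NUMBER() OVER (PARTITION BY CustomerID
                              ORDER BY COUNT(*) DESC, ProdID) AS Rn
    FROM @Purchases
    GROUP BY CustomerID, ProdID
)
UPDATE T
SET FavouriteProduct = R.ProdID
FROM @Purchases AS T
  INNER JOIN Ranked AS R
    ON R.CustomerID = T.CustomerID
WHERE R.Rn = 1;

-- Inspect the result: every row of a customer carries the same favourite.
SELECT CustomerID, ProdID, FavouriteProduct
FROM @Purchases;
```

For the question's data this should fill FavouriteProduct with A on every row of customer 1 and C on every row of customer 2, matching the desired output shown in the question.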
          
