【Question title】: Convert stored procedure to python pandas code?
【Posted】: 2021-04-20 09:11:50
【Question description】:

The column to be created is Author_type, derived from the Rank column.

Example:

PMID Rank
200 3
201 0
200 0
202 0
200 2
201 1
200 1

Expected:

PMID Rank Author_type
200 3 Last Author
201 0 First Author
200 0 First Author
202 0 First Author
200 2 Co Author
201 1 Last Author
200 1 Co Author

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

ALTER PROCEDURE [datacleaning].[pub_set_authors]
AS
BEGIN
    WITH cte AS 
    (
        SELECT 
            *,
            ROW_NUMBER() OVER (PARTITION BY [pub_id] ORDER BY [rank] DESC) AS rnk
        FROM 
            datacleaning.pubmed_details
    )
    UPDATE datacleaning.pubmed_details
    SET author_type = 'Last Author'
    WHERE row_id IN (SELECT row_id
                     FROM cte
                     WHERE rnk = 1)

    UPDATE datacleaning.pubmed_details
    SET author_type = 'First Author'
    WHERE rank = 0;

    UPDATE datacleaning.pubmed_details
    SET author_type = 'Co Author'
    WHERE author_type is NULL;
END

【Question comments】:

    Tags: python sql pandas dataframe


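    【Solution 1】:

    This gives the result shown above. To replicate the row_number over partition part of the SQL code, you can combine groupby with cumcount. The next step uses np.select, which works like a series of case-when expressions: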

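    import numpy as np
    import pandas as pd

    # Sample data from the question
    df = pd.DataFrame({"PMID": [200, 201, 200, 202, 200, 201, 200],
                       "Rank": [3, 0, 0, 0, 2, 1, 1]})

    # cumcount numbers the rows within each PMID in their existing order (0-based)
    (df.assign(row_number = df.groupby("PMID").Rank.cumcount(), 
               Author_type = lambda df: np.select([df.Rank == 0, 
                                                   df.row_number.isin([0, 1])], 
                                                  ['First Author', 
                                                   'Last Author'], 
                                                  # all remaining rows have row_number > 1;
                                                  # a string default avoids mixing str choices with the integer default 0
                                                  default='Co Author')
                )
      .drop(columns = 'row_number')
    )
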
       PMID  Rank   Author_type
    0   200     3   Last Author
    1   201     0  First Author
    2   200     0  First Author
    3   202     0  First Author
    4   200     2     Co Author
    5   201     1   Last Author
    6   200     1     Co Author
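
Note that cumcount numbers rows in whatever order they already appear, whereas the stored procedure orders each partition by rank descending. As a minimal sketch of a closer translation (assuming the question's PMID and Rank columns stand in for pub_id and rank, and that ties on Rank may be broken by row order), groupby(...).rank can play the role of ROW_NUMBER(), and the order of the np.select conditions can mirror the fact that the 'First Author' UPDATE runs after, and so overrides, the 'Last Author' UPDATE:

```python
import numpy as np
import pandas as pd

# Sample data from the question (PMID ~ pub_id, Rank ~ rank: an assumption)
df = pd.DataFrame({"PMID": [200, 201, 200, 202, 200, 201, 200],
                   "Rank": [3, 0, 0, 0, 2, 1, 1]})

# Analogue of ROW_NUMBER() OVER (PARTITION BY pub_id ORDER BY rank DESC):
# method="first" gives distinct positions, breaking ties by row order
rnk = df.groupby("PMID")["Rank"].rank(method="first", ascending=False)

# np.select takes the first matching condition, so Rank == 0 wins over rnk == 1,
# just as the second UPDATE overwrites the first in the procedure
df["Author_type"] = np.select(
    [df["Rank"] == 0, rnk == 1],
    ["First Author", "Last Author"],
    default="Co Author",  # analogue of the final UPDATE ... WHERE author_type IS NULL
)

print(df)
```

On the sample data this yields the same Author_type values as the expected table, and it should not depend on the order in which the rows happen to be stored.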
    

    【Comments】:
