【Question title】: How to make a pivot table in SAS
【Posted】: 2020-08-31 06:18:50
【Question】:

I have the following table:

+--------+----------+-----------+
| ID     | var_name | var_value |
+--------+----------+-----------+
| 153879 | age      | 35        |
+--------+----------+-----------+
| 153879 | gender   | Male      |
+--------+----------+-----------+
| 153879 | income   | 1000      |
+--------+----------+-----------+
| 13527  | age      | 18        |
+--------+----------+-----------+
| 13527  | gender   | Male      |
+--------+----------+-----------+
| 13527  | income   | 20        |
+--------+----------+-----------+
| 14416  | age      | 40        |
+--------+----------+-----------+
| 14416  | gender   | Female    |
+--------+----------+-----------+
| 14416  | income   | 500       |
+--------+----------+-----------+

How can I pivot it so that the result looks like this?

+--------+-----+--------+--------+
| ID     | age | gender | income |
+--------+-----+--------+--------+
| 153879 | 35  | Male   | 1000   |
+--------+-----+--------+--------+
| 13527  | 18  | Male   | 20     |
+--------+-----+--------+--------+
| 14416  | 40  | Female | 500    |
+--------+-----+--------+--------+

I tried to do this by repeatedly left-joining the table to itself, like this:

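/* Long-format input: one row per id / var_name pair.     */
/* All three columns are read as character ($) variables. */
data table;
    input id $ var_name $ var_value $;
datalines;
153879 age 35
153879 gender Male
153879 income 1000
13527  age 18
13527  gender Male
13527  income 20
14416  age 40
14416  gender Female
14416  income 500
;
run;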


Proc SQL;
    Create table have as
    Select a.ID
          ,b.var_value as age
          ,c.var_value as gender
          ,d.var_value as income
    From (select distinct id from table) as a
    Left Join (select * from table where var_name = 'age') as b
        On a.id = b.id
    Left Join (select * from table where var_name = 'gender') as c
        On a.id = c.id
    Left Join (select * from table where var_name = 'income') as d
        On a.id = d.id;
Quit;

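This code gives me the desired result. However, the actual table is several times larger and contains more than 50 variables, so I would like to know whether there is a shorter, more efficient way to do this.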

【Comments】:

    Tags: sas pivot-table


    【Answer 1】:

    PROC TRANSPOSE will transpose your table. You first need to sort or index it by id, and if any id, var_name pair has duplicate values, pre-aggregate those values.

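    /* `have` is the long input table (named `table` in the question).    */
    /* GROUP BY collapses duplicate id / var_name pairs and ORDER BY      */
    /* sorts the result, so PROC TRANSPOSE can use BY-group processing.   */
    /* Note: sum() requires var_value to be numeric.                      */
    proc sql noprint;
        create table have_aggregated as 
            select id
                 , var_name
                 , sum(var_value) as var_value
            from have
            group by id, var_name
            order by id, var_name
        ;
    quit;
    
    proc transpose data=have_aggregated
                   out=want(drop=_NAME_);
        by id;          /* one output row per id                    */
        id var_name;    /* values of var_name become column names   */
        var var_value;  /* values that fill the new columns         */
    run;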
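
For the simpler case in the question, where each id has at most one row per var_name, the aggregation step can be skipped. The following is a minimal sketch under that assumption; it reads the question's data set `table`, and the names `table_sorted` and `want` are placeholders chosen here for illustration. Because var_value was read as character, the resulting age, gender and income columns are character as well.

```sas
/* Sort by id so that PROC TRANSPOSE can process BY groups. */
proc sort data=table out=table_sorted;
    by id;
run;

/* Turn each distinct var_name into its own column, one row per id. */
proc transpose data=table_sorted
               out=want(drop=_NAME_);
    by id;
    id var_name;
    var var_value;
run;
```

This scales to 50 or more variables without change, since the column names are taken from the values of var_name rather than written out one join at a time.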
    

    【Comments】:

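    • Thanks for your answer, it solved everything, but I have one question. What if some IDs have duplicate var_name values, for example if ID 153879 has two income values? In that case the code above gives the following error: ERROR: The ID value "Income" occurs twice in the same BY group. Suppose I want to add the two values together; what would the code look like?
    • You need to pre-aggregate them by id and var_name first. I've updated the code with this change. It aggregates and sorts in one step.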
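
One caveat when applying the pre-aggregation to the question's data: there var_value is a character variable, and sum() only accepts numeric arguments. A possible workaround, sketched here as an assumption rather than as part of the original answer, is to convert the values with input() and restrict the sum to variables known to hold numbers. The list ('age', 'income') and the 32. informat are illustrative choices.

```sas
/* Sum duplicate values for numeric-looking variables only.            */
/* input(var_value, 32.) converts the character value to a number.     */
proc sql noprint;
    create table have_aggregated as
        select id
             , var_name
             , sum(input(var_value, 32.)) as var_value
        from table
        where var_name in ('age', 'income')
        group by id, var_name
        order by id, var_name
    ;
quit;
```

Character variables such as gender are left out of this step and would need their own rule for resolving duplicates before being merged back in.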