Combine distinct row values into a string - SQL

【Question title】: combine distinct row values into a string - sql
【Posted】: 2017-09-06 17:43:21
【Question description】:

I would like to take the cells in each row and turn them into a single string of names... My method already deals with casing.

For example, the table:

'john' |        | 'smith' | 'smith'    
'john' | 'paul' |         | 'smith'
'john' | 'john' | 'john'  |

would result in:

'john smith'
'john paul smith'
'john'

This needs to run on PostgreSQL 8.2.15, so I can't use potentially useful functions such as CONCAT, and the data is in a Greenplum DB.

Alternatively, a way to directly remove duplicate tokens from a list of strings would let me achieve the larger goal. For example:

'john smith john smith'
'john john smith'
'smith john smith'

would become:

'john smith'
'john smith'
'smith john'

The order of the tokens is not important, as long as every unique value is returned exactly once.

Thanks

【Question discussion】:

This looks like a poor database design, and I think you need an application layer.

Tags: sql postgresql distinct string-aggregation

【Solution 1】:

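Normalize your table structure, select the distinct name values from that table, create a function to aggregate strings (see, for example, How to concatenate strings of a string field in a PostgreSQL 'group by' query?), and then apply that function. Apart from creating the aggregate function, all of this can be done in a single statement or view.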
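
As a rough sketch of those steps on PostgreSQL 8.2, something along the following lines might work. The `array_accum` aggregate is the example from the PostgreSQL documentation on user-defined aggregates; the table name `test_table` and the columns `id`, `col_1` to `col_4` are assumptions borrowed from Solution 2, and `val` is a hypothetical alias. It has not been tested on Greenplum, which may treat custom aggregates differently.

```sql
-- Custom aggregate that collects values into an array
-- (the array_accum example from the PostgreSQL docs on user-defined aggregates).
CREATE AGGREGATE array_accum (anyelement)
(
    sfunc = array_append,
    stype = anyarray,
    initcond = '{}'
);

-- Unpivot the four columns into (id, val) rows. UNION (not UNION ALL)
-- removes duplicate (id, val) pairs, so each name survives once per row id.
SELECT id,
       array_to_string(array_accum(val), ' ') AS names
FROM (
    SELECT id, col_1 AS val FROM test_table
    UNION
    SELECT id, col_2 FROM test_table
    UNION
    SELECT id, col_3 FROM test_table
    UNION
    SELECT id, col_4 FROM test_table
) AS unpivoted
-- Skip empty cells, whether they are stored as NULL or as ''.
WHERE val IS NOT NULL AND val <> ''
GROUP BY id;
```

The order of the names inside each resulting string is not guaranteed, which matches the question's statement that token order does not matter.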

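【Discussion】:

Thanks - not sure how I missed that question; I spent hours browsing similar ones.

【Solution 2】:

I've come up with a solution for you! :)

The following query returns four columns (I named them col_1, col_2, col_3 and col_4) and removes duplicates by joining test_table with itself.

Here is the code: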

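-- IS DISTINCT FROM is used instead of <> so that a NULL in an earlier
-- column does not turn the whole condition into NULL and drop later values.
SELECT t1.col_1, t2.col_2, t3.col_3, t4.col_4

FROM (
    SELECT id, col_1
    FROM test_table
) AS t1

LEFT JOIN (
    SELECT id, col_2
    FROM test_table
) AS t2
ON (t2.id = t1.id AND t2.col_2 IS DISTINCT FROM t1.col_1)

LEFT JOIN (
    SELECT id, col_3
    FROM test_table
) AS t3
ON (t3.id = t1.id AND t3.col_3 IS DISTINCT FROM t1.col_1 AND t3.col_3 IS DISTINCT FROM t2.col_2)

LEFT JOIN (
    SELECT id, col_4
    FROM test_table
) AS t4
ON (t4.id = t1.id AND t4.col_4 IS DISTINCT FROM t1.col_1 AND t4.col_4 IS DISTINCT FROM t2.col_2 AND t4.col_4 IS DISTINCT FROM t3.col_3);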

If you want the final string, just replace the "SELECT" line with this one:

SELECT trim(both ' ' FROM (COALESCE(t1.col_1, '') || COALESCE(' ' || t2.col_2, '') || COALESCE(' ' || t3.col_3, '') || COALESCE(' ' || t4.col_4, '')))

According to the documentation, this should work on your Postgres version:

[for the trim and concatenation functions]

https://www.postgresql.org/docs/8.2/static/functions-string.html

[for the COALESCE function]

https://www.postgresql.org/docs/8.2/static/functions-conditional.html

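Let me know if this helped :)

P.S. Your problem sounds like a poor database design: I would move these columns into a single table, where you could do this with GROUP BY or something similar. I would also do the string concatenation in a separate script. But this is how I would do it :)

【Discussion】:

【Solution 3】:

I would do this by unpivoting the data and then re-aggregating: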

-- Note: string_agg() requires PostgreSQL 9.0 or later.
select id, string_agg(distinct col, ' ')
from (select id, col1 as col from t union all
      select id, col2 from t union all
      select id, col3 from t union all
      select id, col4 from t
     ) t
where col is not null
group by id;

This assumes that each row has a unique id.

You could also use a giant case expression:

select concat_ws(',',
                 col1,
                 (case when col2 <> col1 then col2 end),
                 (case when col3 <> col2 and col3 <> col1 then col3 end),
                 (case when col4 <> col3 and col4 <> col2 and col4 <> col1 then col4 end)
                ) as newcol
from t;

In ancient versions of Postgres, you can phrase this as:

select trim(leading ',' from
            (coalesce(',' || col1, '') ||
             (case when col2 <> col1 then ',' || col2 else '' end) ||
             (case when col3 <> col2 and col3 <> col1 then ',' || col3 else '' end) ||
             (case when col4 <> col3 and col4 <> col2 and col4 <> col1 then ',' || col4 else '' end)
            )
           ) as newcol
from t;
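
For the second part of the question (removing repeated tokens from an existing string), a split-and-reassemble query can be sketched with functions that exist in PostgreSQL 8.2, where `unnest()` and `string_agg()` are not available. The table `name_strings` and its columns `id` and `names` are hypothetical, tokens are assumed to be separated by single spaces, and this has not been tested on Greenplum, which may restrict correlated subqueries.

```sql
-- Hypothetical table: name_strings(id, names), where names holds
-- space-separated tokens such as 'john smith john smith'.
SELECT s.id,
       array_to_string(
           ARRAY(
               -- DISTINCT keeps each token once; the output order is not guaranteed.
               SELECT DISTINCT s.tokens[i]
               -- generate_series() walks the array subscripts 1..N.
               FROM generate_series(1, array_upper(s.tokens, 1)) AS i
           ),
           ' '
       ) AS deduped
FROM (
    -- Split each string into an array of tokens.
    SELECT id, string_to_array(names, ' ') AS tokens
    FROM name_strings
) AS s;
```

Because the subquery uses DISTINCT, the tokens may come back in a different order than they appeared in the input, which the question says is acceptable.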

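【Discussion】:

The string_agg() function is not available in PG 8.2.
@rd_nielsen . . . There is a second solution as well.
Yes; the concat() function is not in PG 8.2 either. I think the approach using string_agg() is the best one, but it requires adding a custom aggregate function (which is quite simple in PG).
@rd_nielsen . . . I adjusted the second solution.
This should work as long as there are no cases such as 'john' in the first column, 'smith' in the second, and the remaining columns empty, because that would conflict with the result for the first row of the example. It is unclear whether that can occur.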