【Question title】: PostgreSQL order by array overlap
【Posted】: 2020-05-05 07:19:25
【Question】:

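I want to sort a table by an array. The records with the most overlapping elements should be at the top. I already have a WHERE clause that filters the records with an array. Using the same array, I want to determine the number of overlapping elements and sort by it. Do you know what the ORDER BY clause would look like?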

My table

SELECT * FROM "nodes"
+-----------+---------------------------+
| name      |            tags           |
+-----------+---------------------------+
| Max       | ["foo", "orange", "app"]  |
| Peter     | ["foo", "bar", "baz"]     |
| Maria     | ["foo", "bar"]            |
| John      | ["apple"]                 |
+-----------+---------------------------+

Result with WHERE

SELECT * FROM "nodes" WHERE (tags && '{"foo", "bar", "baz"}')
+-----------+---------------------------+
| name      |            tags           |
+-----------+---------------------------+
| Max       | ["foo", "orange", "app"]  |
| Peter     | ["foo", "bar", "baz"]     |
| Maria     | ["foo", "bar"]            |
+-----------+---------------------------+

Desired ordered result

SELECT * FROM "nodes" WHERE (tags && '{"foo", "bar", "baz"}') ORDER BY ????
+-----------+---------------------------+
| name      |            tags           |
+-----------+---------------------------+
| Peter     | ["foo", "bar", "baz"]     |
| Maria     | ["foo", "bar"]            |
| Max       | ["foo", "orange", "app"]  |
+-----------+---------------------------+

【Comments】:

    Tags: sql arrays json postgresql sql-order-by


    【Solution 1】:

    The only thing I can think of is to create a function that counts the number of common elements:

    create or replace function num_overlaps(p_one text[], p_other text[])
      returns bigint
    as
    $$
      select count(*)
      from (
        select *
        from unnest(p_one)
        intersect   
        select *
        from unnest(p_other)
      ) x
    $$
    language sql
    immutable;
    

    Then use it in the ORDER BY clause:

    SELECT *
    FROM nodes 
    WHERE tags && '{"foo", "bar", "baz"}'
    order by num_overlaps(tags, '{"foo", "bar", "baz"}') desc;
    

    The downside is that you need to repeat the list of tags you are testing.
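One way to avoid repeating the list is to define it once and join it in. This is only a sketch: it assumes tags is a native text[] column and that num_overlaps() from above has been created. The CTE name wanted and its column list are hypothetical names chosen for illustration.

```sql
-- Define the tag list once (hypothetical CTE "wanted" with one column "list").
WITH wanted(list) AS (
  VALUES ('{"foo", "bar", "baz"}'::text[])
)
SELECT n.*
FROM nodes n
  CROSS JOIN wanted w                        -- one row, so no row multiplication
WHERE n.tags && w.list                       -- keep rows sharing at least one tag
ORDER BY num_overlaps(n.tags, w.list) DESC;  -- most shared tags first
```

With the sample data from the question this should list Peter (3 common tags), then Maria (2), then Max (1).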


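    It's unclear to me whether those values are JSON arrays (because that's the syntax in the sample data) or native Postgres arrays (because the && operator doesn't work with JSON arrays). If you are using jsonb, you can replace unnest() with jsonb_array_elements_text().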
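If the column really is jsonb, the substitution might look like the sketch below. The function name num_overlaps_jsonb is a hypothetical name, and since && is not defined for jsonb, the filter here uses the jsonb ?| operator instead (true when any of the given strings is a top-level array element). Treat it as an assumption-laden variant, not as the answer's tested code.

```sql
-- Hypothetical jsonb variant of num_overlaps(); assumes both arguments are JSON arrays of strings.
create or replace function num_overlaps_jsonb(p_one jsonb, p_other jsonb)
  returns bigint
as
$$
  select count(*)
  from (
    select * from jsonb_array_elements_text(p_one)
    intersect
    select * from jsonb_array_elements_text(p_other)
  ) x
$$
language sql
immutable;

-- ?| replaces &&, which does not exist for jsonb.
SELECT *
FROM nodes
WHERE tags ?| array['foo', 'bar', 'baz']
ORDER BY num_overlaps_jsonb(tags, '["foo", "bar", "baz"]') DESC;
```

As in the text[] version, INTERSECT removes duplicates, so a tag that appears twice in one row is counted once.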

    【Comments】:

      【Solution 2】:

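      First, both operands of the && operator need to be arrays, so use STRING_TO_ARRAY(translate(tags::text, '[] "', ''), ',')::text[] instead of tags, and STRING_TO_ARRAY('foo,bar,baz', ',') instead of the '{"foo", "bar", "baz"}' pattern.

      Then you can unnest the array elements of the tags column with the JSON_ARRAY_ELEMENTS() function, and count how many elements of the returned value column occur within the '{"foo", "bar", "baz"}' pattern by combining the STRPOS() and SIGN() functions with SUM() aggregation: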

      SELECT name, tags::text
        FROM "nodes" 
       CROSS JOIN JSON_ARRAY_ELEMENTS(tags) AS js
       WHERE ( STRING_TO_ARRAY(translate(tags::text, '[] "', ''), ',')::text[] 
            && STRING_TO_ARRAY('foo,bar,baz',','))
       GROUP BY name, tags::text    
       ORDER BY SUM( SIGN( STRPOS('{"foo", "bar", "baz"}'::text,value::text) ) ) DESC
      

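      However, the tags column may contain duplicate elements. In that case the query above gives a wrong result, because every duplicate is counted again. So I suggest using the query below, in which the duplicate rows are eliminated by the DISTINCT keyword: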

      SELECT name, tags 
        FROM
        (
         SELECT DISTINCT name, tags::text, STRPOS('{"foo", "bar", "baz"}'::text,value::text)
           FROM "nodes" 
          CROSS JOIN JSON_ARRAY_ELEMENTS(tags) AS js
          WHERE ( STRING_TO_ARRAY(translate(tags::text, '[] "', ''), ',')::text[] 
               && STRING_TO_ARRAY('foo,bar,baz',','))
        ) n    
        GROUP BY name, tags::text    
        ORDER BY SUM( SIGN( strpos ) ) DESC
      


      【Comments】:
