【Question title】: Select DISTINCT on JPA
【Posted】: 2017-08-07 09:59:03
【Question description】:

I have a table with ISO 4217 values for currencies (with 6 columns: ID, Country, Currency_Name, Alphabetic_code, Numeric_Code, Minor_Unit).

I need to get some data for the 4 most used currencies, and my "pure" SQL query looks like this:

select distinct currency_name, alphabetic_code, numeric_code 
from currency 
where ALPHABETIC_CODE IN ('USD','EUR','JPY','GBP') 
order by currency_name;

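This returns a 4-row table with the data I need. So far, so good.

Now I have to translate it into our JPA XML file, and that is where the problems begin. The query I am trying to get is this: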

SELECT DISTINCT c.currencyName, c.alphabeticCode, c.numericCode
FROM Currency c 
WHERE c.alphabeticCode IN ('EUR','GBP','USD','JPY') 
ORDER BY c.currencyName

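This returns a list with one row for every country that uses one of those currencies (as if there were no DISTINCT in the query). I am scratching my head as to why. So the questions are:

1) How can I make this query return what the "pure" SQL gives me?

2) Why does this query seem to ignore my DISTINCT clause? There is something I am missing here. What is happening that I am not getting?

EDIT: Well, this is getting weirder. Somehow, the JPA query above works as expected (it returns 4 rows). What I had actually been running was this (since I needed some more information):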

SELECT DISTINCT c.currencyName, c.alphabeticCode, c.numericCode, c.minorUnit, c.id
FROM Currency c 
WHERE c.alphabeticCode IN ('EUR','GBP','USD','JPY') 
ORDER BY c.currencyName

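It seems the ID is what messes everything up, because removing it from the query goes back to returning a 4-row table. Adding parentheses does not help either.

By the way, we are using EclipseLink.

【Comments】:

  • Your JPA provider's log will tell you the SQL that it translated the JPQL into.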
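
For EclipseLink, which the question mentions, SQL logging is usually switched on through persistence properties. The following is a minimal sketch, assuming the `eclipselink.logging.level.sql` and `eclipselink.logging.parameters` properties; the class name is made up for illustration. The resulting map would be passed to `Persistence.createEntityManagerFactory(unitName, props)` or mirrored as `<property>` elements in persistence.xml.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: only collects the logging properties, it does not touch JPA itself.
class EclipseLinkSqlLogging {

    // Builds the persistence properties that ask EclipseLink to log generated SQL.
    static Map<String, String> sqlLoggingProperties() {
        Map<String, String> props = new LinkedHashMap<>();
        // Log every SQL statement produced from JPQL.
        props.put("eclipselink.logging.level.sql", "FINE");
        // Show bind parameter values instead of '?' placeholders.
        props.put("eclipselink.logging.parameters", "true");
        return props;
    }

    public static void main(String[] args) {
        sqlLoggingProperties().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

With these set, the log should show whether the generated SELECT list contains the id column, which is what decides how DISTINCT behaves.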

Tags: java sql jpa jpql


【Solution 1】:

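The problem you are running into appears when you retrieve this column list (c.currencyName, c.alphabeticCode, c.numericCode, c.minorUnit, c.id).

  • DISTINCT operates on the whole set of columns listed in the SELECT clause

I believe the id column is unique for every record in your table, so each row counts as distinct even when the other columns (c.currencyName, c.alphabeticCode, c.numericCode, c.minorUnit) contain duplicates.

So in your case DISTINCT operates on the whole row, not on a specific column. If you want unique names, select only that column.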
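
The effect can be reproduced without a database. The sketch below uses a few made-up rows (the ids, countries, and the `DistinctRowsDemo` class are assumptions for illustration, not data from the question) and mimics SELECT DISTINCT by projecting each row onto the chosen columns and dropping duplicate tuples:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class DistinctRowsDemo {

    // Hypothetical sample rows: {id, country, currencyName, alphabeticCode}.
    static final List<List<Object>> ROWS = Arrays.asList(
        Arrays.<Object>asList(1, "Germany", "Euro", "EUR"),
        Arrays.<Object>asList(2, "France", "Euro", "EUR"),
        Arrays.<Object>asList(3, "United States", "US Dollar", "USD"),
        Arrays.<Object>asList(4, "Japan", "Yen", "JPY"));

    // Projects every row onto the given column indexes, then removes duplicate
    // tuples, comparing the whole projected tuple as SELECT DISTINCT does.
    static List<List<Object>> selectDistinct(int... columns) {
        return ROWS.stream()
            .map(row -> Arrays.stream(columns)
                .mapToObj(row::get)
                .collect(Collectors.toList()))
            .distinct()
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // currencyName + alphabeticCode: the two Euro rows collapse into one.
        System.out.println(selectDistinct(2, 3));
        // Adding the unique id makes every tuple different again.
        System.out.println(selectDistinct(0, 2, 3));
    }
}
```

With this sample data the first projection keeps three tuples and the second keeps all four, because the id differs on every row.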

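If you want DISTINCT-like behavior over a subset of the columns while still returning an id, you can use GROUP BY instead. Every selected column must then either appear in the GROUP BY clause or be wrapped in an aggregate, for example MIN(c.id) to pick one representative id per currency:

SELECT c.currencyName, c.alphabeticCode, c.numericCode, MIN(c.id)
FROM Currency c 
WHERE c.alphabeticCode IN ('EUR','GBP','USD','JPY')
GROUP BY c.currencyName, c.alphabeticCode, c.numericCode
ORDER BY c.currencyName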

【Comments】:

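  • OK, now I get it. After running a few more tests, I will blame a typo in one of my queries for DISTINCT appearing to behave differently in SQL and JPA. By the way, the GROUP BY clause did not work... I guess I will just leave the ID out for now.

【Solution 2】: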

To answer your question, the JPQL query you wrote is fine:

SELECT DISTINCT c.currencyName, c.alphabeticCode, c.numericCode
FROM Currency c 
WHERE c.alphabeticCode IN ('EUR','GBP','USD','JPY') 
ORDER BY c.currencyName

It should translate to the SQL statement you expect:

select distinct currency_name, alphabetic_code, numeric_code 
from currency 
where ALPHABETIC_CODE IN ('USD','EUR','JPY','GBP') 
order by currency_name;
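
A multi-column projection like this one is returned by `getResultList()` as a list of `Object[]` rows, one array element per selected expression. The sketch below shows one way to turn such rows into typed objects; `CurrencyProjection` and `CurrencyInfo` are hypothetical names, and the element types depend on how the `Currency` entity is mapped, so the values are converted with `String.valueOf` rather than cast:

```java
import java.util.ArrayList;
import java.util.List;

class CurrencyProjection {

    // Hypothetical value holder for one projected row.
    static final class CurrencyInfo {
        final String name;
        final String alphabeticCode;
        final String numericCode;

        CurrencyInfo(String name, String alphabeticCode, String numericCode) {
            this.name = name;
            this.alphabeticCode = alphabeticCode;
            this.numericCode = numericCode;
        }
    }

    // Converts rows laid out in select-list order:
    // [0] currencyName, [1] alphabeticCode, [2] numericCode.
    static List<CurrencyInfo> fromRows(List<Object[]> rows) {
        List<CurrencyInfo> result = new ArrayList<>();
        for (Object[] row : rows) {
            result.add(new CurrencyInfo(
                String.valueOf(row[0]),
                String.valueOf(row[1]),
                String.valueOf(row[2])));
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for the list a JPA query would return.
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] {"Euro", "EUR", 978});
        rows.add(new Object[] {"US Dollar", "USD", 840});
        for (CurrencyInfo info : fromRows(rows)) {
            System.out.println(info.alphabeticCode + " " + info.numericCode + " " + info.name);
        }
    }
}
```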

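DISTINCT has two meanings in JPA, depending on the underlying JPQL or Criteria API query type.

Scalar queries

For scalar queries, which return a scalar projection, like the following query: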

List<Integer> publicationYears = entityManager
.createQuery(
    "select distinct year(p.createdOn) " +
    "from Post p " +
    "order by year(p.createdOn)", Integer.class)
.getResultList();

LOGGER.info("Publication years: {}", publicationYears);

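the DISTINCT keyword should be passed to the underlying SQL statement, because we want the DB engine to filter out duplicates before returning the result set: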

SELECT DISTINCT
    extract(YEAR FROM p.created_on) AS col_0_0_
FROM
    post p
ORDER BY
    extract(YEAR FROM p.created_on)

-- Publication years: [2016, 2018]

Entity queries

For entity queries, DISTINCT has a different meaning.

Without DISTINCT, a query like the following one:

List<Post> posts = entityManager
.createQuery(
    "select p " +
    "from Post p " +
    "left join fetch p.comments " +
    "where p.title = :title", Post.class)
.setParameter(
    "title", 
    "High-Performance Java Persistence eBook has been released!"
)
.getResultList();

LOGGER.info(
    "Fetched the following Post entity identifiers: {}", 
    posts.stream().map(Post::getId).collect(Collectors.toList())
);

will JOIN the post and post_comment tables like this:

SELECT p.id AS id1_0_0_,
       pc.id AS id1_1_1_,
       p.created_on AS created_2_0_0_,
       p.title AS title3_0_0_,
       pc.post_id AS post_id3_1_1_,
       pc.review AS review2_1_1_,
       pc.post_id AS post_id3_1_0__
FROM   post p
LEFT OUTER JOIN
       post_comment pc ON p.id=pc.post_id
WHERE
       p.title='High-Performance Java Persistence eBook has been released!'

-- Fetched the following Post entity identifiers: [1, 1]

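But the parent post record is duplicated in the result set for every associated post_comment row. As a result, the List of Post entities will contain duplicate Post entity references.

To eliminate the duplicate Post entity references, we need to use DISTINCT: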

List<Post> posts = entityManager
.createQuery(
    "select distinct p " +
    "from Post p " +
    "left join fetch p.comments " +
    "where p.title = :title", Post.class)
.setParameter(
    "title", 
    "High-Performance Java Persistence eBook has been released!"
)
.getResultList();
 
LOGGER.info(
    "Fetched the following Post entity identifiers: {}", 
    posts.stream().map(Post::getId).collect(Collectors.toList())
);

But then DISTINCT is also passed to the SQL query, which is not desirable at all:

SELECT DISTINCT
       p.id AS id1_0_0_,
       pc.id AS id1_1_1_,
       p.created_on AS created_2_0_0_,
       p.title AS title3_0_0_,
       pc.post_id AS post_id3_1_1_,
       pc.review AS review2_1_1_,
       pc.post_id AS post_id3_1_0__
FROM   post p
LEFT OUTER JOIN
       post_comment pc ON p.id=pc.post_id
WHERE
       p.title='High-Performance Java Persistence eBook has been released!'
 
-- Fetched the following Post entity identifiers: [1]

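By passing DISTINCT to the SQL query, the execution plan runs an extra Sort phase, which adds overhead without bringing any value, since the parent-child combinations are always unique because of the child PK column: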

Unique  (cost=23.71..23.72 rows=1 width=1068) (actual time=0.131..0.132 rows=2 loops=1)
  ->  Sort  (cost=23.71..23.71 rows=1 width=1068) (actual time=0.131..0.131 rows=2 loops=1)
        Sort Key: p.id, pc.id, p.created_on, pc.post_id, pc.review
        Sort Method: quicksort  Memory: 25kB
        ->  Hash Right Join  (cost=11.76..23.70 rows=1 width=1068) (actual time=0.054..0.058 rows=2 loops=1)
              Hash Cond: (pc.post_id = p.id)
              ->  Seq Scan on post_comment pc  (cost=0.00..11.40 rows=140 width=532) (actual time=0.010..0.010 rows=2 loops=1)
              ->  Hash  (cost=11.75..11.75 rows=1 width=528) (actual time=0.027..0.027 rows=1 loops=1)
                    Buckets: 1024  Batches: 1  Memory Usage: 9kB
                    ->  Seq Scan on post p  (cost=0.00..11.75 rows=1 width=528) (actual time=0.017..0.018 rows=1 loops=1)
                          Filter: ((title)::text = 'High-Performance Java Persistence eBook has been released!'::text)
                          Rows Removed by Filter: 3
Planning time: 0.227 ms
Execution time: 0.179 ms

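Entity queries with HINT_PASS_DISTINCT_THROUGH

To eliminate the Sort phase from the execution plan, we need to use the HINT_PASS_DISTINCT_THROUGH query hint, which is specific to Hibernate (it is defined in Hibernate's QueryHints class, not in the JPA specification):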

List<Post> posts = entityManager
.createQuery(
    "select distinct p " +
    "from Post p " +
    "left join fetch p.comments " +
    "where p.title = :title", Post.class)
.setParameter(
    "title", 
    "High-Performance Java Persistence eBook has been released!"
)
.setHint(QueryHints.HINT_PASS_DISTINCT_THROUGH, false)
.getResultList();
 
LOGGER.info(
    "Fetched the following Post entity identifiers: {}", 
    posts.stream().map(Post::getId).collect(Collectors.toList())
);

Now the SQL query no longer contains DISTINCT, but the duplicate Post entity references are still removed:

SELECT
       p.id AS id1_0_0_,
       pc.id AS id1_1_1_,
       p.created_on AS created_2_0_0_,
       p.title AS title3_0_0_,
       pc.post_id AS post_id3_1_1_,
       pc.review AS review2_1_1_,
       pc.post_id AS post_id3_1_0__
FROM   post p
LEFT OUTER JOIN
       post_comment pc ON p.id=pc.post_id
WHERE
       p.title='High-Performance Java Persistence eBook has been released!'
 
-- Fetched the following Post entity identifiers: [1]

The execution plan confirms that this time there is no extra Sort phase:

Hash Right Join  (cost=11.76..23.70 rows=1 width=1068) (actual time=0.066..0.069 rows=2 loops=1)
  Hash Cond: (pc.post_id = p.id)
  ->  Seq Scan on post_comment pc  (cost=0.00..11.40 rows=140 width=532) (actual time=0.011..0.011 rows=2 loops=1)
  ->  Hash  (cost=11.75..11.75 rows=1 width=528) (actual time=0.041..0.041 rows=1 loops=1)
        Buckets: 1024  Batches: 1  Memory Usage: 9kB
        ->  Seq Scan on post p  (cost=0.00..11.75 rows=1 width=528) (actual time=0.036..0.037 rows=1 loops=1)
              Filter: ((title)::text = 'High-Performance Java Persistence eBook has been released!'::text)
              Rows Removed by Filter: 3
Planning time: 1.184 ms
Execution time: 0.160 ms
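
Once DISTINCT is kept out of the SQL, removing the repeated parent references has to happen in memory on the returned list. The sketch below illustrates that idea only; it is not Hibernate's actual implementation, and the `Post` class here is a bare stand-in for the entity. Because the stand-in does not override `equals`, the set compares references, which matches the situation where the same managed entity instance appears once per joined child row:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

class EntityDistinctDemo {

    // Hypothetical stand-in for the Post entity; only the identifier matters here.
    static final class Post {
        final long id;

        Post(long id) {
            this.id = id;
        }
    }

    // Drops repeated references while keeping the order of first appearance.
    static List<Post> distinctReferences(List<Post> joined) {
        return new ArrayList<>(new LinkedHashSet<>(joined));
    }

    public static void main(String[] args) {
        Post post = new Post(1L);
        // A join fetch over two comments yields the same parent reference twice.
        List<Post> joined = Arrays.asList(post, post);
        List<Post> unique = distinctReferences(joined);
        System.out.println("Before: " + joined.size() + ", after: " + unique.size());
    }
}
```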
