【Question title】: Query runs much slower using JDBC
【Posted】: 2020-02-28 13:04:10
【Question】:

I have two different queries that take about the same amount of time to execute when I time them with Adminer or DBeaver.

Query one

select * from state where state_name = 'Florida';

When I run the query above in Adminer, it takes anywhere from 0.032 to 0.058 seconds.

EXPLAIN ANALYZE

Seq Scan on state  (cost=0.00..3981.50 rows=1 width=28) (actual time=1.787..15.047 rows=1 loops=1)
  Filter: (state_name = 'Florida'::citext)
  Rows Removed by Filter: 50
Planning Time: 0.486 ms
Execution Time: 15.779 ms

Query two

select
    property.id as property_id ,
    full_address,
    street_address,
    street.street,
    city.city as city,
    state.state_code as state_code,
    zipcode.zipcode as zipcode
from
    property
inner join street on
    street.id = property.street_id
inner join city on
    city.id = property.city_id
inner join state on
    state.id = property.state_id
inner join zipcode on
    zipcode.id = property.zipcode_id
where
    full_address = '139-Skillman-Ave-Apt-5C-Brooklyn-NY-11211';

The query above takes from 0.025 to 0.048 seconds.

EXPLAIN ANALYZE

Nested Loop  (cost=29.82..65.96 rows=1 width=97) (actual time=0.668..0.671 rows=1 loops=1)
  ->  Nested Loop  (cost=29.53..57.65 rows=1 width=107) (actual time=0.617..0.620 rows=1 loops=1)
        ->  Nested Loop  (cost=29.25..49.30 rows=1 width=120) (actual time=0.582..0.585 rows=1 loops=1)
              ->  Nested Loop  (cost=28.97..41.00 rows=1 width=127) (actual time=0.532..0.534 rows=1 loops=1)
                    ->  Bitmap Heap Scan on property  (cost=28.54..32.56 rows=1 width=131) (actual time=0.454..0.456 rows=1 loops=1)
                          Recheck Cond: (full_address = '139-Skillman-Ave-Apt-5C-Brooklyn-NY-11211'::citext)
                          Heap Blocks: exact=1
                          ->  Bitmap Index Scan on property_full_address  (cost=0.00..28.54 rows=1 width=0) (actual time=0.426..0.426 rows=1 loops=1)
                                Index Cond: (full_address = '139-Skillman-Ave-Apt-5C-Brooklyn-NY-11211'::citext)
                    ->  Index Scan using street_pkey on street  (cost=0.42..8.44 rows=1 width=28) (actual time=0.070..0.070 rows=1 loops=1)
                          Index Cond: (id = property.street_id)
              ->  Index Scan using city_id_pk on city  (cost=0.29..8.30 rows=1 width=25) (actual time=0.047..0.047 rows=1 loops=1)
                    Index Cond: (id = property.city_id)
        ->  Index Scan using state_id_pk on state  (cost=0.28..8.32 rows=1 width=19) (actual time=0.032..0.032 rows=1 loops=1)
              Index Cond: (id = property.state_id)
  ->  Index Scan using zipcode_id_pk on zipcode  (cost=0.29..8.30 rows=1 width=22) (actual time=0.048..0.048 rows=1 loops=1)
        Index Cond: (id = property.zipcode_id)
Planning Time: 5.473 ms
Execution Time: 1.601 ms

I have the following methods that use JdbcTemplate to execute the same queries.

Query one

public void performanceTest(String str) {
    template.queryForObject(
            "select * from state where state_name = ?",
            new Object[] { str }, (result, rowNum) -> {
                return result.getObject("state_name");
            });

}

time: 140 ms, i.e. 0.14 seconds

Query two

public void performanceTest(String str) {
    template.queryForObject(
            "SELECT property.id AS property_id , full_address, street_address, street.street, city.city as city, state.state_code as state_code, zipcode.zipcode as zipcode FROM property INNER JOIN street ON street.id = property.street_id INNER JOIN city ON city.id = property.city_id INNER JOIN state ON state.id = property.state_id INNER JOIN zipcode ON zipcode.id = property.zipcode_id WHERE full_address = ?",
            new Object[] { str }, (result, rowNum) -> {
                return result.getObject("property_id");
            });

}

The time it takes to execute the method above is

time: 828 ms, i.e. 0.828 seconds

I am using the code below to time the method's execution:

long startTime1 = System.nanoTime();
propertyRepo.performanceTest(address); //or "Florida" depending which query I'm testing
long endTime1 = System.nanoTime();
long duration1 = TimeUnit.MILLISECONDS.convert((endTime1 - startTime1), TimeUnit.NANOSECONDS);
System.out.println("time: " + duration1);
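
A single timed call like the one above can also include one-off costs such as class loading, JIT compilation, and obtaining the first connection, so it may overstate the time spent in the query itself. A minimal sketch of a repeated measurement with untimed warm-up runs follows. `QueryTimer`, `timeRuns`, and the warm-up and run counts are hypothetical names and values chosen for illustration; they are not part of the original code.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper: times a task several times after a few untimed warm-up runs.
final class QueryTimer {

    private QueryTimer() {}

    /** Runs the task `warmups` times untimed, then `runs` times timed; returns each duration in ms. */
    static long[] timeRuns(Runnable task, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) {
            task.run(); // untimed: lets the JIT, caches, and connection pool warm up
        }
        long[] millis = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            millis[i] = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        }
        return millis;
    }

    public static void main(String[] args) {
        // Stand-in task; in the application this would be
        // () -> propertyRepo.performanceTest(address)
        long[] times = timeRuns(() -> Math.sqrt(42.0), 3, 5);
        for (long t : times) {
            System.out.println("time: " + t);
        }
    }
}
```

If the later runs are much faster than the first one, the difference is start-up overhead rather than query time. If they stay slow, as the 828 ms figure for query two suggests, the cause is more likely in how the statement is executed.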

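Why does query two run so much slower from JDBC than from Adminer? What can I do to improve the performance of query two?

EDIT:

I created two different PHP scripts, each containing one of the queries. They take the same amount of time in PHP, so I assume this has something to do with JDBC? Below are the results of the PHP scripts. Because I am not using any connection pooling, PHP takes longer than Java does for query one. But both queries take almost the same amount of time to execute. Something is causing a delay for query two on JDBC.

EDIT:

When I run the query with a prepared statement, it is slow. But when I run it with a plain statement, it is fast. I ran EXPLAIN ANALYZE for both, using PreparedStatement and Statement.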
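
For readers who want to reproduce the comparison, plans like the ones below can be captured from Java by prefixing the query with EXPLAIN ANALYZE and reading the result rows as text. This is a rough sketch using plain `java.sql`, not the asker's actual code. `PlanPrinter`, `explainSql`, and the two `planWith...` methods are hypothetical names, and a `Connection` to the PostgreSQL database is assumed to be supplied by the caller.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for capturing PostgreSQL plans through JDBC.
final class PlanPrinter {

    private PlanPrinter() {}

    /** Prefixes a query so PostgreSQL executes it and returns the plan instead of the rows. */
    static String explainSql(String sql) {
        return "EXPLAIN ANALYZE " + sql;
    }

    /** Plan for a parameterized query: the value is sent as a bind parameter. */
    static List<String> planWithPreparedStatement(Connection conn, String sql, String param)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(explainSql(sql))) {
            ps.setString(1, param);
            try (ResultSet rs = ps.executeQuery()) {
                return readLines(rs);
            }
        }
    }

    /** Plan for a query whose literal values are already embedded in the SQL text. */
    static List<String> planWithStatement(Connection conn, String sqlWithLiterals)
            throws SQLException {
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(explainSql(sqlWithLiterals))) {
            return readLines(rs);
        }
    }

    // EXPLAIN returns a single text column with one plan line per row.
    private static List<String> readLines(ResultSet rs) throws SQLException {
        List<String> lines = new ArrayList<>();
        while (rs.next()) {
            lines.add(rs.getString(1));
        }
        return lines;
    }
}
```

Embedding literals in the SQL text, as `planWithStatement` expects, is meant for diagnostics only; untrusted input should not be concatenated into a query.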

PreparedStatement EXPLAIN ANALYZE

Nested Loop  (cost=1.27..315241.91 rows=1 width=97) (actual time=0.091..688.583 rows=1 loops=1)
  ->  Nested Loop  (cost=0.98..315233.61 rows=1 width=107) (actual time=0.079..688.571 rows=1 loops=1)
        ->  Nested Loop  (cost=0.71..315225.26 rows=1 width=120) (actual time=0.069..688.561 rows=1 loops=1)
              ->  Nested Loop  (cost=0.42..315216.95 rows=1 width=127) (actual time=0.057..688.548 rows=1 loops=1)
                    ->  Seq Scan on property  (cost=0.00..315208.51 rows=1 width=131) (actual time=0.032..688.522 rows=1 loops=1)
                          Filter: ((full_address)::text = '139-Skillman-Ave-Apt-5C-Brooklyn-NY-11211'::text)
                          Rows Removed by Filter: 8790
                    ->  Index Scan using street_pkey on street  (cost=0.42..8.44 rows=1 width=28) (actual time=0.019..0.019 rows=1 loops=1)
                          Index Cond: (id = property.street_id)
              ->  Index Scan using city_id_pk on city  (cost=0.29..8.30 rows=1 width=25) (actual time=0.010..0.010 rows=1 loops=1)
                    Index Cond: (id = property.city_id)
        ->  Index Scan using state_id_pk on state  (cost=0.28..8.32 rows=1 width=19) (actual time=0.008..0.008 rows=1 loops=1)
              Index Cond: (id = property.state_id)
  ->  Index Scan using zipcode_id_pk on zipcode  (cost=0.29..8.30 rows=1 width=22) (actual time=0.010..0.010 rows=1 loops=1)
        Index Cond: (id = property.zipcode_id)
Planning Time: 2.400 ms
Execution Time: 688.674 ms

Statement EXPLAIN ANALYZE

Nested Loop  (cost=29.82..65.96 rows=1 width=97) (actual time=0.232..0.235 rows=1 loops=1)
  ->  Nested Loop  (cost=29.53..57.65 rows=1 width=107) (actual time=0.220..0.223 rows=1 loops=1)
        ->  Nested Loop  (cost=29.25..49.30 rows=1 width=120) (actual time=0.211..0.213 rows=1 loops=1)
              ->  Nested Loop  (cost=28.97..41.00 rows=1 width=127) (actual time=0.198..0.200 rows=1 loops=1)
                    ->  Bitmap Heap Scan on property  (cost=28.54..32.56 rows=1 width=131) (actual time=0.175..0.177 rows=1 loops=1)
                          Recheck Cond: (full_address = '139-Skillman-Ave-Apt-5C-Brooklyn-NY-11211'::citext)
                          Heap Blocks: exact=1
                          ->  Bitmap Index Scan on property_full_address  (cost=0.00..28.54 rows=1 width=0) (actual time=0.162..0.162 rows=1 loops=1)
                                Index Cond: (full_address = '139-Skillman-Ave-Apt-5C-Brooklyn-NY-11211'::citext)
                    ->  Index Scan using street_pkey on street  (cost=0.42..8.44 rows=1 width=28) (actual time=0.017..0.017 rows=1 loops=1)
                          Index Cond: (id = property.street_id)
              ->  Index Scan using city_id_pk on city  (cost=0.29..8.30 rows=1 width=25) (actual time=0.010..0.010 rows=1 loops=1)
                    Index Cond: (id = property.city_id)
        ->  Index Scan using state_id_pk on state  (cost=0.28..8.32 rows=1 width=19) (actual time=0.007..0.007 rows=1 loops=1)
              Index Cond: (id = property.state_id)
  ->  Index Scan using zipcode_id_pk on zipcode  (cost=0.29..8.30 rows=1 width=22) (actual time=0.010..0.010 rows=1 loops=1)
        Index Cond: (id = property.zipcode_id)
Planning Time: 2.442 ms
Execution Time: 0.345 ms
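
Comparing the two plans above: the PreparedStatement plan does a sequential scan on property with a filter that casts full_address to text and compares it with a text value, while the Statement plan compares full_address with a citext value and uses the property_full_address index. This suggests that the bound string parameter is not being treated as citext, so the index on the citext column is not used. One possible fix, sketched below under that assumption, is to cast the parameter explicitly. `PropertyLookup` and `findPropertyId` are hypothetical names, the query is shortened to the property table for brevity, and the sketch uses plain `java.sql` rather than JdbcTemplate.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical sketch: cast the bind parameter to citext so the comparison
// matches the column type. The query is shortened to the property table only.
final class PropertyLookup {

    static final String SQL =
            "SELECT property.id AS property_id "
          + "FROM property "
          + "WHERE full_address = CAST(? AS citext)";

    private PropertyLookup() {}

    /** Returns the id of the matching property, or null if no row matches. */
    static Object findPropertyId(Connection conn, String fullAddress) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(SQL)) {
            ps.setString(1, fullAddress);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getObject("property_id") : null;
            }
        }
    }
}
```

With JdbcTemplate, the same cast can go directly into the SQL string passed to `queryForObject`. Another option with the PostgreSQL JDBC driver is the `stringtype=unspecified` connection parameter, which sends values set via `setString` as untyped and lets the server infer the type. Rerunning EXPLAIN ANALYZE through the prepared statement would confirm whether either change restores the index scan.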

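【Comments】:

How much data does the query involve? Try creating database indexes on full_address and state_name.
Would you mind posting which database system you use? Most likely you are observing JdbcTemplate overhead in the first query. The second query may trigger a different execution plan because of the bind variables, but as long as you keep the RDBMS secret this is all speculation ;)
The indexes already exist. The query runs fast outside my Java application. When run in Adminer, query two is faster than query one. I don't understand why query two runs slower than query one in my Java application.
@MarmiteBomber I use PostgreSQL. I can post the EXPLAIN ANALYZE output if that helps.
@MarmiteBomber I have attached the EXPLAIN ANALYZE output for each query.

Tags: spring postgresql spring-boot jdbc jdbctemplate

【Solution 1】:

This is because of the connection pooling used by the different clients. You can set up a fast connection pool for JDBC, such as HikariCP, like this: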

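import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

import java.sql.Connection;
import java.sql.SQLException;

public class HikariCPDataSource {

    private static final HikariConfig config = new HikariConfig();
    private static final HikariDataSource ds;

    static {
        // Example URL for an in-memory H2 database; replace it with your own JDBC URL.
        config.setJdbcUrl("jdbc:h2:mem:test");
        config.setUsername("user");
        config.setPassword("password");
        // Driver-specific statement-cache settings (these names are the ones MySQL Connector/J uses).
        config.addDataSourceProperty("cachePrepStmts", "true");
        config.addDataSourceProperty("prepStmtCacheSize", "250");
        config.addDataSourceProperty("prepStmtCacheSqlLimit", "2048");
        ds = new HikariDataSource(config);
    }

    // Borrows a connection from the pool; closing it returns it to the pool.
    public static Connection getConnection() throws SQLException {
        return ds.getConnection();
    }

    // Static utility holder; not meant to be instantiated.
    private HikariCPDataSource() {}
}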

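【Comments】:

As far as I know, JdbcTemplate is already using Hikari's connection pool.
@david Spring Boot will try to load an available connection pool for your DataSource: We prefer HikariCP for its performance and concurrency. If HikariCP is available, we always choose it. Otherwise, if the Tomcat pooling DataSource is available, we use it. Otherwise, if Commons DBCP2 is available, we use it. If you use the spring-boot-starter-jdbc or spring-boot-starter-data-jpa “starters”, you automatically get a dependency on HikariCP.
I am using Spring Boot 2.0 and have spring-boot-starter-jdbc in my pom.xml, so I assume everything is auto-configured?
A connection pool has no influence on the actual execution of a query. The "performance" of a pool refers to its ability to work under high load, where many concurrent threads take many connections from the pool. But once statement execution begins, the pool is completely irrelevant.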