How to generate a unique, auto-incremented ID for a Snowflake table using Hibernate/Spring JPA?

【Question title】: How to generate Unique, AutoIncremented Id using Hibernate/Spring JPA for Snowflake table?
【Posted】: 2022-01-18 12:59:06
【Question】:

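I am new to Snowflake. We decided to use Hibernate/Spring Data JPA rather than working with the Snowflake JDBC driver directly, because it is more convenient. We came across this post in the Snowflake community, "Has anybody built an application using Java Spring Framework that connects to Snowflake", and checked that it covers our use case.

For our use case the model class looks like this. We kept the empty dialect and the rest of the configuration the same as described in the link.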


import org.hibernate.annotations.GenericGenerator;

import javax.persistence.Basic;
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.Table;
import java.io.Serializable;

@Entity
@Table(name = "note")
public class Note implements Serializable {

    @Id
    @GenericGenerator(name = "id_generator", strategy = "increment")
    @GeneratedValue(generator = "id_generator")
    private Long id;

    @Basic(optional = false)
    @Column(name = "user_id")
    private String userId;

    @Basic(optional = false)
    @Column(name = "content")
    private String content;

    public Long getId() {
        return id;
    }

    public void setId(Long id) {
        this.id = id;
    }

    public String getUserId() {
        return userId;
    }

    public void setUserId(String userId) {
        this.userId = userId;
    }

    public String getContent() {
        return content;
    }

    public void setContent(String content) {
        this.content = content;
    }

    public Note(String userId, String content) {
        this.userId = userId;
        this.content = content;
    }

    public Note() {
    }
}

We created the table in Snowflake with this query:

CREATE OR REPLACE TABLE "WAREHOUSE"."SCHEMA".note (
 id INT NOT NULL AUTOINCREMENT UNIQUE,
 user_id STRING NOT NULL, 
 content STRING NOT NULL, 
 PRIMARY KEY (id)   
);

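The code above works as expected and generates auto-incremented ids for the primary key. We also tried running multiple instances of our service, and because Snowflake does not enforce the unique constraint, we ended up with duplicate id values. (A single table has several sources inserting data.)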
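To make the duplicate-id behaviour concrete, here is a small simulation. It assumes, as the Hibernate documentation describes for the `increment` strategy, that each instance reads the table's maximum id once and then counts up in memory. The class names (`InMemoryIncrementGenerator`, `IncrementCollisionDemo`) are hypothetical, and nothing here calls Hibernate or Snowflake; it is only a sketch of the mechanism.

```java
import java.util.ArrayList;
import java.util.List;

// Simulation only: models a "max(id) + 1, then count in memory" generator.
// This is not Hibernate's real class, just a stand-in for its behaviour.
final class InMemoryIncrementGenerator {
    private long next;

    // Seeded once from the table's current maximum id, as read at startup.
    InMemoryIncrementGenerator(long currentMaxId) {
        this.next = currentMaxId + 1;
    }

    long generate() {
        return next++;
    }
}

final class IncrementCollisionDemo {
    // Two service instances start against the same table state and insert independently.
    static List<Long> idsFromTwoInstances(long currentMaxId, int insertsPerInstance) {
        InMemoryIncrementGenerator instanceA = new InMemoryIncrementGenerator(currentMaxId);
        InMemoryIncrementGenerator instanceB = new InMemoryIncrementGenerator(currentMaxId);
        List<Long> ids = new ArrayList<>();
        for (int i = 0; i < insertsPerInstance; i++) {
            ids.add(instanceA.generate());
            ids.add(instanceB.generate());
        }
        return ids;
    }

    public static void main(String[] args) {
        // Neither instance knows about the other, so both hand out the same ids.
        System.out.println(idsFromTwoInstances(0, 3));
    }
}
```

Since the table accepts both rows, the collision only shows up later as duplicate ids in the data.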

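As for the dialect, we could not find any Hibernate dialect for Snowflake, so we used the dialect setup described in the referenced link. We created an EmptyDialect class and gave its path in the properties file.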

public class EmptyDialect extends org.hibernate.dialect.Dialect {}

Properties file:

spring.jpa.properties.hibernate.dialect= absolute path of the EmptyDialect class

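We tried all the ID generation strategies (IDENTITY, SEQUENCE, AUTO and so on) but got exceptions, probably because there is no dedicated Snowflake dialect. We will add the error stack traces if needed.

Sequence approach:

We created a sequence with the following query and changed the table creation query and the annotations accordingly.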

create or replace sequence "Warehouse"."Schema".sequence_note start = 1 increment = 1;

CREATE OR REPLACE TABLE "Warehouse"."Schema".note (
 id INT NOT NULL DEFAULT "Warehouse"."Schema".SEQUENCE_NOTE.nextval UNIQUE,
 user_id STRING NOT NULL, 
 content STRING NOT NULL, 
 PRIMARY KEY (id)   
);

@Id
@GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "sequence_note")
private Long id;

When inserting an entity, Hibernate executes the following query:

select next_val as id_val from sequence_note for update

Error stack trace:

{"time":"2021-12-20T06:11:07.335+00:00","@version":1,"message":"SQL Error: 1003, SQLState: 42000","logger_name":"org.hibernate.engine.jdbc.spi.SqlExceptionHelper","thread_name":"http-nio-8080-exec-2","level":"WARN","caller_class_name":"org.hibernate.engine.jdbc.spi.SqlExceptionHelper","caller_method_name":"logExceptions","caller_file_name":"SqlExceptionHelper.java","caller_line_number":137}
{"time":"2021-12-20T06:11:07.337+00:00","@version":1,"message":"SQL compilation error:
syntax error line 1 at position 45 unexpected 'for'.","logger_name":"org.hibernate.engine.jdbc.spi.SqlExceptionHelper","thread_name":"http-nio-8080-exec-2","level":"ERROR","caller_class_name":"org.hibernate.engine.jdbc.spi.SqlExceptionHelper","caller_method_name":"logExceptions","caller_file_name":"SqlExceptionHelper.java","caller_line_number":142}
{"time":"2021-12-20T06:11:12.428+00:00","@version":1,"message":"Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Request processing failed; nested exception is org.springframework.dao.InvalidDataAccessResourceUsageException: error performing isolated work; SQL [n/a]; nested exception is org.hibernate.exception.SQLGrammarException: error performing isolated work] with root cause","logger_name":"org.apache.catalina.core.ContainerBase.[Tomcat].[localhost].[/].[dispatcherServlet]","thread_name":"http-nio-8080-exec-2","level":"ERROR","stack_trace":"net.snowflake.client.jdbc.SnowflakeSQLException: SQL compilation error:
syntax error line 1 at position 45 unexpected 'for'.
at net.snowflake.client.jdbc.SnowflakeUtil.checkErrorAndThrowExceptionSub(SnowflakeUtil.java:127)
at net.snowflake.client.jdbc.SnowflakeUtil.checkErrorAndThrowException(SnowflakeUtil.java:67)
at net.snowflake.client.core.StmtUtil.pollForOutput(StmtUtil.java:442)
at net.snowflake.client.core.StmtUtil.execute(StmtUtil.java:345)
at net.snowflake.client.core.SFStatement.executeHelper(SFStatement.java:487)
at net.snowflake.client.core.SFStatement.executeQueryInternal(SFStatement.java:198)
at net.snowflake.client.core.SFStatement.executeQuery(SFStatement.java:135)
at net.snowflake.client.core.SFStatement.execute(SFStatement.java:781)
at net.snowflake.client.core.SFStatement.execute(SFStatement.java:677)
at net.snowflake.client.jdbc.SnowflakeStatementV1.executeQueryInternal(SnowflakeStatementV1.java:238)
at net.snowflake.client.jdbc.SnowflakePreparedStatementV1.executeQuery(SnowflakePreparedStatementV1.java:117)

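So, is there any way to manage generation of the Id field (unique, auto-incremented, primary key) from the Spring Boot code itself?

Update, March 1, 2022

Thanks to Alexey Veleshko's answer, we managed to resolve this exception with the following changes to the code.

The EmptyDialect class now looks like this: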

public class EmptyDialect extends org.hibernate.dialect.Dialect {
     
    @Override
    public String getSelectSequenceNextValString(String sequenceName) {
        return sequenceName + ".nextVal";
    }

    @Override
    public String getSequenceNextValString(String sequenceName) {
        return "select " + getSelectSequenceNextValString(sequenceName);
    }

    @Override
    public boolean supportsSequences() {
        return true;
    }

    @Override
    public boolean supportsPooledSequences() {
        return true;
    }
  
}

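Here we override the methods that build the query used to fetch nextVal from the underlying database; that value is then used as the auto-incremented ID for the table.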
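As a quick check of what those two overrides produce, the string logic can be mirrored in plain Java without Hibernate on the classpath. `SequenceSql` is a hypothetical helper name used only for this sketch; the method bodies copy the dialect code above.

```java
// Plain-Java mirror of the two overridden dialect methods, so the generated SQL
// can be inspected in isolation. SequenceSql is a hypothetical name.
final class SequenceSql {
    // Expression that can be embedded in a larger statement.
    static String selectSequenceNextValString(String sequenceName) {
        return sequenceName + ".nextVal";
    }

    // Standalone query that fetches the next sequence value.
    static String sequenceNextValString(String sequenceName) {
        return "select " + selectSequenceNextValString(sequenceName);
    }

    public static void main(String[] args) {
        // Prints: select sequence_note.nextVal
        System.out.println(sequenceNextValString("sequence_note"));
    }
}
```

This replaces the earlier "select next_val as id_val from sequence_note for update" statement, whose "for update" clause Snowflake rejected.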

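However, for our use case we want to insert entities in batches, and every entity must always get a unique, auto-incremented Id even when multiple instances of the service are running. In this setup, when the application starts and inserts an entity, Hibernate queries the sequence for nextVal. A batch of entities is then inserted with the generated id values. For the next batch, Hibernate does not query the Snowflake sequence for nextVal again; it takes the last value from its local memory (the last generated nextVal plus the number of inserted entities). Now suppose several application instances are running and inserting entities into Snowflake. Because these instances do not query the database for nextVal on every insert, they may hold the same nextVal in local memory, which leads to duplicate ids in the database.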
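The overlap described above can be reproduced with a small simulation. It assumes each instance fetches one sequence value and then hands out a block of ids from memory. If the sequence advances by 1 while the block is larger, the ranges overlap; if the sequence's increment equals the block size, they stay disjoint. All class names are hypothetical. Mapping the block size to `allocationSize` on `@SequenceGenerator`, and the step to `increment` in `CREATE SEQUENCE`, is an assumption to verify against your Hibernate version and the custom dialect.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicLong;

// Simulation only: a shared counter stands in for the Snowflake sequence.
final class SimulatedSequence {
    private final AtomicLong current;
    private final long increment;

    SimulatedSequence(long start, long increment) {
        this.current = new AtomicLong(start);
        this.increment = increment;
    }

    // Returns the current value and advances by the configured increment.
    long nextVal() {
        return current.getAndAdd(increment);
    }
}

// One application instance: fetches a value, then serves blockSize ids from memory.
final class BlockAllocatingInstance {
    private final SimulatedSequence sequence;
    private final int blockSize;
    private long next;
    private int remaining;

    BlockAllocatingInstance(SimulatedSequence sequence, int blockSize) {
        this.sequence = sequence;
        this.blockSize = blockSize;
    }

    long generate() {
        if (remaining == 0) {          // block exhausted: go back to the sequence
            next = sequence.nextVal();
            remaining = blockSize;
        }
        remaining--;
        return next++;
    }
}

final class BlockAllocationDemo {
    // Counts ids issued more than once when two instances each insert `inserts` rows.
    static int duplicates(long sequenceIncrement, int blockSize, int inserts) {
        SimulatedSequence seq = new SimulatedSequence(1, sequenceIncrement);
        BlockAllocatingInstance a = new BlockAllocatingInstance(seq, blockSize);
        BlockAllocatingInstance b = new BlockAllocatingInstance(seq, blockSize);
        Set<Long> seen = new HashSet<>();
        int dup = 0;
        for (int i = 0; i < inserts; i++) {
            if (!seen.add(a.generate())) dup++;
            if (!seen.add(b.generate())) dup++;
        }
        return dup;
    }

    public static void main(String[] args) {
        System.out.println("increment 1, block 50:  " + duplicates(1, 50, 50));   // overlapping blocks
        System.out.println("increment 50, block 50: " + duplicates(50, 50, 50));  // disjoint blocks
    }
}
```

With matching sizes the ids are unique but not contiguous across instances, which is usually acceptable for a surrogate key.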

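【Comments】:

Have you thought about using Java's UUID? I know it is not a Long, but it would meet your requirement.
@Sergiu Do you mean we should use UUID as the data type of the id instead of Long?
I mean the UUID would be a String rather than a Long. But it can be used to generate unique string IDs.
@Sergiu Thanks for the suggestion, but the id field will be used in query filters and JOINs, so a Long/Integer will be more efficient.
@JilvaSheth Yes. You can define it in your dialect by overriding a method. I have no experience with this, but judging from the Javadoc it should be the method getSelectSequenceNextValString.

Tags: java hibernate spring-data-jpa snowflake-cloud-data-platform auto-increment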

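【Solution 1】:

Instead of using the autoincrement feature, you can use sequences and then work with them through their own native properties.

You can read the documentation on sequences and their recommended usage here: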

CREATE SEQUENCE — Snowflake Documentation

Using Sequences — Snowflake Documentation

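【Comments】:

Who says they are not recommended? Sequences work better for many Hibernate features. Batching, for example.
Sorry, I misread the context of the original post. Sequences should work fine here.
@SergeGershkovich We tried the sequence approach as well but got an exception. I have added the details and the error stack trace to the question itself.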

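【Solution 2】:

I take it the gist of the question is that you want several application instances to batch-insert into the same Snowflake table, which has a unique, auto-incremented primary key (id).

In Snowflake, a SEQUENCE is the mechanism for auto-incrementing the primary key id.

As you and Alexey Veleshko mentioned, we can override some methods in the Dialect class so that the sequence query becomes "select sequence_note.nextVal". But because your application issues this query only once, at startup, it does not solve the problem when several application instances try to insert batches into the same table.

So what we need to do here is run: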

"select sequence_note.nextVal"

before every insert call, to get the latest "id" value for each row. We can get this behavior by running the query manually.

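// Fetch a fresh sequence value for every entity before saving it.
entitiesToSave.stream().filter(Objects::nonNull).forEach(entity -> {
    long nextVal = sequenceRepo.getNextVal();
    entity.setId(nextVal);
    entityRepo.save(entity);
});

// Declared in the repository interface behind sequenceRepo.
// long rather than int, so large sequence values do not overflow.
@Query(nativeQuery = true, value = "select sequence_note.nextVal")
long getNextVal();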
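The same per-row pattern can be sketched without Spring, to show both why it avoids duplicates and what it costs. A `LongSupplier` stands in for the "select sequence_note.nextVal" query, and a shared `AtomicLong` stands in for the Snowflake sequence. `NoteRow` and `PerRowIdAssigner` are hypothetical names used only for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Hypothetical stand-in for the entity; only the id matters here.
final class NoteRow {
    Long id;
    final String content;

    NoteRow(String content) {
        this.content = content;
    }
}

final class PerRowIdAssigner {
    // Calls nextVal once per non-null row and returns how many calls were made,
    // that is, how many extra round trips the real query would cost.
    static int assignIds(List<NoteRow> rows, LongSupplier nextVal) {
        int roundTrips = 0;
        for (NoteRow row : rows) {
            if (row == null) {
                continue; // same null filter as the stream version
            }
            row.id = nextVal.getAsLong();
            roundTrips++;
        }
        return roundTrips;
    }

    public static void main(String[] args) {
        AtomicLong sharedSequence = new AtomicLong(1); // simulated database sequence
        List<NoteRow> instanceA = Arrays.asList(new NoteRow("a1"), new NoteRow("a2"));
        List<NoteRow> instanceB = Arrays.asList(new NoteRow("b1"), new NoteRow("b2"));
        int trips = assignIds(instanceA, sharedSequence::getAndIncrement)
                  + assignIds(instanceB, sharedSequence::getAndIncrement);
        System.out.println("round trips: " + trips);
    }
}
```

Every id comes straight from the shared sequence, so two instances cannot collide, at the price of one extra query per row.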

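I know this is not the best solution, since it needs an extra query before every insert call, but given your use case I think it may be your last resort. Another solution is to use the following query:

insert into note (user_id, content) values ('1', 'testContent');

Because your table creation query is: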

CREATE OR REPLACE TABLE "Warehouse"."Schema".note (
 id INT NOT NULL DEFAULT "Warehouse"."Schema".SEQUENCE_NOTE.nextval UNIQUE,
 user_id STRING NOT NULL, 
 content STRING NOT NULL, 
 PRIMARY KEY (id)   
);

it will manage id generation for you directly. (However, if you strictly want to use Spring JPA, this will not suit you.)

【Comments】: