[Posted]: 2022-01-03 09:15:43
[Problem description]:
We are migrating a Spring Batch application from Oracle DB to Azure SQL Server.
When I run two different jobs at the same time, which update different tables but share the same common BATCH_ tables, I get the following error:
Caused by: org.springframework.dao.DataAccessResourceFailureException: Could not increment identity; nested exception is com.microsoft.sqlserver.jdbc.SQLServerException: Transaction (Process ID 167) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
at org.springframework.jdbc.support.incrementer.SqlServerMaxValueIncrementer.getNextKey(SqlServerMaxValueIncrementer.java:124) ~[bat-applybatch-jobs-2.2.12-SNAPSHOT.jar:?]
at org.springframework.jdbc.support.incrementer.AbstractDataFieldMaxValueIncrementer.nextLongValue(AbstractDataFieldMaxValueIncrementer.java:125)
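The SQL Server message ends by asking the caller to rerun the transaction. A minimal sketch of that idea in plain Java follows. It treats the symptom rather than the cause, and it makes several assumptions: the class `DeadlockRetrySketch`, its method names, and the backoff policy are hypothetical, and where such a retry would sit in a Spring Batch application is not something the post settles. SQL Server error number 1205 is the deadlock-victim error; the exception thrown in `main` is simulated.

```java
import java.sql.SQLException;
import java.util.function.Supplier;

/** Sketch: re-run a unit of work when SQL Server picks it as the deadlock victim. */
final class DeadlockRetrySketch {

    /** SQL Server error number for "chosen as the deadlock victim". */
    static final int DEADLOCK_VICTIM_ERROR = 1205;

    /** Walks the cause chain looking for a SQLException that carries error 1205. */
    static boolean isDeadlockVictim(Throwable error) {
        for (Throwable t = error; t != null; t = t.getCause()) {
            if (t instanceof SQLException
                    && ((SQLException) t).getErrorCode() == DEADLOCK_VICTIM_ERROR) {
                return true;
            }
        }
        return false;
    }

    /**
     * Runs the work and retries it on a deadlock, up to maxAttempts in total.
     * The supplier must re-run the whole transaction, not only the failed statement.
     */
    static <T> T retryOnDeadlock(Supplier<T> work, int maxAttempts, long backoffMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return work.get();
            } catch (RuntimeException e) {
                // Give up on the last attempt or on any error that is not a deadlock.
                if (attempt >= maxAttempts || !isDeadlockVictim(e)) {
                    throw e;
                }
                sleep(backoffMillis * attempt); // simple linear backoff
            }
        }
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while backing off", ie);
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated work: fails twice as a deadlock victim, then succeeds.
        String result = retryOnDeadlock(() -> {
            if (calls[0]++ < 2) {
                throw new RuntimeException(
                        new SQLException("deadlock victim (simulated)", "40001", DEADLOCK_VICTIM_ERROR));
            }
            return "key allocated";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

The cause chain is walked because Spring wraps the driver's `SQLServerException` in an unchecked `DataAccessResourceFailureException`, as the stack trace above shows.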
My job repository configuration:
<job-repository id="jobRepository" isolation-level-for-create="READ_COMMITED" />
Database deadlock graph:
<deadlock>
<victim-list>
<victimProcess id="process2a41675a4e8" />
</victim-list>
<process-list>
<process id="process2a41675a4e8" taskpriority="0" logused="280" waitresource="RID: 6:9:24682488:29" waittime="4984" ownerId="696000712" transactionname="implicit_transaction" lasttranstarted="2021-12-29T12:18:30.153" XDES="0x29a22bc4428" lockMode="U" schedulerid="4" kpid="52760" status="suspended" spid="173" sbid="0" ecid="0" priority="0" trancount="2" lastbatchstarted="2021-12-29T12:18:30.157" lastbatchcompleted="2021-12-29T12:18:30.153" lastattention="1900-01-01T00:00:00.153" clientapp="Microsoft JDBC Driver for SQL Server" hostname="ServerName" hostpid="0" loginname="LoginName" isolationlevel="read committed (2)" xactid="696000712" currentdb="6" currentdbname="Database" lockTimeout="4294967295" clientoption1="671088672" clientoption2="128058">
<executionStack>
<frame procname="unknown" queryhash="0xadc42a7474869694" queryplanhash="0x238c4f9df8a5d6cc" line="1" stmtstart="26" stmtend="146" sqlhandle="0x020000007654041849f4ffe980c136b592ccbe8260983e220000000000000000000000000000000000000000">
unknown </frame>
<frame procname="unknown" queryhash="0xadc42a7474869694" queryplanhash="0x238c4f9df8a5d6cc" line="1" stmtend="126" sqlhandle="0x0200000045a2af306ade799ae9ffa65edc0f722c526e26330000000000000000000000000000000000000000">
unknown </frame>
</executionStack>
<inputbuf>
delete from LoginName.BATCH_STEP_EXECUTION_SEQ where ID < 10899 </inputbuf>
</process>
<process id="process2a42d680ca8" taskpriority="0" logused="420" waitresource="RID: 6:9:24682490:8" waittime="4984" ownerId="696000707" transactionname="implicit_transaction" lasttranstarted="2021-12-29T12:18:30.153" XDES="0x2a41ae18428" lockMode="U" schedulerid="7" kpid="53280" status="suspended" spid="129" sbid="0" ecid="0" priority="0" trancount="2" lastbatchstarted="2021-12-29T12:18:30.153" lastbatchcompleted="2021-12-29T12:18:30.153" lastattention="1900-01-01T00:00:00.153" clientapp="Microsoft JDBC Driver for SQL Server" hostname="ServerName" hostpid="0" loginname="LoginName" isolationlevel="read committed (2)" xactid="696000707" currentdb="6" currentdbname="Database" lockTimeout="4294967295" clientoption1="671088672" clientoption2="128058">
<executionStack>
<frame procname="unknown" queryhash="0xadc42a7474869694" queryplanhash="0x238c4f9df8a5d6cc" line="1" stmtstart="26" stmtend="146" sqlhandle="0x020000007654041849f4ffe980c136b592ccbe8260983e220000000000000000000000000000000000000000">
unknown </frame>
<frame procname="unknown" queryhash="0xadc42a7474869694" queryplanhash="0x238c4f9df8a5d6cc" line="1" stmtend="126" sqlhandle="0x02000000a0f1f51de77e1eefa19367c42fc9d1938c2075020000000000000000000000000000000000000000">
unknown </frame>
</executionStack>
<inputbuf>
delete from LoginName.BATCH_STEP_EXECUTION_SEQ where ID < 10898 </inputbuf>
</process>
</process-list>
<resource-list>
<ridlock fileid="9" pageid="24682488" dbid="6" objectname="162589bb-bc36-4834-8bdc-e58a2deca742.LoginName.BATCH_STEP_EXECUTION_SEQ" id="lock2a043bbcc00" mode="X" associatedObjectId="72057594071547904">
<owner-list>
<owner id="process2a42d680ca8" mode="X" />
</owner-list>
<waiter-list>
<waiter id="process2a41675a4e8" mode="U" requestType="wait" />
</waiter-list>
</ridlock>
<ridlock fileid="9" pageid="24682490" dbid="6" objectname="162589bb-bc36-4834-8bdc-e58a2deca742.LoginName.BATCH_STEP_EXECUTION_SEQ" id="lock29f5f1b7f00" mode="X" associatedObjectId="72057594071547904">
<owner-list>
<owner id="process2a41675a4e8" mode="X" />
</owner-list>
<waiter-list>
<waiter id="process2a42d680ca8" mode="U" requestType="wait" />
</waiter-list>
</ridlock>
</resource-list>
</deadlock>
I have tried:
<job-repository id="jobRepository" isolation-level-for-create="READ_UNCOMMITED" />
<job-repository id="jobRepository"
isolation-level-for-create="ISOLATION_REPEATABLE_READ" />
<job-repository id="jobRepository"
isolation-level-for-create="SERIALIZABLE" />
I created the sequence tables as shown below:
CREATE TABLE BATCH_STEP_EXECUTION_SEQ (
ID BIGINT IDENTITY(<last Oracle sequence value>, 1)
);
CREATE TABLE BATCH_JOB_EXECUTION_SEQ (
ID BIGINT IDENTITY(<last Oracle sequence value>, 1)
);
CREATE TABLE BATCH_JOB_SEQ (
ID BIGINT IDENTITY(<last Oracle sequence value>, 1)
);
What is the problem, and how can I fix it?
Update: job definition
<bean id="simpleStep" class="org.springframework.batch.core.step.factory.SimpleStepFactoryBean"
abstract="true">
<property name="transactionManager" ref="transactionManager" />
<property name="jobRepository" ref="jobRepository" />
<property name="startLimit" value="100" />
<property name="commitInterval" value="1" />
</bean>
Update #2: Could I try something like this?
<bean id="informixIncrementer" class="com.bah.batch.informixsupport.InformixMaxValueIncrementerFactory">
<property name="dataSource" ref="dataSource" />
</bean>
<bean id="jobRepository" class="org.springframework.batch.core.repository.support.JobRepositoryFactoryBean" isolation-level-for-create="READ_COMMITTED" table-prefix="BATCH_">
<property name="incrementerFactory" ref="informixIncrementer"/>
</bean>
[Comments]:
-
@mahmoud-ben-hassine - do you have any suggestions?
-
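I can't tell you what the problem is in your case, but deadlocks like this usually happen because two parallel processes try to access resources in different orders, so each ends up waiting for a resource the other holds. For example, suppose one process tries to handle rows A, B, C, D in that order, while the other, for whatever reason, tries D, B, A, C. The first process has now locked A, B and C and is trying to lock D. But the other process has already locked D and is waiting for the first one to finish and release its lock on B, so now you have a deadlock.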
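The lock-ordering idea in this comment can be sketched in plain Java. This is an illustration of the general technique only, not of what SQL Server or Spring Batch does internally: the `Row` class, `withRowsLocked` and `runTwoWorkers` are hypothetical names, and in-memory `ReentrantLock`s stand in for row locks. Two workers request the same two rows in opposite orders, but the helper always acquires the locks in ascending id order, so neither can hold one lock while waiting for the other.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

/** Sketch: avoid deadlock by always acquiring locks in one global order. */
final class OrderedLockingSketch {

    /** A lockable resource with a stable id, standing in for a table row. */
    static final class Row {
        final int id;
        final ReentrantLock lock = new ReentrantLock();

        Row(int id) {
            this.id = id;
        }
    }

    /** Locks the rows in ascending id order, whatever order was requested, then runs the work. */
    static void withRowsLocked(List<Row> requested, Runnable work) {
        List<Row> ordered = new ArrayList<>(requested);
        ordered.sort(Comparator.comparingInt((Row r) -> r.id));
        int acquired = 0;
        try {
            for (Row row : ordered) {
                row.lock.lock();
                acquired++;
            }
            work.run();
        } finally {
            // Release only what was actually acquired, in reverse order.
            for (int i = acquired - 1; i >= 0; i--) {
                ordered.get(i).lock.unlock();
            }
        }
    }

    private static Runnable worker(List<Row> requestOrder, int iterations, AtomicInteger finished) {
        return () -> {
            for (int i = 0; i < iterations; i++) {
                withRowsLocked(requestOrder, () -> { });
            }
            finished.incrementAndGet();
        };
    }

    /** Runs two workers that request the rows in opposite orders; returns how many finished. */
    static int runTwoWorkers(int iterations) {
        Row a = new Row(1);
        Row b = new Row(2);
        AtomicInteger finished = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.execute(worker(List.of(a, b), iterations, finished));
        pool.execute(worker(List.of(b, a), iterations, finished));
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return finished.get();
    }

    public static void main(String[] args) {
        System.out.println("workers finished: " + runTwoWorkers(10_000));
    }
}
```

If `withRowsLocked` took the locks in the requested order instead of the sorted order, the two workers could each grab one row and wait forever for the other, which is the situation the comment describes.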
-
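In your case, the problem seems to be that you have two processes trying to delete multiple rows from LoginName.BATCH_STEP_EXECUTION_SEQ, and if the deletes touch the rows in different orders, that can lead to a deadlock. So either try to impose an order on the delete (for example, by using a subquery that returns an ordered list of the ids to delete), or run the deletes one at a time (for example, by locking the table, if that is possible).
-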
Shouldn't these tables have an index on ID? Otherwise every delete will perform a full table scan.
-
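Note that you can see in the deadlock graph that one delete holds an X lock on a row in page 24682488 and waits for a U lock on a row in page 24682490, while the other process holds an X lock on the row in page 24682490 and wants a U lock on the row in page 24682488.
Tags: java sql-server spring spring-batch azure-sql-server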
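To check who holds and who waits without reading the XML by eye, the `resource-list` part of the deadlock graph can be walked with the JDK's DOM parser. The sketch below is a hypothetical helper (`DeadlockGraphSketch` and `waitEdges` are made-up names). It embeds a trimmed copy of the two `ridlock` elements from the question, keeping only the `pageid`, `id` and `mode` attributes, and prints one line per waiter.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Sketch: list "who waits for whom" from a SQL Server deadlock graph. */
final class DeadlockGraphSketch {

    /** Trimmed copy of the resource-list from the question (attributes reduced). */
    static final String RESOURCE_LIST =
            "<resource-list>"
            + "<ridlock pageid=\"24682488\" mode=\"X\">"
            + "<owner-list><owner id=\"process2a42d680ca8\" mode=\"X\"/></owner-list>"
            + "<waiter-list><waiter id=\"process2a41675a4e8\" mode=\"U\"/></waiter-list>"
            + "</ridlock>"
            + "<ridlock pageid=\"24682490\" mode=\"X\">"
            + "<owner-list><owner id=\"process2a41675a4e8\" mode=\"X\"/></owner-list>"
            + "<waiter-list><waiter id=\"process2a42d680ca8\" mode=\"U\"/></waiter-list>"
            + "</ridlock>"
            + "</resource-list>";

    /** Returns one description per (waiter, owner) pair found under each ridlock element. */
    static List<String> waitEdges(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<String> edges = new ArrayList<>();
            NodeList locks = doc.getElementsByTagName("ridlock");
            for (int i = 0; i < locks.getLength(); i++) {
                Element lock = (Element) locks.item(i);
                NodeList owners = lock.getElementsByTagName("owner");
                NodeList waiters = lock.getElementsByTagName("waiter");
                for (int w = 0; w < waiters.getLength(); w++) {
                    Element waiter = (Element) waiters.item(w);
                    for (int o = 0; o < owners.getLength(); o++) {
                        Element owner = (Element) owners.item(o);
                        edges.add(waiter.getAttribute("id")
                                + " waits for " + waiter.getAttribute("mode")
                                + " on a row in page " + lock.getAttribute("pageid")
                                + " held (" + owner.getAttribute("mode") + ") by "
                                + owner.getAttribute("id"));
                    }
                }
            }
            return edges;
        } catch (Exception e) {
            // Parser configuration, I/O and SAX errors all mean the input was unusable.
            throw new IllegalStateException("could not parse deadlock graph", e);
        }
    }

    public static void main(String[] args) {
        waitEdges(RESOURCE_LIST).forEach(System.out::println);
    }
}
```

For the graph in the question this yields two lines with the same two process ids in swapped roles, which is the cycle that makes it a deadlock.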