【Question Title】: When is Java loop predication optimization triggered by the C2 JIT compiler?
【Posted】: 2026-01-10 18:35:01
【Question】:

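I am trying to understand the native code generated from a Java loop. The native code should be optimized by the C2 compiler, but in my simple example some optimizations seem to be missing.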

This is the Java method I wrote, based on the minimal example at https://wiki.openjdk.java.net/display/HotSpot/LoopPredication:

104    public static byte[] myLoop(int init, int limit, int stride, int scale, int offset, byte value, byte[] array) {
105        for (int i = init; i < limit; i += stride) {
106            array[scale * i + offset] = value;
107        }
108        return array;
109    }

These are the arguments passed to the Java 8 HotSpot VM to force C2 compilation:

-server
-XX:-TieredCompilation
-XX:CompileThreshold=5
-XX:+UnlockDiagnosticVMOptions 
-XX:+PrintAssembly
-XX:-UseCompressedOops
-XX:+LogCompilation
-XX:+TraceClassLoading
-XX:+UseLoopPredicate
-XX:+RangeCheckElimination

This is the amd64 native code generated by C2 ('myLoop' is called at least 10000 times):

  # {method} {0x00007fcb5088ef38} 'myLoop' '(IIIIIB[B)[B' in 'MyClass'                                                                                                                                                                                                                                                                                      
  # parm0:    rsi       = int
  # parm1:    rdx       = int
  # parm2:    rcx       = int
  # parm3:    r8        = int
  # parm4:    r9        = int
  # parm5:    rdi       = byte
  # parm6:    [sp+0x40]   = '[B'  (sp of caller)
  0x00007fcd44ee9fe0: mov     %eax,0xfffffffffffec000(%rsp)
  0x00007fcd44ee9fe7: push    %rbp
  0x00007fcd44ee9fe8: sub     $0x30,%rsp        ;*synchronization entry
                                                ; - MyClass::myLoop@-1 (line 105)

  0x00007fcd44ee9fec: cmp     %edx,%esi
  0x00007fcd44ee9fee: jnl     0x7fcd44eea04a    ;*if_icmplt
                                                ; - MyClass::myLoop@27 (line 105)

  0x00007fcd44ee9ff0: mov     0x40(%rsp),%rax
  0x00007fcd44ee9ff5: mov     0x10(%rax),%r10d  ;*bastore
                                                ; - MyClass::myLoop@17 (line 106)
                                                ; implicit exception: dispatches to 0x00007fcd44eea051
  0x00007fcd44ee9ff9: nopl    0x0(%rax)         ;*aload
                                                ; - MyClass::myLoop@6 (line 106)

  0x00007fcd44eea000: mov     %esi,%ebx
  0x00007fcd44eea002: imull   %r8d,%ebx
  0x00007fcd44eea006: add     %r9d,%ebx         ;*iadd
                                                ; - MyClass::myLoop@14 (line 106)

  0x00007fcd44eea009: cmp     %r10d,%ebx
  0x00007fcd44eea00c: jnb     0x7fcd44eea02e    ;*bastore
                                                ; - MyClass::myLoop@17 (line 106)

  0x00007fcd44eea00e: add     %ecx,%esi         ;*iadd
                                                ; - MyClass::myLoop@21 (line 105)

  0x00007fcd44eea010: movsxd  %ebx,%r11
  0x00007fcd44eea013: mov     %dil,0x18(%rax,%r11)  ; OopMap{rax=Oop off=56}
                                                ;*if_icmplt
                                                ; - MyClass::myLoop@27 (line 105)

  0x00007fcd44eea018: test    %eax,0xa025fe2(%rip)  ;   {poll}
  0x00007fcd44eea01e: cmp     %edx,%esi
  0x00007fcd44eea020: jl      0x7fcd44eea000    ;*synchronization entry
                                                ; - MyClass::myLoop@-1 (line 105)

  0x00007fcd44eea022: add     $0x30,%rsp
  0x00007fcd44eea026: pop     %rbp
  0x00007fcd44eea027: test    %eax,0xa025fd3(%rip)  ;   {poll_return}
  0x00007fcd44eea02d: retq
  0x00007fcd44eea02e: movabs  $0x7fcca3c810a8,%rsi  ;   {oop(a 'java/lang/ArrayIndexOutOfBoundsException')}
  0x00007fcd44eea038: movq    $0x0,0x18(%rsi)   ;*bastore
                                                ; - MyClass::myLoop@17 (line 106)

  0x00007fcd44eea040: add     $0x30,%rsp
  0x00007fcd44eea044: pop     %rbp
  0x00007fcd44eea045: jmpq    0x7fcd44e529a0    ;   {runtime_call}
  0x00007fcd44eea04a: mov     0x40(%rsp),%rax
  0x00007fcd44eea04f: jmp     0x7fcd44eea022
  0x00007fcd44eea051: mov     %edx,%ebp
  0x00007fcd44eea053: mov     %ecx,0x40(%rsp)
  0x00007fcd44eea057: mov     %r8d,0x44(%rsp)
  0x00007fcd44eea05c: mov     %r9d,(%rsp)
  0x00007fcd44eea060: mov     %edi,0x4(%rsp)
  0x00007fcd44eea064: mov     %rax,0x8(%rsp)
  0x00007fcd44eea069: mov     %esi,0x10(%rsp)
  0x00007fcd44eea06d: mov     $0xffffff86,%esi
  0x00007fcd44eea072: nop
  0x00007fcd44eea073: callq   0x7fcd44dea1a0    ; OopMap{[8]=Oop off=152}
                                                ;*aload
                                                ; - MyClass::myLoop@6 (line 106)
                                                ;   {runtime_call}
  0x00007fcd44eea078: callq   0x7fcd4dc47c50    ;*aload
                                                ; - MyClass::myLoop@6 (line 106)
                                                ;   {runtime_call}
  0x00007fcd44eea07d: hlt
  0x00007fcd44eea07e: hlt
  0x00007fcd44eea07f: hlt

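According to https://wiki.openjdk.java.net/display/HotSpot/LoopPredication, an optimization called "array range check elimination" removes the array range check inside the loop and adds a loop predicate before the loop instead. It seems that C2 has not applied this optimization to 'myLoop'. The loop's backward jump is at 0x7fcd44eea020 and jumps back to 0x7fcd44eea000. Inside the loop, the range check is still performed at 0x7fcd44eea009-0x7fcd44eea00c.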

  1. Why is the check still inside the loop?
  2. Why was the loop predication optimization not applied?
  3. How can I force all optimizations to run?

【Discussion】:

    Tags: arrays loops optimization jvm jit


    【Solution 1】:

    The explanation is on the same page:

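    "From the above example, the requirements to perform loop predication for array range check elimination are that init, limit, offset and array a are loop invariants, and stride and scale are compile time constants."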

    In your example, scale and stride are not compile-time constants, so the optimization is not applied.
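To make the requirement concrete, here is a rough source-level sketch of what a hoisted range check looks like when stride and scale are constants. It is only an approximation: C2 performs the transformation on its intermediate representation and deoptimizes when the predicate fails, whereas this sketch falls back to the ordinary checked loop. The class name, the constants `STRIDE` and `SCALE`, and both method names are hypothetical and chosen purely for illustration.

```java
final class LoopPredicationSketch {
    // Assumption: stride and scale are fixed, mirroring the wiki's requirement
    // that they be compile-time constants.
    static final int STRIDE = 1;
    static final int SCALE = 2;

    // Ordinary loop: each store is bounds-checked individually.
    static byte[] checkedLoop(int init, int limit, int offset, byte value, byte[] array) {
        for (int i = init; i < limit; i += STRIDE) {
            array[SCALE * i + offset] = value;
        }
        return array;
    }

    // Hand-written approximation of the predicated form: one check before the
    // loop covers the smallest and largest index the loop can touch.
    static byte[] predicatedLoop(int init, int limit, int offset, byte value, byte[] array) {
        if (init >= limit) {
            return array; // loop body never runs
        }
        // With STRIDE == 1 the last value of i is limit - 1; SCALE > 0 makes
        // the index grow monotonically, so checking both ends is sufficient.
        long first = (long) SCALE * init + offset;
        long last = (long) SCALE * (limit - 1) + offset;
        if (first < 0 || last >= array.length) {
            // Predicate failed: C2 would deoptimize here. The sketch simply
            // runs the checked loop, which throws at the offending iteration.
            return checkedLoop(init, limit, offset, value, array);
        }
        for (int i = init; i < limit; i += STRIDE) {
            array[SCALE * i + offset] = value; // conceptually check-free
        }
        return array;
    }

    public static void main(String[] args) {
        byte[] result = predicatedLoop(0, 3, 1, (byte) 7, new byte[10]);
        System.out.println(java.util.Arrays.toString(result));
    }
}
```

If stride or scale were runtime variables, as in 'myLoop', the sign and step of the index sequence would be unknown, so a two-ended check like the one above could not be derived in this simple form.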

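    However, if you call this method with constant arguments, HotSpot will be able to eliminate the range check thanks to inlining and constant propagation.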
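A minimal sketch of such a call site, assuming a hypothetical wrapper `fillEveryOther` that passes literal values for stride and scale. Whether C2 actually inlines 'myLoop' here and drops the in-loop check depends on the JVM version and inlining heuristics, so the result should be confirmed with -XX:+PrintAssembly rather than taken for granted.

```java
final class ConstantCallSketch {
    // Same loop as in the question.
    static byte[] myLoop(int init, int limit, int stride, int scale, int offset,
                         byte value, byte[] array) {
        for (int i = init; i < limit; i += stride) {
            array[scale * i + offset] = value;
        }
        return array;
    }

    // Hypothetical caller: stride (1) and scale (2) are literals, so after
    // inlining they can become compile-time constants inside the loop.
    static byte[] fillEveryOther(byte value, byte[] array) {
        return myLoop(0, array.length / 2, 1, 2, 0, value, array);
    }

    public static void main(String[] args) {
        byte[] buf = new byte[64];
        // Call the wrapper often enough for it to become a compilation candidate.
        for (int n = 0; n < 20_000; n++) {
            fillEveryOther((byte) 1, buf);
        }
        System.out.println(buf[0] + " " + buf[1]);
    }
}
```

The method to inspect in the disassembly is then `fillEveryOther`, since the standalone compiled version of 'myLoop' still receives stride and scale as variables.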

    【Comments】:

    • Also, you don't need to specify the UseLoopPredicate and RangeCheckElimination flags; they are enabled by default.
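One way to check the effective values of these flags on a given JVM is to query them at runtime through `com.sun.management.HotSpotDiagnosticMXBean`. The sketch below is a small helper under that assumption; the class and method names are hypothetical, and the flags may be absent on JVMs built without C2, which the code reports instead of failing.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;

final class FlagDefaultsSketch {
    // Returns a one-line description of a VM flag's current value and origin.
    static String describe(String flagName) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return flagName + ": HotSpot diagnostic bean not available";
            }
            VMOption option = bean.getVMOption(flagName);
            return flagName + " = " + option.getValue()
                    + " (origin: " + option.getOrigin() + ")";
        } catch (IllegalArgumentException e) {
            // Thrown when the flag does not exist in this JVM build.
            return flagName + ": not available in this JVM";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("UseLoopPredicate"));
        System.out.println(describe("RangeCheckElimination"));
    }
}
```

An origin of DEFAULT indicates the value was not set on the command line.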