Does the Java optimizer remember calculated values?

2020-08-15 • Java

Which fizzbuzz implementation is more efficient?

public static void fizzBuzz1(int n)
    {
        boolean fizzed,buzzed;
        for(int i = 1; i <= n; i++)
        {
            fizzed = buzzed = false;
            if(i % 3 == 0)
                fizzed = true;
            if(i % 5 == 0)
                buzzed = true;
            if(fizzed  && buzzed)
                System.out.println("FizzBuzz");
            else if(fizzed)
                System.out.println("Fizz");
            else if(buzzed)
                System.out.println("Buzz");
            else
                System.out.println(i);
        }
    }

    public static void fizzBuzz2(int n)
    {
        for(int i = 1; i <= n; i++)
        {
            if(i % 3 == 0  && i % 5 == 0)
                System.out.println("FizzBuzz");
            else if(i % 3 == 0)
                System.out.println("Fizz");
            else if(i % 5 == 0)
                System.out.println("Buzz");
            else
                System.out.println(i);
        }
    }

Solution

This time, some interesting turns were taken

First, I'll try to look at the assembly generated using the original method However, JIT has some inlining and optimization, including system out. Println call, so the generated assembly output is too large (for me) to be reasonably analyzed in a reasonable time

So I simplified the whole process so that I could focus on practical problems Finally, I ran the following program:

class Test04
{
    public static void main(String args[])
    {
        long sum = 0;
        for (int i=1000; i<12000; i++)
        {
            sum += fizzBuzz1(i);
            sum += fizzBuzz2(i);
        }
        System.out.println(sum);
    }

    public static long fizzBuzz1(int n)
    {
        long sum = 0;
        for(int i = 1; i <= n; i++)
        {
            sum += fizzBuzzCore1(i);
        }
        return sum;
    }

    public static long fizzBuzzCore1(int i)
    {
        boolean fizzed = false;
        boolean buzzed = false;
        if(i % 3 == 0)
            fizzed = true;
        if(i % 5 == 0)
            buzzed = true;
        if(fizzed  && buzzed)
            return 4;
        else if(fizzed)
            return 3;
        else if(buzzed)
            return 2;
        else
            return 1;
    }


    public static long fizzBuzz2(int n)
    {
        long sum = 0;
        for(int i = 1; i <= n; i++)
        {
            sum += fizzBuzzCore2(i);
        }
        return sum;
    }

    public static long fizzBuzzCore2(int i)
    {
        if(i % 3 == 0  && i % 5 == 0)
            return 4;
        else if(i % 3 == 0)
            return 3;
        else if(i % 5 == 0)
            return 2;
        else
            return 1;
    }

}

The return value is designed to prevent him from completely optimizing the call and to extract the "core" method designed to keep the size of the component output that must be compared as small as possible

(Note: of course, these modifications may affect optimization. For example, a JIT method may have a limit on the number of bytecode instructions. Before it is considered too large to be inlined, - XX: maxinlinesize = 35, but the effects of the two methods are roughly the same, so the required information about practical problems can still be obtained

Moreover, it is not a surprise: after the last optimization, the assembly code of both methods contains the same instructions - here is the assembly of fizzbuzzcore1 for reference:

Decoding compiled method 0x00000000026c0090:
Code:
[Entry Point]
[Verified Entry Point]
[Constants]
  # {method} {0x0000000057260528} &apos;fizzBuzzCore1&apos; &apos;(I)J&apos; in &apos;Test04&apos;
  # parm0:    rdx       = int
  #           [sp+0x20]  (sp of caller)
  0x00000000026c01c0: sub    $0x18,%rsp
  0x00000000026c01c7: mov    %rbp,0x10(%rsp)    ;*synchronization entry
                                                ; - Test04::fizzBuzzCore1@-1 (line 27)

  0x00000000026c01cc: movslq %edx,%r10
  0x00000000026c01cf: mov    %edx,%r11d
  0x00000000026c01d2: sar    $0x1f,%r11d        ;*irem
                                                ; - Test04::fizzBuzzCore1@6 (line 29)

  0x00000000026c01d6: imul   $0x66666667,%r10,%r8
  0x00000000026c01dd: imul   $0x55555556,%r10
  0x00000000026c01e4: sar    $0x21,%r8
  0x00000000026c01e8: sar    $0x20,%r10
  0x00000000026c01ec: mov    %r8d,%r8d
  0x00000000026c01ef: sub    %r11d,%r8d         ;*irem
                                                ; - Test04::fizzBuzzCore1@14 (line 31)

  0x00000000026c01f2: mov    %r10d,%r10d
  0x00000000026c01f5: sub    %r11d,%r10d        ;*irem
                                                ; - Test04::fizzBuzzCore1@6 (line 29)

  0x00000000026c01f8: mov    %r8d,%r11d
  0x00000000026c01fb: shl    $0x2,%r11d
  0x00000000026c01ff: add    %r8d,%r11d         ;*irem
                                                ; - Test04::fizzBuzzCore1@14 (line 31)

  0x00000000026c0202: mov    %r10d,%r9d
  0x00000000026c0205: shl    %r9d
  0x00000000026c0208: add    %r10d,%r9d         ;*irem
                                                ; - Test04::fizzBuzzCore1@6 (line 29)

  0x00000000026c020b: cmp    %r9d,%edx
  0x00000000026c020e: jne    0x00000000026c021c  ;*ifeq
                                                ; - Test04::fizzBuzzCore1@21 (line 33)

  0x00000000026c0210: cmp    %r11d,%edx
  0x00000000026c0213: jne    0x00000000026c021c  ;*ifeq
                                                ; - Test04::fizzBuzzCore1@25 (line 33)

  0x00000000026c0215: mov    $0x4,%eax
  0x00000000026c021a: jmp    0x00000000026c0239  ;*iload_1
                                                ; - Test04::fizzBuzzCore1@32 (line 35)

  0x00000000026c021c: cmp    %r9d,%edx
  0x00000000026c021f: jne    0x00000000026c0228  ;*ifeq
                                                ; - Test04::fizzBuzzCore1@33 (line 35)

  0x00000000026c0221: mov    $0x3,%eax
  0x00000000026c0226: jmp    0x00000000026c0239
  0x00000000026c0228: cmp    %r11d,%edx
  0x00000000026c022b: jne    0x00000000026c0234  ;*ifeq
                                                ; - Test04::fizzBuzzCore1@41 (line 37)

  0x00000000026c022d: mov    $0x2,%eax
  0x00000000026c0232: jmp    0x00000000026c0239
  0x00000000026c0234: mov    $0x1,%eax          ;*irem
                                                ; - Test04::fizzBuzzCore1@6 (line 29)

  0x00000000026c0239: add    $0x10,%rsp
  0x00000000026c023d: pop    %rbp
  0x00000000026c023e: test   %eax,-0x2470244(%rip)        # 0x0000000000250000
                                                ;   {poll_return}
  0x00000000026c0244: retq   
  0x00000000026c0245: hlt    
  0x00000000026c0246: hlt    
  0x00000000026c0247: hlt    
  0x00000000026c0248: hlt    
  0x00000000026c0249: hlt    
  0x00000000026c024a: hlt    
  0x00000000026c024b: hlt    
  0x00000000026c024c: hlt    
  0x00000000026c024d: hlt    
  0x00000000026c024e: hlt    
  0x00000000026c024f: hlt    
  0x00000000026c0250: hlt    
  0x00000000026c0251: hlt    
  0x00000000026c0252: hlt    
  0x00000000026c0253: hlt    
  0x00000000026c0254: hlt    
  0x00000000026c0255: hlt    
  0x00000000026c0256: hlt    
  0x00000000026c0257: hlt    
  0x00000000026c0258: hlt    
  0x00000000026c0259: hlt    
  0x00000000026c025a: hlt    
  0x00000000026c025b: hlt    
  0x00000000026c025c: hlt    
  0x00000000026c025d: hlt    
  0x00000000026c025e: hlt    
  0x00000000026c025f: hlt    
[Exception Handler]
[Stub Code]
  0x00000000026c0260: jmpq   0x000000000261c560  ;   {no_reloc}
[Deopt Handler Code]
  0x00000000026c0265: callq  0x00000000026c026a
  0x00000000026c026a: subq   $0x5,(%rsp)
  0x00000000026c026f: jmpq   0x00000000025f6f40  ;   {runtime_call}
  0x00000000026c0274: hlt    
  0x00000000026c0275: hlt    
  0x00000000026c0276: hlt    
  0x00000000026c0277: hlt

But

However, it is surprising that it does not calculate modular operations at all!

At least, there is no clear explanation: there is no IDIV instruction in this code! So JIT really tries to avoid costly differences by doing some annoying, annoying and bitter hacker attacks: instructions

0x00000000026c01d6: imul   $0x66666667,%r10
  (and following...)

It is a "no division of labor" implementation of a department For example, this method

private static int divideBy3(int n)
{
    long r10 = n;
    r10 *= 0x55555556L;
    r10 >>>= 0x20;
    long r10d = r10 & 0xFFFFFFFFL;
    return (int)r10d;
}

Use these magic constants and shift to calculate divide by 3 (similarly, for 5 divided by another constant) I haven't done mathematical calculations myself, but I can find an explanation on how to get modular operations from the "integer division by constants" document from hacker's right on page 32 of the "integer division by constants"

The content of this article comes from the network collection of netizens. It is used as a learning reference. The copyright belongs to the original author.

THE END

Java

二维码

Java – if I increase the memory heap, it will increase GC time

< <上一篇

Multiple classes and main methods in Java, as well as packages and namespaces

下一篇>>

搜索内容

Does the Java optimizer remember calculated values?

Solution

热门文章