Assertion failure at omr/compiler/x/codegen/OMRCodeGenerator.cpp:2122: cursorInstruction->getEstimatedBinaryLength() >= self()->getBinaryBufferCursor() - instructionStart when PROD_WITH_ASSUMES is enabled #17442

Closed
dylanjtuttle opened this issue May 19, 2023 · 14 comments · Fixed by eclipse-omr/omr#7056

Comments

@dylanjtuttle
Contributor

The assertion at

/home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/omr/compiler/x/codegen/OMRCodeGenerator.cpp:2122: cursorInstruction->getEstimatedBinaryLength() >= self()->getBinaryBufferCursor() - instructionStart

fires when building OpenJDK 11 on x86-64_linux with the -DPROD_WITH_ASSUMES flag enabled in openj9/runtime/compiler/CMakeLists.txt.

Note that this assertion fires during the build process rather than while running tests, unlike the other failures under PROD_WITH_ASSUMES discovered last year.

Link to the Jenkins Build.

Stack trace:

12:02:44  Compiling 4 files for BUILD_JIGSAW_TOOLS
12:02:45  Assertion failed at /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/omr/compiler/x/codegen/OMRCodeGenerator.cpp:2122: cursorInstruction->getEstimatedBinaryLength() >= self()->getBinaryBufferCursor() - instructionStart
12:02:45  VMState: 0x0005ff09
12:02:45  	Instruction length estimate must be conservatively large (instr=(unknown), opcode=(unknown), estimate=17, actual=18
12:02:45  compiling sun/nio/fs/UnixFileSystemProvider.newDirectoryStream(Ljava/nio/file/Path;Ljava/nio/file/DirectoryStream$Filter;)Ljava/nio/file/DirectoryStream; at level: warm
12:02:45  #0: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9jit29.so(+0x99c2a5) [0x146876a9a2a5]
12:02:45  #1: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9jit29.so(+0x9a99f0) [0x146876aa79f0]
12:02:45  #2: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9jit29.so(+0x62666e) [0x14687672466e]
12:02:45  #3: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9jit29.so(+0x6269d3) [0x1468767249d3]
12:02:45  #4: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9jit29.so(+0xa64bf0) [0x146876b62bf0]
12:02:45  #5: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9jit29.so(+0x5a7c5a) [0x1468766a5c5a]
12:02:45  #6: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9jit29.so(+0x5a9539) [0x1468766a7539]
12:02:45  #7: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9jit29.so(+0x5a5f33) [0x1468766a3f33]
12:02:45  #8: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9jit29.so(+0x5d63f9) [0x1468766d43f9]
12:02:45  #9: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9jit29.so(+0x1555e2) [0x1468762535e2]
12:02:45  #10: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9jit29.so(+0x156839) [0x146876254839]
12:02:45  #11: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9prt29.so(+0x2b8b3) [0x146877d9b8b3]
12:02:45  #12: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9jit29.so(+0x153b96) [0x146876251b96]
12:02:45  #13: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9jit29.so(+0x154203) [0x146876252203]
12:02:45  #14: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9jit29.so(+0x152c2c) [0x146876250c2c]
12:02:45  #15: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9jit29.so(+0x1531d8) [0x1468762511d8]
12:02:45  #16: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9jit29.so(+0x153272) [0x146876251272]
12:02:45  #17: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9prt29.so(+0x2b8b3) [0x146877d9b8b3]
12:02:45  #18: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9jit29.so(+0x15369f) [0x14687625169f]
12:02:45  #19: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9thr29.so(+0xe4f6) [0x146877b634f6]
12:02:45  #20: /lib64/libpthread.so.0(+0x7aa1) [0x14687dcf9aa1]
12:02:45  #21: /lib64/libc.so.6(clone+0x6d) [0x14687d631c4d]
12:02:45  
12:02:45  JIT: crashed while compiling sun/nio/fs/UnixFileSystemProvider.newDirectoryStream(Ljava/nio/file/Path;Ljava/nio/file/DirectoryStream$Filter;)Ljava/nio/file/DirectoryStream; (recoverable 0)
12:02:45  #0: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9jit29.so(+0x99c2a5) [0x146876a9a2a5]
12:02:45  #1: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9jit29.so(+0x9a99f0) [0x146876aa79f0]
12:02:45  #2: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9jit29.so(+0x13e8c1) [0x14687623c8c1]
12:02:45  #3: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9prt29.so(+0x2ad7a) [0x146877d9ad7a]
12:02:45  #4: /lib64/libpthread.so.0(+0xf7e0) [0x14687dd017e0]
12:02:45  #5: /lib64/libpthread.so.0(raise+0x2b) [0x14687dd016ab]
12:02:45  #6: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9jit29.so(+0x6267a7) [0x1468767247a7]
12:02:45  #7: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9jit29.so(+0x6269d8) [0x1468767249d8]
12:02:45  #8: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9jit29.so(+0xa64bf0) [0x146876b62bf0]
12:02:45  #9: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9jit29.so(+0x5a7c5a) [0x1468766a5c5a]
12:02:45  #10: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9jit29.so(+0x5a9539) [0x1468766a7539]
12:02:45  #11: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9jit29.so(+0x5a5f33) [0x1468766a3f33]
12:02:45  #12: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9jit29.so(+0x5d63f9) [0x1468766d43f9]
12:02:45  #13: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9jit29.so(+0x1555e2) [0x1468762535e2]
12:02:45  #14: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9jit29.so(+0x156839) [0x146876254839]
12:02:45  #15: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9prt29.so(+0x2b8b3) [0x146877d9b8b3]
12:02:45  #16: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9jit29.so(+0x153b96) [0x146876251b96]
12:02:45  #17: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9jit29.so(+0x154203) [0x146876252203]
12:02:45  #18: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9jit29.so(+0x152c2c) [0x146876250c2c]
12:02:45  #19: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9jit29.so(+0x1531d8) [0x1468762511d8]
12:02:45  #20: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9jit29.so(+0x153272) [0x146876251272]
12:02:45  #21: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9prt29.so(+0x2b8b3) [0x146877d9b8b3]
12:02:45  #22: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9jit29.so(+0x15369f) [0x14687625169f]
12:02:45  #23: /home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/build/linux-x86_64-normal-server-release/jdk/lib/default/libj9thr29.so(+0xe4f6) [0x146877b634f6]
12:02:45  #24: /lib64/libpthread.so.0(+0x7aa1) [0x14687dcf9aa1]
12:02:45  #25: /lib64/libc.so.6(clone+0x6d) [0x14687d631c4d]
12:02:45  Unhandled exception
12:02:45  Type=Unhandled trap vmState=0x0005ff09
12:02:45  J9Generic_Signal_Number=00000108 Signal_Number=00000005 Error_Value=00000000 Signal_Code=fffffffa
12:02:45  Handler1=000014687C07FDA0 Handler2=0000146877D9AB50
12:02:45  RDI=000000000001482C RSI=0000000000014833 RAX=0000000000000000 RBX=0000146876CEA288
12:02:45  RCX=000014687DD016AB RDX=0000000000000005 R8=0000146876CEB4E0 R9=00001468744BB700
12:02:45  R10=0000000000015210 R11=0000000000000202 R12=0000146876CEB4E0 R13=0000146876CEB478
12:02:45  R14=0000146876C1E80C R15=00001468733C1800
12:02:45  RIP=000014687DD016AB GS=0000 FS=0000 RSP=00001468744B33A8
12:02:45  EFlags=0000000000000202 CS=0033 RBP=000000000000084A ERR=0000000000000000
12:02:45  TRAPNO=0000000000000000 OLDMASK=0000000000000000 CR2=0000000000000000
12:02:45  xmm0 0000000000000000 (f: 0.000000, d: 0.000000e+00)
12:02:45  xmm1 00000000ff000000 (f: 4278190080.000000, d: 2.113707e-314)
12:02:45  xmm2 0000000000000000 (f: 0.000000, d: 0.000000e+00)
12:02:45  xmm3 0000000000000000 (f: 0.000000, d: 0.000000e+00)
12:02:45  xmm4 0000000000000000 (f: 0.000000, d: 0.000000e+00)
12:02:45  xmm5 3ff0000000000000 (f: 0.000000, d: 1.000000e+00)
12:02:45  xmm6 fffffffcfffffffc (f: 4294967296.000000, d: -nan)
12:02:45  xmm7 0000002000000000 (f: 0.000000, d: 6.790387e-313)
12:02:45  xmm8 ffffffff00000000 (f: 0.000000, d: -nan)
12:02:45  xmm9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
12:02:45  xmm10 ff000000000000ff (f: 255.000000, d: -5.486124e+303)
12:02:45  xmm11 3d2ef35793c76730 (f: 2479318784.000000, d: 5.497923e-14)
12:02:45  xmm12 be10000000000000 (f: 0.000000, d: -9.313226e-10)
12:02:45  xmm13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
12:02:45  xmm14 3fe62e42fefa3800 (f: 4277811200.000000, d: 6.931472e-01)
12:02:45  xmm15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
12:02:45  Module=/lib64/libpthread.so.0
12:02:45  Module_base_address=000014687DCF2000 Symbol=raise
12:02:45  Symbol_address=000014687DD01680
12:02:45  
12:02:45  Method_being_compiled=sun/nio/fs/UnixFileSystemProvider.newDirectoryStream(Ljava/nio/file/Path;Ljava/nio/file/DirectoryStream$Filter;)Ljava/nio/file/DirectoryStream;
12:02:45  Target=2_90_20230519_5292 (Linux 5.4.0-148-generic)
12:02:45  CPU=amd64 (4 logical CPUs) (0x1eca21000 RAM)
12:02:45  ----------- Stack Backtrace -----------
12:02:46  raise+0x2b (0x000014687DD016AB [libpthread.so.0+0xf6ab])
12:02:46  _ZN2TR4trapEv+0x47 (0x00001468767247A7 [libj9jit29.so+0x6267a7])
12:02:46  _ZN2TR9assertionEPKciS1_S1_z+0xc8 (0x00001468767249D8 [libj9jit29.so+0x6269d8])
12:02:46  _ZN3OMR3X8613CodeGenerator16doBinaryEncodingEv+0xa10 (0x0000146876B62BF0 [libj9jit29.so+0xa64bf0])
12:02:46  _ZN3OMR12CodeGenPhase26performBinaryEncodingPhaseEPN2TR13CodeGeneratorEPNS1_12CodeGenPhaseE+0x8a (0x00001468766A5C5A [libj9jit29.so+0x5a7c5a])
12:02:46  _ZN3OMR12CodeGenPhase10performAllEv+0xc9 (0x00001468766A7539 [libj9jit29.so+0x5a9539])
12:02:46  _ZN3OMR13CodeGenerator12generateCodeEv+0x63 (0x00001468766A3F33 [libj9jit29.so+0x5a5f33])
12:02:46  _ZN3OMR11Compilation7compileEv+0xc09 (0x00001468766D43F9 [libj9jit29.so+0x5d63f9])
12:02:46  _ZN2TR28CompilationInfoPerThreadBase7compileEP10J9VMThreadPNS_11CompilationEP17TR_ResolvedMethodR11TR_J9VMBaseP19TR_OptimizationPlanRKNS_16SegmentAllocatorE+0x512 (0x00001468762535E2 [libj9jit29.so+0x1555e2])
12:02:46  _ZN2TR28CompilationInfoPerThreadBase14wrappedCompileEP13J9PortLibraryPv+0x369 (0x0000146876254839 [libj9jit29.so+0x156839])
12:02:46  omrsig_protect+0x1e3 (0x0000146877D9B8B3 [libj9prt29.so+0x2b8b3])
12:02:46  _ZN2TR28CompilationInfoPerThreadBase7compileEP10J9VMThreadP21TR_MethodToBeCompiledRN2J917J9SegmentProviderE+0x336 (0x0000146876251B96 [libj9jit29.so+0x153b96])
12:02:46  _ZN2TR24CompilationInfoPerThread12processEntryER21TR_MethodToBeCompiledRN2J917J9SegmentProviderE+0x1e3 (0x0000146876252203 [libj9jit29.so+0x154203])
12:02:46  _ZN2TR24CompilationInfoPerThread14processEntriesEv+0x44c (0x0000146876250C2C [libj9jit29.so+0x152c2c])
12:02:46  _ZN2TR24CompilationInfoPerThread3runEv+0x98 (0x00001468762511D8 [libj9jit29.so+0x1531d8])
12:02:46  _Z30protectedCompilationThreadProcP13J9PortLibraryPN2TR24CompilationInfoPerThreadE+0x82 (0x0000146876251272 [libj9jit29.so+0x153272])
12:02:46  omrsig_protect+0x1e3 (0x0000146877D9B8B3 [libj9prt29.so+0x2b8b3])
12:02:46  _Z21compilationThreadProcPv+0x1cf (0x000014687625169F [libj9jit29.so+0x15369f])
12:02:46  thread_wrapper+0x186 (0x0000146877B634F6 [libj9thr29.so+0xe4f6])
12:02:46  start_thread+0xd1 (0x000014687DCF9AA1 [libpthread.so.0+0x7aa1])
12:02:46  clone+0x6d (0x000014687D631C4D [libc.so.6+0xe8c4d])
12:02:46  ---------------------------------------
12:02:46  JVMDUMP039I Processing dump event "gpf", detail "" at 2023/05/19 16:02:45 - please wait.
12:02:46  JVMDUMP032I JVM requested System dump using '/home/jenkins/workspace/Build_JDK11_x86-64_linux_Personal/make/core.20230519.160245.84012.0001.dmp' in response to an event
@dylanjtuttle dylanjtuttle self-assigned this May 19, 2023
@dylanjtuttle
Contributor Author

dylanjtuttle commented May 19, 2023

So if I'm interpreting this (and the source code) correctly, this assertion occurs during a stage of code generation in which generated code is converted into binary in preparation for execution or storage in the shared class cache.

The assertion appears inside a while loop that generates binary for all instructions after the interpreter entry point (which possibly means code that is interpreted rather than jitted?), and seems to be firing during this build because an estimation of the length of a binary encoded instruction (17) is shorter than the actual length of that binary encoded instruction (18).

I'm unsure of why an assert would be firing in this situation. If the binary instruction has already been generated, why does it matter that a previous prediction of its length was wrong? I can see how that might be useful information to help us create better estimates in the future, but it doesn't seem like something that we would want to halt the compiler for. Of course it is very likely that I don't actually understand what's really going on here (in fact, I'm not sure why this length estimation happens at all), but since either the assert or the code it's testing must be at fault, my best guess is that the assert is incorrect.

@0xdaryl
Contributor

0xdaryl commented May 23, 2023

Generally, this kind of assert would appear if there is a mismatch between how many bytes of instructions we expected a method to consume versus how many it actually consumed when the binary was emitted (we need to "estimate" the size of the code buffer first in order to allocate space for it). Every instruction emitted conservatively estimates how many bytes it might need, and this should always be >= the number of bytes actually consumed. What this assert is saying is that for some instruction the running estimate got behind the number of actual bytes emitted (meaning we've started to overrun the code buffer).
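
As a rough illustration of the invariant described above (a minimal sketch, not the OMR code: `Instr`, `bufferSizeFromEstimates`, and `firstUnderestimated` are hypothetical names), the buffer is sized from the estimates, so any instruction whose actual length exceeds its estimate can push the cursor past what was allocated:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a JIT instruction; this is not the OMR class.
struct Instr {
    std::size_t estimatedLength; // conservative guess made before the buffer is allocated
    std::size_t actualLength;    // bytes really written during binary encoding
};

// The code buffer is sized from the sum of the per-instruction estimates.
std::size_t bufferSizeFromEstimates(const std::vector<Instr>& instrs) {
    std::size_t total = 0;
    for (const Instr& instr : instrs)
        total += instr.estimatedLength;
    return total;
}

// Returns the index of the first instruction that emitted more bytes than it
// estimated (the condition the assert guards against), or -1 if none did.
int firstUnderestimated(const std::vector<Instr>& instrs) {
    for (std::size_t i = 0; i < instrs.size(); ++i) {
        if (instrs[i].estimatedLength < instrs[i].actualLength)
            return static_cast<int>(i);
    }
    return -1;
}
```

With the numbers from this failure, an instruction with `estimatedLength` 17 and `actualLength` 18 is the one such a check would report.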

It would be helpful if this assert actually printed the instruction address (i.e., cursorInstruction) rather than trying to fabricate a name for it (which it is curiously unsuccessful at), because the address would pinpoint which instruction is problematic. Perhaps you could modify the assert to also print the instruction address, and then re-run while generating a tracecg log (-Xjit:tracecg) for this method (sun/nio/fs/UnixFileSystemProvider.newDirectoryStream(Ljava/nio/file/Path;Ljava/nio/file/DirectoryStream$Filter;)Ljava/nio/file/DirectoryStream;).

Once you know the instruction where the problem first appeared, you can inspect the estimateBinaryLength and generateBinaryEncoding functions for that instruction kind as a starting point.

@BradleyWood
Member

An 18-byte instruction is unusually long; I'm not sure I've ever seen that before. It seems to be tripping up on a LEA instruction which, in fact, is not 18 bytes.

0x7ff456fd4bfd 000002fd [0x7ff41c372190] 42 8d 34 0d 00 00 00 00 lea esi, dword ptr [1*r9] # LEA4RegMem

@0xdaryl
Contributor

0xdaryl commented May 23, 2023

Maximum instruction length on x86 is 15 bytes architecturally. Either something is clearly encoded wrong here, or this is a pseudo instruction (e.g., a vgnop) that is reporting a large length.

@BradleyWood
Member

I dumped the buffer (18 bytes). It looks like an address load instruction was inserted, which explains why so many bytes are required.

0:  49 ba 00 4c 18 00 00    movabs r10,0x184c00
7:  00 00 00
a:  42 8d 14 15 00 00 00    lea    edx,[r10*1+0x0]
11: 00

@dylanjtuttle You can start by looking at the estimation logic here.
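
For reference, the 18 bytes in the dump above break down as follows. This is only a sketch: the constant and function names are made up for illustration, and the field-by-field split follows the standard x86-64 encoding rather than anything OMR-specific. Neither instruction exceeds the 15-byte architectural limit on its own; only the pair adds up to 18:

```cpp
#include <cstddef>
#include <cstdint>

// The 18 bytes from the buffer dump.
constexpr std::uint8_t kDumped[] = {
    0x49, 0xba, 0x00, 0x4c, 0x18, 0x00, 0x00, 0x00, 0x00, 0x00, // movabs r10, 0x184c00
    0x42, 0x8d, 0x14, 0x15, 0x00, 0x00, 0x00, 0x00              // lea edx, [r10*1+0x0]
};

constexpr std::size_t kMovabsLength = 1 + 1 + 8;         // REX (0x49), opcode (0xba), imm64
constexpr std::size_t kLeaLength    = 1 + 1 + 1 + 1 + 4; // REX (0x42), opcode (0x8d), ModRM, SIB, disp32
constexpr std::size_t kMaxX86InstructionLength = 15;     // architectural limit per instruction

static_assert(kMovabsLength + kLeaLength == sizeof(kDumped),
              "the two instructions together account for all 18 bytes");
static_assert(kMovabsLength <= kMaxX86InstructionLength &&
              kLeaLength <= kMaxX86InstructionLength,
              "each instruction is individually within the limit");

// Reads the little-endian imm64 that follows the REX and opcode bytes of the movabs.
std::uint64_t movabsImmediate() {
    std::uint64_t value = 0;
    for (int i = 7; i >= 0; --i)
        value = (value << 8) | kDumped[2 + i];
    return value;
}
```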

@dylanjtuttle
Contributor Author

Hi @BradleyWood, I'm a little confused by what you mean (I'm a new intern on the opt team and I'm still learning how the system works). Is this a situation in which an erroneous lea instruction is inserted where there should be none (in which case there is probably a bug in OMR::X86::MemoryReference::generateBinaryEncoding()), or are both of these instructions legitimate and OMR::X86::MemoryReference::estimateBinaryLength() isn't accounting for both of them for some reason?

@BradleyWood
Member

The LEA instruction itself looks normal. However, sometimes at codegen, the memory reference cannot be encoded in the instruction. When this happens, another instruction is inserted to help load the base address of the memory reference. In this case the estimate of 17 bytes (should be at least 18) is too small to account for the address load instruction. You need to investigate why the estimate is too low.

@0xdaryl
Contributor

0xdaryl commented May 24, 2023

To put what Brad said in more concrete terms, memory references can consolidate registers by inserting instructions. Here, for example: https://github.com/eclipse/omr/blob/8091cedda274b476e9e9d4da24128348a6c66aa4/compiler/x/codegen/OMRMemoryReference.cpp#L860

Did you get a log (even a partial one up to the point of assert) for the method this occurs in?

@dylanjtuttle
Contributor Author

If I'm understanding the situation correctly, I don't think I can run this particular method (at least with PROD_WITH_ASSUMES enabled): since this assert happens while the code is building, I don't end up with a java executable that I can run the method with. So far I've just been building the code repeatedly until the assert happens. Is there a way to get a log from that?

@BradleyWood
Member

You can guard the assert with an environment variable to get a build.

feGetEnv("TR_ASSERT") == NULL || cursorInstruction->getEstimatedBinaryLength() >= self()->getBinaryBufferCursor() - instructionStart

After your build, you can try to get a log of one of the crashing methods by defining the variable.

export TR_ASSERT=1
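
A standalone sketch of that guard, for anyone following along. It assumes `std::getenv` as a stand-in for `feGetEnv` (the JIT's own helper), and `lengthCheckPasses` is a hypothetical name; the real change is just the extra condition inside the existing assert.

```cpp
#include <cstddef>
#include <cstdlib>

// Mirrors the guarded condition: the check is skipped (treated as passing)
// unless the TR_ASSERT environment variable is defined.
// std::getenv is an assumption standing in for feGetEnv.
bool lengthCheckPasses(std::size_t estimate, std::size_t actual) {
    return std::getenv("TR_ASSERT") == nullptr || estimate >= actual;
}
```

Note the polarity: with the variable unset the condition is always true, so the build can complete; exporting `TR_ASSERT` re-enables the length check for the run where you want the log.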

@dylanjtuttle
Contributor Author

That's really helpful, thank you! I'll give that a try.

@dylanjtuttle
Contributor Author

I think I'm beginning to spin my tires a little, so I thought it would be a good time to ask for some more help. I've inserted a variety of debug print statements in different places to try and pin down where the estimation goes wrong, but none of them print the instruction address/length/estimate that actually triggers the assert. I have theories about why that might be, but ultimately I don't think getting the correct print statements would tell me enough to figure out where the problem is anyway, so I've sort of abandoned this approach. For the record, I have figured out how to get the logs working thanks to some help from @bhavanisn, so that's nice!

I've also been trying to go through the estimation logic to see how we estimate the length for LEA instructions specifically. I looked through OMR::X86::AMD64::MemoryReference::estimateBinaryLength() and OMR::X86::MemoryReference::estimateBinaryLength(), and the latter has a switch statement on addressTypes, which allows us to make different estimations for different kinds of instructions. It looks like addressTypes might be a bit vector, quantifying things like whether it uses a base or index register, but doesn't seem to have much to do with the instruction type itself. Inside some of the switch cases there are calls to check if the base register (if there is one) needs a displacement field or a SIB byte, which also doesn't seem to have much to do with the instruction type.

At the same time, I've been trying to learn how the code generation for LEA works and how it decides when to add the extra MOV. I feel like I'm beginning to understand it, but I'm still confused because I can't find any corresponding logic on the estimation side. Are we failing this assert because the estimation logic simply doesn't account for this case? Or does it account for it somewhere (incorrectly) and I haven't been able to find it yet? I've noticed that there are a lot of other methods called estimateBinaryLength() throughout the codebase, but printing their return values gives me a lot of triple or quadruple digit values, so I don't think they are the answer either.

@BradleyWood
Member

As we discussed, the estimate produced for the memory reference is correct at 15 bytes. It looks like the count is short because it does not account for the REX prefix. I verified this by checking the calculated REX bits at the time of estimation, which happened to be 0. Obviously, the instruction has a REX prefix, so the key thing to investigate is why those bits are 0, and not correctly calculated, when the length of the LEA instruction is estimated. I suspect it is a side effect of inserting the address load instruction.

@BradleyWood
Member

I suspect that the problem is here. _indexRegister changes at code-generation time, and the new register requires the REX prefix. By that point, the estimate has already been calculated.
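
To make the REX point concrete, here is a minimal sketch of how the prefix depends on the registers in a memory reference. `MemRef`, `rexFor`, and `sibIndex` are hypothetical names, not the OMR classes. Modelling the reference at estimation time as having no index register is an assumption; the only thing established above is that the REX bits were 0 at that point.

```cpp
#include <cstdint>

// Register numbers follow the x86-64 encoding: 0-7 are the legacy registers,
// 8-15 (r8-r15) can only be named with the help of a REX prefix bit.
const int kNoRegister = -1;

// Hypothetical memory reference; not the OMR MemoryReference class.
struct MemRef {
    int base;  // register number, or kNoRegister
    int index; // register number, or kNoRegister
};

// Returns the REX prefix byte needed for the given ModRM reg-field register
// and memory reference, or 0 if no prefix is needed. REX.W (64-bit operand
// size) is deliberately ignored in this sketch.
std::uint8_t rexFor(int regField, const MemRef& m) {
    std::uint8_t bits = 0;
    if (regField >= 8) bits |= 0x4; // REX.R extends the ModRM reg field
    if (m.index >= 8)  bits |= 0x2; // REX.X extends the SIB index field
    if (m.base >= 8)   bits |= 0x1; // REX.B extends the base / rm field
    return bits ? static_cast<std::uint8_t>(0x40 | bits) : 0;
}

// Recovers the full index register number from a REX byte and a SIB byte.
int sibIndex(std::uint8_t rex, std::uint8_t sib) {
    return ((rex & 0x2) ? 8 : 0) | ((sib >> 3) & 0x7);
}
```

With `edx` (register 2) as the destination, a reference with no extended registers needs no prefix, while one whose index has become `r10` needs `0x42`, which is the prefix byte on the LEA in the dump. Going the other way, `sibIndex(0x42, 0x15)` gives 10, i.e. `r10`. That one extra byte is the difference between the estimate of 17 and the 18 bytes actually emitted.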
