Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Accelerate ArraysSupport.vectorizedMismatch in IL #16662

Merged

Conversation

Spencer-Comin
Copy link
Contributor

@Spencer-Comin Spencer-Comin commented Feb 2, 2023

This patch adds ArraysSupport.vectorizedMismatch as a recognized method. If the arraycmp IL opcode is supported, vectorizedMismatch call nodes are transformed to a functionally equivalent tree using arraycmp.

vectorizedMismatch is functionally equivalent to:

   lengthInBytes = length << log2ArrayIndexScale
   mask = (log2ArrayIndexScale<2) ? 3 : 7
   n = lengthInBytes & ~(mask)
   res = arrayCmpLen(a+aOffset, b+bOffset, n)
   if (res == n)  // no mismatch found
      return ~((lengthInBytes & mask) >> log2ArrayIndexScale)
   else           // mismatch found
      return res >> log2ArrayIndexScale

Therefore the call nodes can be translated like so:

n9n    icall  jdk/internal/util/ArraysSupport.vectorizedMismatch(Ljava/lang/Object;JLjava/lang/Object;JII)I
n3n      <a>
n4n      <aOffset>
n5n      <b>
n6n      <bOffset>
n7n      <length>
n8n      <log2ArrayIndexScale>

becomes

n9n         iselect ()
n31n          lcmpeq
n22n            arraycmp (arrayCmpLen )
n23n              aladd
n3n                 <a>
n4n                 <aOffset>
n24n              aladd
n5n                 <b>
n6n                 <bOffset>
n21n              land
n14n                lshl
n13n                  i2l
n7n                     <length>
n12n                  i2l
n8n                     <log2ArrayIndexScale>
n20n                lxor
n18n                  lor
n16n                    lshl
n12n                      ==>i2l
n15n                      lconst 1
n17n                    lconst 3
n19n                  lconst -1
n21n            ==>land
n29n          ixor
n27n            l2i
n26n              lshr
n25n                land
n14n                  ==>lshl
n18n                  ==>lor
n12n                ==>i2l
n28n            iconst -1
n30n          ishr
n22n            ==>arraycmp
n8n             <log2ArrayIndexScale>

ArraysSupport.vectorizedMismatch is used in the implementations of Arrays.mismatch, Arrays.compare, and Arrays.equals for primitive types.

Benchmarking Arrays.compare on x86, Power, and Z for 128, 4096, and 8192 element arrays of all primitive types shows a speedup of around 4.2× on x86, 2.7× on Power, and 4.1× on Z. Benchmark arrays of 1 or 2 elements shows no significant slowdown, and in some cases a speedup (notably, I saw a speedup of 3× comparing two element arrays of long).
Full benchmarking results here, for the curious: arraycompare-results.zip

Fixes: #15204

@Spencer-Comin Spencer-Comin force-pushed the vectorizedMismatch-iselect branch from 53e32e6 to 256dd74 Compare February 2, 2023 21:54
Copy link
Contributor

@r30shah r30shah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Spencer-Comin, Can you upload a sample log file printing trees before and after this particular optimization?

@Spencer-Comin
Copy link
Contributor Author

Spencer-Comin commented Feb 6, 2023

@Spencer-Comin, Can you upload a sample log file printing trees before and after this particular optimization?

vectorizedMismatch.log.txt

@Spencer-Comin Spencer-Comin force-pushed the vectorizedMismatch-iselect branch 2 times, most recently from ff8662b to 693be7e Compare February 7, 2023 22:10
@Spencer-Comin Spencer-Comin force-pushed the vectorizedMismatch-iselect branch from 693be7e to 3fbd0ea Compare February 9, 2023 15:14
@@ -134,6 +134,12 @@ J9::X86::CodeGenerator::initialize()
cg->setSupportsBDLLHardwareOverflowCheck();
}

static char *disableInlineVectorizedMismatch = feGetEnv("TR_disableInlineVectorizedMismatch");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@0xdaryl I've only considered arraycmp support in enabling this acceleration on x. Is there anything else to take into consideration here?

Copy link
Contributor

@r30shah r30shah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes looks OK to me apart from couple of minor nitpicks in comments. @0xdaryl Can I request you to give your review as well to this change?

runtime/compiler/optimizer/J9RecognizedCallTransformer.hpp Outdated Show resolved Hide resolved
runtime/compiler/z/codegen/J9CodeGenerator.cpp Outdated Show resolved Hide resolved
@Spencer-Comin Spencer-Comin force-pushed the vectorizedMismatch-iselect branch from 3fbd0ea to f48cfcc Compare February 16, 2023 14:13
@r30shah
Copy link
Contributor

r30shah commented Feb 16, 2023

@Spencer-Comin Do we have any functional tests for these changes ? OR we should contribute one ?

@Spencer-Comin
Copy link
Contributor Author

Nothing I see in OpenJ9. There are JDK tests that use Arrays.mismatch and Arrays.compare.

$ grep -rnw `pwd` -e 'Arrays\.compare'
jdk11/test/jdk/java/net/httpclient/http2/server/Http2TestServerConnection.java:309:        if (Arrays.compare(clientPreface, bytes) != 0) {
jdk11/test/jdk/java/util/Arrays/ArraysEqCmpTest.java:755:        Assert.assertTrue(Arrays.compare(a, b) < 0);
jdk11/test/jdk/java/util/Arrays/ArraysEqCmpTest.java:756:        Assert.assertTrue(Arrays.compare(b, a) > 0);
jdk11/test/jdk/java/util/Arrays/ArraysEqCmpTest.java:1037:                        testNPE(() -> Arrays.compare(o1, o2, o3));
jdk11/test/jdk/java/util/Arrays/ArraysEqCmpTest.java:1042:                    testNPE(() -> Arrays.compare(o1, 0, 0, o2, 0, 0, o3));
jdk11/test/jdk/java/nio/file/Files/ReadWriteString.java:218:        assertTrue((Arrays.compare(bytes, bytes2) == 0), "The bytes should be the same");
jdk11/test/jdk/com/sun/crypto/provider/Cipher/DES/DESKeyCleanupTest.java:70:        } while (Arrays.compare(zeros, array) != 0);
jdk11/test/jdk/com/sun/crypto/provider/Cipher/PBE/PBEKeyCleanupTest.java:94:        } while (Arrays.compare(zeros, array) != 0);
jdk11/test/lib/jdk/test/lib/OSVersion.java:89:        return Arrays.compare(this.versionTokens, o.versionTokens);
jdk11/test/lib/jdk/test/lib/containers/cgroup/MetricsTesterCgroupV1.java:434:        if (Arrays.compare(oldVal, newVal) != 0) {
jdk11/test/lib/jdk/test/lib/containers/cgroup/MetricsTesterCgroupV1.java:446:        if (Arrays.compare(oldVal, newVal) != 0) {
jdk11/test/lib/jdk/test/lib/containers/cgroup/MetricsTesterCgroupV1.java:456:        if (Arrays.compare(oldVal, newVal) != 0) {
jdk11/test/lib/jdk/test/lib/containers/cgroup/MetricsTesterCgroupV1.java:468:        if (Arrays.compare(oldVal, newVal) != 0) {
jdk11/test/lib/jdk/test/lib/containers/cgroup/MetricsTesterCgroupV2.java:357:        if (Arrays.compare(oldVal, newVal) != 0) {
jdk11/test/lib/jdk/test/lib/containers/cgroup/MetricsTesterCgroupV2.java:367:        if (Arrays.compare(oldVal, newVal) != 0) {
jdk11/test/lib/jdk/test/lib/containers/cgroup/MetricsTesterCgroupV2.java:377:        if (Arrays.compare(oldVal, newVal) != 0) {
jdk11/test/lib/jdk/test/lib/containers/cgroup/MetricsTesterCgroupV2.java:387:        if (Arrays.compare(oldVal, newVal) != 0) {

$ grep -rnw `pwd` -e 'Arrays\.mismatch'
jdk11/test/jdk/tools/jmod/hashes/HashesOrderTest.java:150:                int i = Arrays.mismatch(buffer1, 0, nRead1, buffer2, 0, nRead2);
jdk11/test/jdk/tools/jlink/JLinkReproducibleTest.java:134:                int i = Arrays.mismatch(buffer1, 0, nRead1, buffer2, 0, nRead2);
jdk11/test/jdk/java/util/Arrays/ArraysEqCmpTest.java:1038:                        testNPE(() -> Arrays.mismatch(o1, o2, o3));
jdk11/test/jdk/java/util/Arrays/ArraysEqCmpTest.java:1043:                    testNPE(() -> Arrays.mismatch(o1, 0, 0, o2, 0, 0, o3));
jdk11/test/jdk/java/rmi/testlibrary/TestSocketFactory.java:832:            long index = Arrays.mismatch(actual, expected);
jdk11/test/jdk/javax/xml/crypto/dsig/Versions.java:56:                int i = Arrays.mismatch(buffer1, 0, nRead1, buffer2, 0, nRead2);
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressBooleanArrayCopy.java:41:            int m = Arrays.mismatch(test, 0, size,
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressBooleanArrayCopy.java:53:            int m = Arrays.mismatch(test, r, r+len,
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressBooleanArrayCopy.java:62:            int m = Arrays.mismatch(test, 0, r,
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressBooleanArrayCopy.java:69:            int m = Arrays.mismatch(test, r + len, size,
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressIntArrayCopy.java:41:            int m = Arrays.mismatch(test, 0, size,
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressIntArrayCopy.java:53:            int m = Arrays.mismatch(test, r, r+len,
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressIntArrayCopy.java:62:            int m = Arrays.mismatch(test, 0, r,
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressIntArrayCopy.java:69:            int m = Arrays.mismatch(test, r + len, size,
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressLongArrayCopy.java:41:            int m = Arrays.mismatch(test, 0, size,
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressLongArrayCopy.java:53:            int m = Arrays.mismatch(test, r, r+len,
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressLongArrayCopy.java:62:            int m = Arrays.mismatch(test, 0, r,
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressLongArrayCopy.java:69:            int m = Arrays.mismatch(test, r + len, size,
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressCharArrayCopy.java:41:            int m = Arrays.mismatch(test, 0, size,
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressCharArrayCopy.java:53:            int m = Arrays.mismatch(test, r, r+len,
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressCharArrayCopy.java:62:            int m = Arrays.mismatch(test, 0, r,
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressCharArrayCopy.java:69:            int m = Arrays.mismatch(test, r + len, size,
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressObjectArrayCopy.java:41:            int m = Arrays.mismatch(test, 0, size,
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressObjectArrayCopy.java:53:            int m = Arrays.mismatch(test, r, r+len,
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressObjectArrayCopy.java:62:            int m = Arrays.mismatch(test, 0, r,
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressObjectArrayCopy.java:69:            int m = Arrays.mismatch(test, r + len, size,
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressByteArrayCopy.java:41:            int m = Arrays.mismatch(test, 0, size,
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressByteArrayCopy.java:53:            int m = Arrays.mismatch(test, r, r+len,
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressByteArrayCopy.java:62:            int m = Arrays.mismatch(test, 0, r,
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressByteArrayCopy.java:69:            int m = Arrays.mismatch(test, r + len, size,
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressFloatArrayCopy.java:41:            int m = Arrays.mismatch(test, 0, size,
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressFloatArrayCopy.java:53:            int m = Arrays.mismatch(test, r, r+len,
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressFloatArrayCopy.java:62:            int m = Arrays.mismatch(test, 0, r,
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressFloatArrayCopy.java:69:            int m = Arrays.mismatch(test, r + len, size,
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressShortArrayCopy.java:41:            int m = Arrays.mismatch(test, 0, size,
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressShortArrayCopy.java:53:            int m = Arrays.mismatch(test, r, r+len,
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressShortArrayCopy.java:62:            int m = Arrays.mismatch(test, 0, r,
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressShortArrayCopy.java:69:            int m = Arrays.mismatch(test, r + len, size,
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressDoubleArrayCopy.java:41:            int m = Arrays.mismatch(test, 0, size,
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressDoubleArrayCopy.java:53:            int m = Arrays.mismatch(test, r, r+len,
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressDoubleArrayCopy.java:62:            int m = Arrays.mismatch(test, 0, r,
jdk11/test/hotspot/jtreg/compiler/arraycopy/stress/StressDoubleArrayCopy.java:69:            int m = Arrays.mismatch(test, r + len, size,

@r30shah
Copy link
Contributor

r30shah commented Feb 24, 2023

@0xdaryl Can we get your review on these changes?

@r30shah
Copy link
Contributor

r30shah commented Mar 7, 2023

@0xdaryl / @jdmpapin Can I request your review on this change from @Spencer-Comin ?

@Spencer-Comin Spencer-Comin force-pushed the vectorizedMismatch-iselect branch from f48cfcc to c27af73 Compare March 15, 2023 14:30
@Spencer-Comin
Copy link
Contributor Author

With aarch64 now supporting arraycmp in eclipse-omr/omr#6904 I've enabled this acceleration there too. @Akira1Saitoh could you review this?

@Akira1Saitoh
Copy link
Contributor

@Spencer-Comin Thanks. Your change looks good to me.

Copy link
Contributor

@0xdaryl 0xdaryl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of minor nits. Generally looks ok.

runtime/compiler/codegen/J9CodeGenerator.hpp Outdated Show resolved Hide resolved
runtime/compiler/codegen/J9CodeGenerator.hpp Outdated Show resolved Hide resolved
@Spencer-Comin Spencer-Comin force-pushed the vectorizedMismatch-iselect branch from c27af73 to 5b0c361 Compare March 22, 2023 14:14
@0xdaryl
Copy link
Contributor

0xdaryl commented Mar 22, 2023

Jenkins test sanity all jdk17

@Spencer-Comin Spencer-Comin marked this pull request as draft April 14, 2023 19:42
@Spencer-Comin
Copy link
Contributor Author

Converting this to draft while we resolve the size restriction on arraycmp.
See eclipse-omr/omr#6951 for issue tracking progress on that, and #16662 (comment) for context on why this issue blocks this PR.

@Spencer-Comin
Copy link
Contributor Author

I've rewritten this to use arraycmplen, so this PR is now dependent on eclipse-omr/omr#6983 and indirectly #17382

@Spencer-Comin Spencer-Comin force-pushed the vectorizedMismatch-iselect branch from 1ff5372 to dd490e9 Compare August 29, 2023 17:26
Copy link
Contributor

@r30shah r30shah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already reviewed this, so just went through again, looks good to me. @jdmpapin as the PR for arrayCmpLen will be merged in coming days, requesting your review on this one.

runtime/compiler/optimizer/J9RecognizedCallTransformer.cpp Outdated Show resolved Hide resolved
@@ -124,6 +124,12 @@ J9::Z::CodeGenerator::initialize()
cg->setSupportsInlineEncodeASCII();
}

static bool disableInlineVectorizedMismatch = feGetEnv("TR_disableInlineVectorizedMismatch") != NULL;
if (cg->getSupportsArrayCmpLen() && !disableInlineVectorizedMismatch)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getSupportsArrayCmpLen would be disabled for arraylets and off-heap, right? Just confirming.

Copy link
Contributor Author

@Spencer-Comin Spencer-Comin Sep 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eclipse-omr/omr#6983 enables arraycmplen in the same way arraycmp is enabled on each platform:

  • AArch64 - disabled for arraylets, also disabled by environment variable TR_aarch64DisableArrayCmpLen
  • Power - always enabled
  • x86 - disabled for arraylets
  • Z - always enabled

arraycmplen should probably be uniformly disabled for arraylets, and maybe by a single environment variable for all platforms. Same for arraycmp. I'll add a commit to eclipse-omr/omr#6983 to make arraycmp enablement more uniform and amend the commit introducing arraycmplen. Edit: To not put another obstacle to the beleaguered arraycmplen PRs, I'll make this change in its own PR (eclipse-omr/omr#7108).

Not sure how off-heap will affect arraycmp(len) enablement, @VermaSh any input?

Copy link
Contributor

@r30shah r30shah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, @jdmpapin Would appreciate if you can give your review on these changes as well and if OK, merge it.

@r30shah
Copy link
Contributor

r30shah commented Sep 19, 2023

@Spencer-Comin seems like there has been merge conflicts. Can you resolve them ?

@jdmpapin As arraycmp opcode changes have been merged, Can we get your review on these changes?

@Spencer-Comin Spencer-Comin force-pushed the vectorizedMismatch-iselect branch from 517f288 to de7ee82 Compare September 19, 2023 17:02
@Spencer-Comin Spencer-Comin force-pushed the vectorizedMismatch-iselect branch from de7ee82 to dcc15e3 Compare September 19, 2023 19:42
@jdmpapin
Copy link
Contributor

Thanks! I suppose I should have also mentioned the places in the comment that show 64-bit shift amounts, since those are now inconsistent with the implementation

@Spencer-Comin Spencer-Comin force-pushed the vectorizedMismatch-iselect branch from dcc15e3 to b83f994 Compare September 19, 2023 19:50
This patch adds ArraysSupport.vectorizedMismatch as a recognized method, and
adds a SupportsInlineVectorizedMismatch flag to the code generator. This flag
is set in Z, Power, aarch64 and x86 code generator initialization if the
arraycmp opcode is supported and the TR_disableInlineVectorizedMismatch
environment variable is not set. If the flag is set, vectorizedMismatch call
nodes are transformed to a functionally equivalent tree that uses arraycmp.

Fixes: eclipse-openj9#15204
Signed-off-by: Spencer Comin <[email protected]>
@Spencer-Comin Spencer-Comin force-pushed the vectorizedMismatch-iselect branch from b83f994 to 0636989 Compare October 25, 2023 17:21
@Spencer-Comin
Copy link
Contributor Author

I've resolved the conflict, @jdmpapin is there anything left to do before this PR can begin the merge process?

@jdmpapin
Copy link
Contributor

Due to the state of PR builds, I'm coordinating with Spencer to test this PR privately

@jdmpapin
Copy link
Contributor

This has now passed JDK17 sanity.functional on Linux with x86-64, PPC64LE, z, and AArch64, with the exception of a number of unrelated CRIU test failures caused by machine configuration. Merging

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Panama MemoryAccessAPI: Add jit intrinsic support for vectorizedMismatch
8 participants