Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OffHeap CardMarking using the baseObj instead of dataAddr #20264

Merged
merged 4 commits into from
Dec 11, 2024

Conversation

rmnattas
Copy link
Contributor

@rmnattas rmnattas commented Sep 30, 2024

When evaluating Unsafe.CAS, the card marking uses the dstObj to calculate the card entry to dirty.

With OffHeap the dstObj is a dataAddrPointer load of the baseObj:

n85n      NULLCHK on n87n [#32]
n86n        icall  jdk/internal/misc/Unsafe.compareAndSetObject(Ljava/lang/Object;JLjava/lang/Object;Ljava/lang/Object;)Z[#415  final native virtual Method -304]
n87n          aload  java/util/concurrent/ConcurrentHashMap.U Ljdk/internal/misc/Unsafe;[#407  final Static]
n92n          aloadi  <contiguousArrayDataAddrFieldSymbol>[#352  final Shadow +16] 
n88n            aload  tab<parm 0 [Ljava/util/concurrent/ConcurrentHashMap$Node;>[#403  Parm]
n13n          ladd                                 
n9n             lshl                               
n6n               i2l                              
n5n                 iload  i<parm 1 I>[#404  Parm] 
n7n               iconst 3 (X!=0 X>=0 )            
n12n            i2l                                
n10n              iconst 0 (X==0 X>=0 X<=0 )         
n90n          aload  c<parm 2 Ljava/util/concurrent/ConcurrentHashMap$Node;>[#405  Parm]
n91n          aload  v<parm 3 Ljava/util/concurrent/ConcurrentHashMap$Node;>[#406  Parm]

This PR changes that to use the correct baseObj to calculate the card entry to dirty.
The change uses the temp registers to not use more registers for the code sequence.

TODO:

  • Z
    • inlineConcurrentLinkedQueueTMOffer (Given that the dstObj is not an array, no dataAddr load will be present)
    • inlineConcurrentLinkedQueueTMPoll (Given that the dstObj is not an array, no dataAddr load will be present)
  • Confirm that arraycopyEvaluator needs the change on all platforms [1].

Depends on eclipse-omr/omr#7562


[1]
For arraycopy, currently the optimizer uses the baseObj as the dstObj child, and loads the dataAddr in the dstAddr child. As card marking uses the dstObj child for the card entry calculation, it's correct. I just need to confirm that this is always the case.

n60n        arraycopy  java/lang/System.arraycopy(Ljava/lang/Object;ILjava/lang/Object;II)V[#405  unresolved notAccessed static Method] [flags 0x400 0x0 ] (Unsigned forwardArrayCopy noArra>
n58n          aload  <parm 0 [Ljava/lang/Object;>[#403  Parm]                          
n59n          aload  <parm 1 [Ljava/lang/Object;>[#404  Parm]                            
n54n          aloadi  <contiguousArrayDataAddrFieldSymbol>[#352  final Shadow +8] [flags 0x20604 0x0 ] (internalPtr )  [    0x78f0fb0e00a0] bci=[-1,0,20] rc=1 vc=33 vn=- li=- udi=- nc=1 fl>
n55n            aload  <parm 0 [Ljava/lang/Object;>[#403  Parm]                
n56n          aloadi  <contiguousArrayDataAddrFieldSymbol>[#352  final Shadow +8] [flags 0x20604 0x0 ] (internalPtr )  [    0x78f0fb0e0140] bci=[-1,2,20] rc=1 vc=33 vn=- li=- udi=- nc=1 fl>
n57n            aload  <parm 1 [Ljava/lang/Object;>[#404  Parm]                         
n53n          lconst 8 (highWordZero X!=0 X>=0 cannotOverflow )             

@rmnattas
Copy link
Contributor Author

fyi @zl-wang @hzongaro @r30shah @knn-k

@r30shah
Copy link
Contributor

r30shah commented Sep 30, 2024

inlineConcurrentLinkedQueueTM* exploitation is limited to IBM Java 8, I do not think we would have CAS for that, we need to take a look at the tmOffer and tmCall node for that.

@rmnattas
Copy link
Contributor Author

@r30shah, I don't have a IBM Java 8 with OffHeap to test, but it seem that its not an issue given that the dstObject for inlineConcurrentLinkedQueueTM* is not an array.

The dataAddr load is for arrays, if the dstObject is a ConcurrentLinkedQueue then it won't have a dataAddr load, hence no change necessary, correct?

n269n       icall  java/util/concurrent/ConcurrentLinkedQueue.tmOffer(Ljava/util/concurrent/ConcurrentLinkedQueue$Node;)I[#451  native special Method]
n2405n        aRegLoad GPR8   <parm 0 Ljava/util/concurrent/ConcurrentLinkedQueue;>[#399  Parm]
n255n         new  jitNewObject[#91  helper Method] 
n254n           loadaddr  java/util/concurrent/ConcurrentLinkedQueue$Node[#441  Static] 
Full Instruction Selection
------------------------------
 n270n    (  0)  treetop ()                                                                           [     0x3ff85ab3420] bci=[0,22,346] rc=0 vc=6735 vn=- li=51 udi=- nc=1 flg=0x8
 n269n    (  3)    icall  java/util/concurrent/ConcurrentLinkedQueue.tmOffer(Ljava/util/concurrent/ConcurrentLinkedQueue$Node;)I[#451  native special Method] [flags 0x500 0x0 ] (in GPR_0191) ()  [     0x3ff85ab33d0] bci=[0,22,346] rc=3 vc=6735 vn=- li=51 udi=- nc=2 flg=0x8
 n2405n   (  4)      ==>aRegLoad (in &GPR_0112) (X!=0 X>=0 SeenRealReference )
 n255n    (  3)      ==>new (in &GPR_0146) (highWordZero Unsigned X!=0 allocationCanBeRemoved )
------------------------------

 [     0x3ff86717850]                          LHI     GPR_0191,0x1     
 [     0x3ff86717ab0]                          TBEGINC #629 0,0xff00
 [     0x3ff86717b90]                          NIAI    1,0      
 [     0x3ff86717d20]                          LLGF    &GPR_0192,#630 12(&GPR_0112)
 [     0x3ff86717df0]                          NIAI    1,0      
 [     0x3ff86717f80]                          LT      &GPR_0193,#631 12(&GPR_0192)
 [     0x3ff86718050]                          LLGFR   &GPR_0193,&GPR_0193      
 [     0x3ff86718130]                          BRC     BH(0x8), Label L0119     
 [     0x3ff86718210]                          LGR     &GPR_0192,&GPR_0193      
 [     0x3ff867182f0]                          NIAI    1,0      
 [     0x3ff86718480]                          LT      &GPR_0193,#632 12(&GPR_0192)
 [     0x3ff86718550]                          LLGFR   &GPR_0193,&GPR_0193      
 [     0x3ff86718630]                          BRC     BH(0x8), Label L0119     
 [     0x3ff86718710]                          BRC     NOP(0xf), Label L0120    
 [     0x3ff867187f0]                          Label L0119:     
 [     0x3ff86718990]                          ST      &GPR_0146,#633 12(&GPR_0192)
 [     0x3ff86718b20]                          ST      &GPR_0146,#634 12(&GPR_0112)
 [     0x3ff86718bf0]                          XR      GPR_0191,GPR_0191        
 [     0x3ff86719290]                          assocreg
 [     0x3ff86718cd0]                          Label L0120:     
 POST:
 {AssignAny:GPR_0191:R} {AssignAny:&GPR_0112:R} {AssignAny:&GPR_0192:R} {AssignAny:&GPR_0193:R} {AssignAny:&GPR_0146:R}
 [     0x3ff86719420]                          TEND    #635 0(GPR0)
 [     0x3ff86719720]                          LTR     GPR_0191,GPR_0191        
 [     0x3ff86719800]                          BRC     BNH(0x6), Label L0122    
 [     0x3ff86719ac0]                          LGR     GPR_0194,&GPR_0192       
 [     0x3ff86719ba0]                          SG      GPR_0194,#636 208(GPR13)
 [     0x3ff86719c70]                          CLG     GPR_0194,#637 216(GPR13)
 [     0x3ff86719d40]                          BRC     BL(0x2), Label L0122     
 [     0x3ff86719fa0]                          TM      #638 413(GPR13), 0x10
 [     0x3ff8671a080]                          BRC     BH(0x8), Label L0125     
 [     0x3ff8671a160]                          SRLG    GPR_0194,GPR_0194,9
 [     0x3ff8671a300]                          AG      GPR_0194,#639 192(GPR13)
 [     0x3ff8671a490]                          MVI     #640 0(GPR_0194), 0x01
 [     0x3ff8671ab30]                          assocreg
 [     0x3ff8671a570]                          Label L0125:     
 POST:
 {GPR1:&GPR_0192:R} {GPR2:&GPR_0146:R} {GPR4:GPR_0194:R} {GPR14:GPR_0195:R} {AssignAny:GPR_0191:R}
 [     0x3ff8671b1c0]                          assocreg
 [     0x3ff8671ac00]                          Label L0124:     
 POST:
 {GPR1:&GPR_0192:R} {GPR2:&GPR_0146:R} {GPR4:GPR_0194:R} {GPR14:GPR_0195:R} {AssignAny:GPR_0191:R}
 [     0x3ff8671b350]                          TM      #641 3(&GPR_0192), 0xf0
 [     0x3ff8671b430]                          BRC     MASK9(0x8), Snippet Label L0123  
 [     0x3ff8671bb40]                          assocreg
 [     0x3ff8671b580]                          Label L0122:     
 POST:
 {GPR1:&GPR_0192:R} {GPR2:&GPR_0146:R} {GPR4:GPR_0194:R} {GPR14:GPR_0195:R} {AssignAny:GPR_0191:R}
 [     0x3ff8671be30]                          LTR     GPR_0191,GPR_0191        
 [     0x3ff8671bf10]                          BRC     BNH(0x6), Label L0126    
 [     0x3ff8671c1d0]                          LGR     GPR_0196,&GPR_0112       
 [     0x3ff8671c2b0]                          SG      GPR_0196,#642 208(GPR13)
 [     0x3ff8671c380]                          CLG     GPR_0196,#643 216(GPR13)
 [     0x3ff8671c450]                          BRC     BL(0x2), Label L0126     
 [     0x3ff8671c6b0]                          TM      #644 413(GPR13), 0x10
 [     0x3ff8671c790]                          BRC     BH(0x8), Label L0129     
 [     0x3ff8671c870]                          SRLG    GPR_0196,GPR_0196,9
 [     0x3ff8671ca10]                          AG      GPR_0196,#645 192(GPR13)
 [     0x3ff8671cba0]                          MVI     #646 0(GPR_0196), 0x01
 [     0x3ff8671d240]                          assocreg
 [     0x3ff8671cc80]                          Label L0129:     
 POST:
 {GPR1:&GPR_0112:R} {GPR2:&GPR_0146:R} {GPR4:GPR_0196:R} {GPR14:GPR_0197:R} {AssignAny:GPR_0191:R}
 [     0x3ff8671d8d0]                          assocreg
 [     0x3ff8671d310]                          Label L0128:     
 POST:
 {GPR1:&GPR_0112:R} {GPR2:&GPR_0146:R} {GPR4:GPR_0196:R} {GPR14:GPR_0197:R} {AssignAny:GPR_0191:R}
 [     0x3ff8671da60]                          TM      #647 3(&GPR_0112), 0xf0
 [     0x3ff8671db40]                          BRC     MASK9(0x8), Snippet Label L0127  
 [     0x3ff8671e250]                          assocreg
 [     0x3ff8671dc90]                          Label L0126:     
 POST:
 {GPR1:&GPR_0112:R} {GPR2:&GPR_0146:R} {GPR4:GPR_0196:R} {GPR14:GPR_0197:R} {AssignAny:GPR_0191:R}

@rmnattas
Copy link
Contributor Author

The arraycopy transformations uses the correct base object in the dstObj node instead of the dataAddrPtr load. To guard that I have added asserts to check the dstObj node to not be a dataAddrPtr symbol.

@rmnattas
Copy link
Contributor Author

Doing final testing but PR is ready for review

@rmnattas rmnattas marked this pull request as ready for review October 24, 2024 13:41
@rmnattas
Copy link
Contributor Author

rmnattas commented Nov 1, 2024

PR ready for review and merge
ArrayCopy assert was moved to a separate PR #20480

runtime/compiler/z/codegen/J9TreeEvaluator.cpp Outdated Show resolved Hide resolved
@rmnattas rmnattas marked this pull request as draft November 1, 2024 14:36
When evaluating Unsafe.CAS while running with Balanced GC and OffHeap
enabled, if the object child is the dataAddrPointer load, pass the
baseObj to VMCardCheckEvaluator for correct card marking.

The dstReg and temp2Reg in VMCardCheckEvaluator can share a reg.

Signed-off-by: Abdulrahman Alattas <[email protected]>
When running with Balanced GC and OffHeap enabled, if the
destOwningObject is a dataAddrPointer load, use the baseObj
as the owningObjectReg.

Signed-off-by: Abdulrahman Alattas <[email protected]>
When evaluating Unsafe.CAS while running with Balanced GC and OffHeap
enabled, if the object child is the dataAddrPointer load, pass the
baseObj to VMCardCheckEvaluator for correct card marking.

The dstReg and temp2Reg in VMCardCheckEvaluator can share a reg,
adding the argument clobberDstReg to indicate if dstReg can be used
as the temp2Reg.

Not allocating temp2Reg and adding a dep for baseObjReg uses the same
number of registers.

Signed-off-by: Abdulrahman Alattas <[email protected]>
When evaluating Unsafe.CAS while running with Balanced GC and OffHeap
enabled, if the object child is the dataAddrPointer load, pass the
baseObj to VMCardCheckEvaluator for correct card marking.

Using the VMCardCheckEvaluator ability to clobberDstReg, we use
baseObjReg instead of the temp epReg.

Signed-off-by: Abdulrahman Alattas <[email protected]>
@rmnattas
Copy link
Contributor Author

rmnattas commented Dec 8, 2024

PR set WIP as the case of a temp/aRegLoad node storing the dataAddrPtr internal-pointer used as the destination object came to attention, in that case we won't be able to get the baseAddr of that easily. Hence that was dealt with first by disabling the only case currently that commons the dataAddrPtr into a temp, to be dealt with in the future.

This PR is ready but depends on eclipse-omr/omr#7562

@rmnattas rmnattas closed this Dec 8, 2024
@rmnattas rmnattas reopened this Dec 8, 2024
@rmnattas rmnattas marked this pull request as ready for review December 8, 2024 19:07
@zl-wang
Copy link
Contributor

zl-wang commented Dec 10, 2024

Jenkins test sanity.functional xlinux,plinux,aix,zlinux,alinux64 jdk21

@zl-wang zl-wang merged commit cdc1adc into eclipse-openj9:master Dec 11, 2024
12 of 14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants