ValueTypes: Add support for reference projection arrays #14858

Closed
tajila opened this issue Apr 5, 2022 · 18 comments
Assignees
Labels
comp:vm project:valhalla Used to track Project Valhalla related work

Comments

@tajila
Contributor

tajila commented Apr 5, 2022

In the current model, a given primitive class can have two types: the primary type and the reference projection (AKA the box).

Using Point as an example:

public primitive class Point {
...
}
Point p = new Point(..); //primary
Point.ref pRef = null; //reference projection
pRef = p;

As a result, there are two types of arrays: Point[] (the primary type array) and Point.ref[] (the reference projection array). They are allocated with classrefs "QPoint;" and Point respectively. Note that the first is a signature and the second is a simple name.

It is possible to instantiate new Point.ref[size], but not new Point.ref(...).

Point[] p = new Point[size]; //primary
Point.ref[] pRef = new Point.ref[size]; //reference projection

This will require updates to our classfile parsers and potentially the romclass builders.

See http://cr.openjdk.java.net/~dlsmith/jep401/jep401-20211220/specs/primitive-classes-jvms.html#jvms-4.4.1 for java class file encoding
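
To make the two classref forms concrete, here is a minimal sketch of how a parser might classify the name in a CONSTANT_Class entry. ClassRefName and its fields are hypothetical names used only for illustration, not existing OpenJ9 code, and array class names such as "[QPoint;" are deliberately not handled.

```java
// Hypothetical helper, not OpenJ9 code: classifies the name stored in a
// CONSTANT_Class entry under the encoding described in this issue.
final class ClassRefName {
    final String className; // plain class name, e.g. "Point"
    final boolean isQType;  // true when the entry used a Q descriptor

    private ClassRefName(String className, boolean isQType) {
        this.className = className;
        this.isQType = isQType;
    }

    // "QPoint;" -> (Point, Q type); "Point" -> (Point, reference type).
    // A plain class name cannot contain ';', so a leading 'Q' together
    // with a trailing ';' identifies a Q descriptor unambiguously.
    static ClassRefName parse(String cpName) {
        if (cpName.length() > 2 && cpName.charAt(0) == 'Q' && cpName.endsWith(";")) {
            return new ClassRefName(cpName.substring(1, cpName.length() - 1), true);
        }
        return new ClassRefName(cpName, false);
    }
}
```

With this sketch, a class whose plain name merely starts with Q (for example Queue) is still treated as a reference type, because it has no trailing ';'.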

In the JVM we will need two J9Classes: one for the primitive type and the other for the value reference projection. I'm not sure that we will need two ROM classes, but that's something we will need to think about.

@tajila added the comp:vm and project:valhalla labels Apr 5, 2022
@tajila
Contributor Author

tajila commented Apr 5, 2022

@gacholio any thoughts on how to approach this?

We need to make a distinction in the constant pool between QPoint; and Point, which may also imply that we need to be able to call resolveClassRef with QPoint; and Point and get different results.

We also need to generate two ramClasses, one each for the Point[] and Point.ref[] variants.

@tajila
Contributor Author

tajila commented Apr 5, 2022

FYI @DanHeidinga

@tajila
Contributor Author

tajila commented Apr 5, 2022

We could generate a companion class for primitive objects, similar to how we have clazz->arrayClass created by internalCreateArrayClass: something like clazz->refProjectionArray via internalCreateRefProjectionArrayClass. But we still need a way to tell in the constantPool whether we are dealing with QPoint; or Point.

The alternative is to start using signatures instead of simple names internally in the classtables. That way we can name QPoint; and Point, and they would be completely separate types. We would make it illegal to instantiate Point, but it would be legal to call anewarray on it.

@gacholio
Contributor

gacholio commented Apr 6, 2022

I don't have enough context for a useful answer. At a glance this seems like another in a long line of hacks to shoehorn this arguably unnecessary feature into a language that's clearly not designed for it.

@hangshao0
Contributor

We will need 2 ramClasses for primitive value types, one for the L type and one for the Q type. Each could hold a pointer to the other. We likely won't need 2 rom classes. The Q type ramClass can be keyed on QClassName;.
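
A rough model of that layout, assuming a simple name-keyed map stands in for the class table. RomClassModel, RamClassModel, PairedClassTable and the companion field are hypothetical names for illustration only; the real J9Class and J9ROMClass structures are not modelled here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the shared ROM class.
final class RomClassModel {
    final String name;
    RomClassModel(String name) { this.name = name; }
}

// Hypothetical stand-in for a RAM class; the L and Q variants share one ROM class.
final class RamClassModel {
    final RomClassModel rom;
    final boolean isQType;
    RamClassModel companion; // the L type points at the Q type and vice versa

    RamClassModel(RomClassModel rom, boolean isQType) {
        this.rom = rom;
        this.isQType = isQType;
    }
}

final class PairedClassTable {
    private final Map<String, RamClassModel> table = new HashMap<>();

    // Creates both RAM classes from a single ROM class and links them.
    void definePrimitiveClass(String name) {
        RomClassModel rom = new RomClassModel(name);
        RamClassModel lType = new RamClassModel(rom, false);
        RamClassModel qType = new RamClassModel(rom, true);
        lType.companion = qType;
        qType.companion = lType;
        table.put(name, lType);             // keyed on the plain name
        table.put("Q" + name + ";", qType); // keyed on QClassName;
    }

    RamClassModel find(String key) {
        return table.get(key);
    }
}
```

The point of the sketch is only that two distinct table keys can resolve to two RAM classes while the ROM class stays single.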

@tajila
Contributor Author

tajila commented Apr 18, 2022

Currently we don't put signatures in classtables, so that would be a new change.

Also note that ACONST_INIT for a primitive type will not use the qsignature; it will just use the simple name. Only ANEWARRAY makes a distinction between the qsignature and the simple name.

@tajila
Contributor Author

tajila commented Apr 18, 2022

@gacholio

I don't have enough context for a useful answer.

The problem we are facing is that a single class definition, primitive class Point, produces two runtime representations of that class: (1) the primitive object, and (2) the reference projection. Given that there are two representations, we need two ramClasses to make the distinction.

This problem is made worse because there isn't a consistent way to refer to the type in the constant pool.

When creating an instance of the primitive object, one uses a classref with the simple name Point. When creating an array of the primitive object, one uses a classref with the Qsignature QPoint;.

It is illegal to instantiate the reference projection; however, it is possible to create an array of it, and that is done with the simple name Point.

So the questions are:

  1. How do we represent two different runtime types with the same name in the VM?
  2. How do we represent classrefs with signatures in the constant pool?

@hangshao0
Contributor

Also note that ACONST_INIT for a primitive type will not use the qsignature it will just use the simple name.

Noticed the following in the current JEP:

A CONSTANT_Class constant pool entry may refer to a primitive type using a Q descriptor as a "class name". A CONSTANT_Class using the plain name of a primitive class represents the class's reference type.

The aconst_init instruction may refer to either a primitive type or a reference type. This determines whether a primitive value or a value object is produced.

@tajila
Contributor Author

tajila commented Apr 18, 2022

That wording is inconsistent with the current behaviour of the RI prototype.

In any case, if that is true, we still need to update the internal representation in constant pools and classTables.

@gacholio
Contributor

gacholio commented Apr 19, 2022

Given the comments above, would it make sense to add a flag to class table queries to indicate which version of the type is requested?

  • current behaviour - input would be LPoint;
  • undecorated lookup of L type - input would be Point
  • undecorated lookup of Q type - input would be Point
  • more if needed

My suggestion would be to leave ROM classes alone, and not put two classes in the table (i.e. what we have today). The query flags can be used to determine how to match the class in the table (knowing only fully-qualified L names are in there), and whether or not to fetch the Q version from the L version.

Does this make sense?
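
A minimal sketch of what such a flagged query could look like, assuming the table holds only the L version and the Q version is reached from it. The flag names, LookupEntry and FlaggedClassTable are hypothetical and only illustrate the idea; flag value 0 stands for the current behaviour.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical table entry; only the L version is stored in the table.
final class LookupEntry {
    final String name;
    final boolean isQType;
    LookupEntry qVersion; // set on the L entry of a primitive class, otherwise null

    LookupEntry(String name, boolean isQType) {
        this.name = name;
        this.isQType = isQType;
    }
}

final class FlaggedClassTable {
    static final int FLAG_NONE = 0;            // current behaviour: input is "LPoint;"
    static final int FLAG_UNDECORATED = 1;     // input is the plain name "Point"
    static final int FLAG_WANT_QTYPE = 1 << 1; // return the Q version instead of the L version

    private final Map<String, LookupEntry> byName = new HashMap<>();

    void add(String name, boolean primitive) {
        LookupEntry lType = new LookupEntry(name, false);
        if (primitive) {
            lType.qVersion = new LookupEntry(name, true);
        }
        byName.put(name, lType);
    }

    // Returns null when the input does not match the requested format,
    // when the class is unknown, or when a Q version is requested but absent.
    LookupEntry find(String input, int flags) {
        String name = input;
        if ((flags & FLAG_UNDECORATED) == 0) {
            // Decorated input: strip the leading 'L' and the trailing ';'.
            if (input.length() < 3 || input.charAt(0) != 'L' || !input.endsWith(";")) {
                return null;
            }
            name = input.substring(1, input.length() - 1);
        }
        LookupEntry lType = byName.get(name);
        if (lType == null) {
            return null;
        }
        return (flags & FLAG_WANT_QTYPE) != 0 ? lType.qVersion : lType;
    }
}
```

Existing callers would keep passing FLAG_NONE, so only the new anewarray-style paths would need to pick a different flag combination.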

@hangshao0
Contributor

Unlike other GC policies, which iterate over both the class table and the class segments when unloading classes, I remember balanced GC only iterates over the class table. (As a result, hidden classes, which are keyed on ROM address, are put into the class table even though class table queries won't find them.) I guess we still need to add both ram classes to the class table to avoid causing problems for balanced GC.

@gacholio
Contributor

Or, follow the Q pointer from classes.

@hangshao0
Contributor

Or, follow the Q pointer from classes.

You mean change GC to follow the Q pointer when unloading classes?

@gacholio
Contributor

You mean change GC to follow the Q pointer when unloading classes?

Right (technically, when marking classes).
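
As a toy illustration of marking through the Q pointer, assuming only L classes sit in the table: MarkableClass, its qType field and ClassMarker are hypothetical names, and this is not how the balanced GC is actually structured.

```java
import java.util.List;

// Hypothetical class record with a mark bit and an optional Q-type pointer.
final class MarkableClass {
    final String name;
    MarkableClass qType; // null unless this is the L type of a primitive class
    boolean marked;

    MarkableClass(String name) { this.name = name; }
}

final class ClassMarker {
    // Marks every class in the table and any class reachable through the
    // Q pointer; returns how many classes were newly marked.
    static int markAll(List<MarkableClass> classTable) {
        int newlyMarked = 0;
        for (MarkableClass tableEntry : classTable) {
            for (MarkableClass cur = tableEntry; cur != null && !cur.marked; cur = cur.qType) {
                cur.marked = true;
                newlyMarked++;
            }
        }
        return newlyMarked;
    }
}
```

In this model the Q class stays alive even though it never appears in the table, which is the property the suggestion relies on.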

@gacholio
Contributor

Assuming we don't change the ROM class format, and that the ROM classes are shared between the two RAM classes, we could still put both classes in the table (provided the Q and L ones have a bit somewhere that distinguishes them).

It still makes sense to me to update the API to specify the input format and desired output with a bit field where value 0 is the current behaviour. Possibly change the implementation to forward the correct bit value onto a new function which would be called from the new cases.

@gacholio
Contributor

The class table is peeked by exception backtrace decode based on the ROM class of the PC in the walkback array, so whichever solution we go with, we need to make sure the "real" class is found by that peek.

hangshao0 added a commit to hangshao0/openj9 that referenced this issue May 23, 2023
1. There are 2 ram classes generated from a primitive value type. One is
the L type ram class and the other is the Q type ram class. Add fields
Ltype and Qtype into J9Class. For non-primitive value classes, these 2
fields point to the J9Class itself.
2. For primitive VT, only the Q type can be instantiated. Embed the L
type J9Class header in the Q type. The different behaviours of L type
and Q type can be controlled by J9Class->classFlags.
3. Add new flag J9_FINDCLASS_FLAG_QTYPE to indicate whether it is a Q type
that is to be found or generated.

issue eclipse-openj9#14858

Signed-off-by: Hang Shao <[email protected]>
@hangshao0
Contributor

This is from an older version of the spec. It can be closed.


github-actions bot commented Oct 3, 2024

Issue Number: 14858
Status: Closed
Actual Components: comp:vm, project:valhalla
Actual Assignees: No one :(
PR Assignees: hangshao0
