Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: vectors should be demoted to the lowest dimension #413

Merged
merged 6 commits into from
Jan 30, 2024

Conversation

Saransh-cpp
Copy link
Member

@Saransh-cpp Saransh-cpp commented Jan 24, 2024

Description

Fixes #412

Output of the snippet attached in the issue:

a+b
[{x: 0, y: -20, z: 10}, {x: 40, y: 0, z: 0}, {x: 0, y: 60, z: 30}]
{'__record__': 'Momentum3D'}
b+a
[{x: 0, y: -20, z: 10}, {x: 40, y: 0, z: 0}, {x: 0, y: 60, z: 30}]
{'__record__': 'Momentum3D'}

Checklist

  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't any other open Pull Requests for the required change?
  • Does your submission pass pre-commit? ($ pre-commit run --all-files or $ nox -s lint)
  • Does your submission pass tests? ($ pytest or $ nox -s tests)
  • Does the documentation build with your changes? ($ cd docs; make clean; make html or $ nox -s docs)
  • Does your submission pass the doctests? ($ xdoctest ./src/vector or $ nox -s doctests)

Before Merging

  • Summarize the commit messages into a brief review of the Pull request.

@Saransh-cpp Saransh-cpp marked this pull request as draft January 24, 2024 20:59
@codecov-commenter
Copy link

codecov-commenter commented Jan 24, 2024

Codecov Report

Attention: 1 lines in your changes are missing coverage. Please review.

Comparison is base (c4c90b9) 82.26% compared to head (1aaa2d1) 82.70%.

Files Patch % Lines
src/vector/backends/awkward.py 85.71% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #413      +/-   ##
==========================================
+ Coverage   82.26%   82.70%   +0.43%     
==========================================
  Files          96       96              
  Lines       11429    11453      +24     
==========================================
+ Hits         9402     9472      +70     
+ Misses       2027     1981      -46     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Saransh-cpp
Copy link
Member Author

@nsmith- could you check if this is working well? The new tests and the old tests are passing.

The code is a bit dirty; I will refactor it tomorrow and make this ready for review.

@Saransh-cpp
Copy link
Member Author

I think this should be considered a breaking change in vector. @henryiii, how should the version be incremented once this is in? Should it go to v2.0.0 or just v1.2.0?

@Saransh-cpp Saransh-cpp marked this pull request as ready for review January 25, 2024 13:51
Copy link
Member

@jpivarski jpivarski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the right thing to do and I read over the (passing) tests, and I think they cover all of the cases.

@nsmith-
Copy link
Member

nsmith- commented Jan 25, 2024

I checked out the branch, things look ok interactively, and the problematic coffea tests pass without the explicit cast now. (though some other tests fail, but I think this is on coffea side... checking)

@nsmith-
Copy link
Member

nsmith- commented Jan 25, 2024

Actually no, there is some issue now with the fact that we are subclassing the vector array method mixins:

>       boosted = a.boost(-a.boostvec)

tests/test_nanoevents_vector.py:313:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
.env/lib/python3.10/site-packages/vector/_methods.py:3578: in boost
    return boost_beta3.dispatch(self, booster)
.env/lib/python3.10/site-packages/vector/_compute/lorentz/boost_beta3.py:393: in dispatch
    return _handler_of(v1, v2)._wrap_result(
.env/lib/python3.10/site-packages/vector/_methods.py:4192: in _handler_of
    handler = _demote_handler_vector(
.env/lib/python3.10/site-packages/vector/_methods.py:4144: in _demote_handler_vector
    if len({_handler_priority.index(obj.__module__) for obj in objects}) != 1:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

.0 = <tuple_iterator object at 0x12d673e20>

>   if len({_handler_priority.index(obj.__module__) for obj in objects}) != 1:
E   ValueError: 'coffea.nanoevents.methods.vector' is not in list

.env/lib/python3.10/site-packages/vector/_methods.py:4144: ValueError

(the .boostvec property is just calling .to_beta3())

Copy link
Member Author

@Saransh-cpp Saransh-cpp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nsmith- could you please look at this again? I tested this locally with coffea's vector deprecation branch and all the tests worked for me.

Comment on lines +4144 to +4151
backends = [
next(
x.__module__
for x in type(obj).__mro__
if "vector.backends." in x.__module__
)
for obj in objects
]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This now generalises to classes inheriting vector mixins.

Comment on lines +580 to +592
# respect the type of classes inheriting VectorAwkward classes
is_vector = "vector.backends" in cls.__module__
if issubclass(cls, Momentum):
if issubclass(cls, Vector2D):
return "Momentum2D"
return "Momentum2D" if is_vector else cls.__name__[:-5]
if issubclass(cls, Vector3D):
return "Momentum3D"
return "Momentum3D" if is_vector else cls.__name__[:-5]
if issubclass(cls, Vector4D):
return "Momentum4D"
return "Momentum4D" if is_vector else cls.__name__[:-5]
if issubclass(cls, Vector2D):
return "Vector2D"
return "Vector2D" if is_vector else cls.__name__[:-5]
if issubclass(cls, Vector3D):
return "Vector3D"
return "Vector3D" if is_vector else cls.__name__[:-5]
if issubclass(cls, Vector4D):
return "Vector4D"
return "Vector4D" if is_vector else cls.__name__[:-5]
Copy link
Member Author

@Saransh-cpp Saransh-cpp Jan 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function was always returning Vector classes; hence, performing vector methods on subclasses of vector mixins always returned a scikit-hep vector -

In [5]: a = ak.zip(
   ...:             {
   ...:                 "pt": [10.0, 20.0, 30.0],
   ...:                 "eta": [0.0, 1.1, 2.2],
   ...:                 "phi": [0.1, 0.9, -1.1],
   ...:                 "mass": [1.0, 1.0, 1.0],
   ...:             },
   ...:             with_name="PtEtaPhiMLorentzVector",
   ...:             behavior=vector.behavior,
   ...:         )

In [6]: a
Out[6]: <PtEtaPhiMLorentzVectorArray [{pt: 10, eta: 0, phi: 0.1, ...}, ...] type='3...'>

In [7]: a + a
Out[7]: <LorentzVectorArray [{x: 19.9, y: 2, z: 0, ...}, ..., {...}] type='3 * Lore...'>

In [8]: a - a
Out[8]: <LorentzVectorArray [{x: 0, y: 0, z: 0, t: 0}, ..., {...}] type='3 * Lorent...'>

In [9]: a.to_Vector4D()
Out[9]: <PtEtaPhiMLorentzVectorArray [{pt: 10, eta: 0, phi: 0.1, ...}, ...] type='3...'>

In [10]: a.to_Vector3D()
Out[10]: <MomentumArray3D [{rho: 10, phi: 0.1, ...}, ..., {...}] type='3 * Momentum3...'>

In [11]: a.to_beta3()
Out[11]: <MomentumArray3D [{rho: 0.995, phi: 0.1, ...}, ...] type='3 * Momentum3D[rh...'>

Now, everything works well, but only if a user names the classes as "{anything}Array" -

In [2]: from coffea.nanoevents.methods import vector
   ...: import awkward as ak
   ...: a = ak.zip(
   ...:         {
   ...:             "x": [[1, 2], [], [3], [4]],
   ...:             "y": [[5, 6], [], [7], [8]],
   ...:             "z": [[9, 10], [], [11], [12]],
   ...:             "t": [[50, 51], [], [52], [53]],
   ...:         },
   ...:         with_name="LorentzVector",
   ...:         behavior=vector.behavior,
   ...:     )
   ...: a.boost(-a.boostvec)
Out[2]: <LorentzVectorArray [[{x: 0, y: -8.88e-16, ...}, ...], ...] type='4 * var *...'>

In [3]: a + a
Out[3]: <LorentzVectorArray [[{x: 2, y: 10, z: 18, ...}, ...], ...] type='4 * var *...'>

In [4]: a.to_Vector2D()
Out[4]: <TwoVectorArray [[{x: 1, y: 5}, {...}], ..., [{...}]] type='4 * var * TwoVe...'>

In [5]: a.to_Vector3D()
Out[5]: <ThreeVectorArray [[{x: 1, y: 5, z: 9}, {...}], ...] type='4 * var * ThreeV...'>

Maybe I should mention this somewhere in the docs.

@henryiii
Copy link
Member

IMO, I'd be fine with calling this v1.2.0, since to me this is a bug fix - a rather impactful one, but still a bug fix. Before it would make a Momentum4D vector with only three dimensions, now the dimensions line up.

But also feel free to do what seems best to you.

@lgray
Copy link

lgray commented Jan 26, 2024

Ah - this is great! Just seeing the PR now. :-)

@nsmith-
Copy link
Member

nsmith- commented Jan 26, 2024

There is still an issue with (a-b) vs. -(b-a):

import awkward as ak
from coffea.nanoevents.methods import vector

a = ak.zip(
    {
        "x": [10.0, 20.0, 30.0],
        "y": [-10.0, 20.0, 30.0],
        "z": [5.0, 10.0, 15.0],
        "t": [16.0, 31.0, 46.0],
    },
    with_name="LorentzVector",
    behavior=vector.behavior,
)
b = ak.zip(
    {
        "x": [-10.0, 20.0, -30.0],
        "y": [-10.0, -20.0, 30.0],
        "z": [5.0, -10.0, 15.0],
    },
    with_name="ThreeVector",
    behavior=vector.behavior,
)
c = ak.zip(
    {"rho": [-10.0, 13.0, 15.0], "phi": [1.22, -1.0, 1.0]},
    with_name="TwoVector",
    behavior=vector.behavior,
)

print("a-b")
print(a-b)
print(ak.parameters(a-b))
print("-(b-a)")
print(-(b-a))
print(ak.parameters(-(b-a)))

produces

a-b
[{x: 20, y: 0, z: 0, t: 16}, {x: 0, ...}, {x: 60, y: 0, z: 0, t: 46}]
{'__record__': 'LorentzVector'}
-(b-a)
[{x: 20, y: -0, z: -0}, {x: -0, y: 40, z: 20}, {x: 60, y: -0, z: -0}]
{'__record__': 'ThreeVector'}

They have a type matching their content, but the type demotion isn't working (there should be no t in the first case)
Though a+b and b+a seem to be working OK.
edit: no, a+b and b+a are both returning LorentzVectorArray type

@Saransh-cpp
Copy link
Member Author

I think this is on the coffea side (on the use_scikithep_vector branch)? I might be wrong -

https://github.com/CoffeaTeam/coffea/blob/894496fc9fe8d3be79263bf7b03e72c18aaf13de/src/coffea/nanoevents/methods/vector.py#L507-L545

@nsmith-
Copy link
Member

nsmith- commented Jan 29, 2024

@Saransh-cpp you're right, it's to do with how we are building the subclasses' behavior dispatch tables. I've made a fix that now passes tests on the coffea PR with this PR.

@lgray
Copy link

lgray commented Jan 30, 2024

Nice - Since we're deprecating nanoevents.vector in this next release of coffea. I think a reasonable timeline for getting a new version of vector with this is and merging the corresponding coffea PR is the beginning of March. Any faster than that and people may get spooked.

@lgray
Copy link

lgray commented Jan 30, 2024

@Saransh-cpp concerning the version number for the next release. This isn't a major design-changing reimplementation of the package, but it does change behavior significantly, so 1.2.0 makes since IMO.

@Saransh-cpp Saransh-cpp changed the title feat: vectors should be demoted to the lowest dimension fix: vectors should be demoted to the lowest dimension Jan 30, 2024
@Saransh-cpp
Copy link
Member Author

Thanks for the reviews! Given that we don't have any release schedule for vector, I'll just merge this in and create a new release.

@Saransh-cpp Saransh-cpp merged commit 6aef998 into scikit-hep:main Jan 30, 2024
21 checks passed
@Saransh-cpp Saransh-cpp deleted the new-demotion-rules branch January 30, 2024 12:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Record type is not correct for downcasting awkward vectors
6 participants