Introduce basic support for Vector API #2890
Short rather than int?
In its current state, it would be difficult to use |
(Also, vector operations are probably one of the few places where element size actually has an impact on performance, so keeping it as small as possible makes sense.)
Yeah, I suppose a lot of this would be best duplicated, as the data array PR does, so we can maintain the smaller size "optimisation" where possible. Also, it looks like some changes are needed to the CI/PR run config.
Yes, but I intend to leave it like this for now, as it is experimental anyway.
That should be fixed now, the javadoc task was failing as it requires the |
worldedit-core/src/main/java/com/fastasyncworldedit/core/internal/simd/VectorizedMask.java
worldedit-core/src/main/java/com/fastasyncworldedit/core/internal/simd/SimdSupport.java
...core/src/main/java/com/fastasyncworldedit/core/queue/implementation/ParallelQueueExtent.java
Please take a moment and address the merge conflicts of your pull request. Thanks!
9b25f64 to 52c0c62
Overview
Description
This is an experimental, internal feature to use the Vector API when enabled. The current implementation is designed to cover basic use cases (i.e. `//gmask` with single blocks, `//set` with single blocks, `//replace` with single blocks). This could be expanded, but as that needs to be done rather carefully so as not to make performance worse, I left it at the very basics for now.

A quick example of running `//replace dirt diamond_block` on a selection of 2048x320x2048:

Before:

Note that the `hasSection` calls take up the same time in both profiles, and the same goes for `initLayer`. The difference is in the `filter` call, which results in many additional calls in the non-specialized variant vs. pretty much only memory access and vector instructions in the specialized variant. (There might be ways to optimize without the Vector API here, I guess, but the existing abstraction doesn't work well in either case.)

A few more notes on the benchmark:
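
To illustrate the difference in the `filter` call described above, here is a minimal sketch of a single-block replace over a `short[]` of block ordinals. It is not the code from this PR: the class name, method names, ordinal values, and the lane count of 16 are all assumptions made for illustration. The chunked variant is written in plain Java, without `jdk.incubator.vector`, so that it runs without `--add-modules`; the comments name the Vector API calls that each step roughly corresponds to.

```java
import java.util.Arrays;

// Hypothetical sketch, not taken from the PR.
class ReplaceSketch {
    // Assumed lane count: a 256-bit vector holds 16 short lanes (but only 8 int lanes).
    static final int LANES = 16;

    // Made-up ordinals, used only for this example.
    static final short STONE = 1;
    static final short DIRT = 2;
    static final short DIAMOND_BLOCK = 3;

    // Baseline: one test and one conditional write per element,
    // standing in for a per-block mask test plus pattern application.
    static void replaceScalar(short[] blocks, short from, short to) {
        for (int i = 0; i < blocks.length; i++) {
            if (blocks[i] == from) {
                blocks[i] = to;
            }
        }
    }

    // Chunked compare-and-blend, emulated with plain loops.
    static void replaceChunked(short[] blocks, short from, short to) {
        boolean[] mask = new boolean[LANES];
        // Largest multiple of LANES that fits (cf. VectorSpecies.loopBound).
        int bound = blocks.length - (blocks.length % LANES);
        int i = 0;
        for (; i < bound; i += LANES) {
            // Compare step (cf. ShortVector.fromArray followed by compare(VectorOperators.EQ, from)).
            for (int lane = 0; lane < LANES; lane++) {
                mask[lane] = blocks[i + lane] == from;
            }
            // Blend step (cf. blend(to, mask) followed by intoArray).
            for (int lane = 0; lane < LANES; lane++) {
                if (mask[lane]) {
                    blocks[i + lane] = to;
                }
            }
        }
        // Scalar tail for the elements that do not fill a whole chunk.
        for (; i < blocks.length; i++) {
            if (blocks[i] == from) {
                blocks[i] = to;
            }
        }
    }

    public static void main(String[] args) {
        short[] section = new short[40];
        Arrays.fill(section, STONE);
        section[3] = DIRT;
        section[17] = DIRT;
        section[39] = DIRT; // lands in the scalar tail

        short[] expected = section.clone();
        replaceScalar(expected, DIRT, DIAMOND_BLOCK);
        replaceChunked(section, DIRT, DIAMOND_BLOCK);

        System.out.println(Arrays.equals(section, expected)); // prints "true"
    }
}
```

With the real Vector API, each inner loop would collapse into a single call on a `ShortVector`, which is where the smaller element size pays off: the same vector width covers twice as many `short` lanes as `int` lanes.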
Submitter Checklist