Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add generic pcurves for application specific curves #4554

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

randombit
Copy link
Owner

Unlike the existing system which instantiates everything at compile time based on the curve parameters, this allows choosing the parameters at runtime. This can be used for application specific curves, or when it is desirable to not use the dedicated modules for binary size reasons.

Comparing the three approaches (pcurves, pcurves_generic, BigInt) for some common EC operations (keygen, ECDSA sign, ECDSA verify, ECDH agreement). All numbers are operations/second for the brainpool256r1 curve.

impl keygen sign verify agree
pcurves 20770 16191 7410 5598
generic 10208 8367 4013 3077
BigInt 10405 7686 3707 2463

The advantage of the fully compile-time implementation is clear (and could become even stronger if/when compilers improve their constexpr support), but this module has performance at least as good as the old BigInt-based EC and with a significantly improved situation regarding side channels.

This implementation does not support curves which are cursed:

  • Curves with a order of different bitlength than the field
  • Curves with primes not congruent to 3 modulo 4
  • Curves with a cofactor

One notable cursed curve is P-224 (due to p == 1 mod 4), which is only supported if the dedicated pcurves_secp224r1 module is included.

@randombit randombit requested a review from reneme January 15, 2025 09:31
@randombit
Copy link
Owner Author

randombit commented Jan 15, 2025

One obvious objection here is that pcurves_generic.cpp and pcurves_impl.h have very similar code and indeed the generic version is derived from that header. I could not find good ways of consolidating the code without compromising the efficiency of the _impl.h approach; for example we have to use vectors (vs std::array) a lot more in the generic case both because the sizes of various types are - while bounded - mostly unknown at compile time, and we have to initialize the pointer to the parameters in each object which prevents there from being a viable default constructor. Totally open to suggestions here (as long as they don't impact performance)

I'm inclined to accept it as the cost of supporting application specific/weirdo curves, with the goal firmly in mind that Botan4 removes the third 😂 generic EC implementation currently in legacy_ec_point.

@reneme I know you were concerned about performance of this approach vs the existing BigInt for application specific curves; numbers above seem to be representative IME so far but you might want to check for yourself with your parameters. Easiest way to test is to build twice, once with --disable-modules=legacy_ec_point and once with --disable-modules=pcurves_generic. (You could also just tweak the if statement in ec_inner_data.cpp that calls from_params so as to not have to recompile the whole library)

@randombit randombit force-pushed the jack/pcurves-generic branch 3 times, most recently from 7b5e5ff to a4dd13e Compare January 15, 2025 10:20
Unlike the existing system which instantiates everything at compile time based
on the curve parameters, this allows choosing the parameters at runtime. This
can be used for application specific curves, or when it is desirable to not use
the dedicated modules for binary size reasons.

Comparing the three approaches (pcurves, pcurves_generic, BigInt) for some
common EC operations (keygen, ECDSA sign, ECDSA verify, ECDH agreement).  All
numbers are operations/second for the brainpool256r1 curve.

| impl     | keygen   | sign     | verify   | agree    |
| -------- | -------- | -------- | -------- | -------- |
| pcurves  | 20770    | 16191    | 7410     | 5598     |
| generic  | 10208    | 8367     | 4013     | 3077     |
| BigInt   | 10405    | 7686     | 3707     | 2463     |

The advantage of the fully compile-time implementation is clear (and could
become even stronger if/when compilers improve their constexpr support), but
this module has performance at least as good as the old BigInt-based EC and with
a significantly improved situation regarding side channels.

This implementation does not support curves which are cursed:

* Curves with a order of different bitlength than the field
* Curves with primes not congruent to 3 modulo 4
* Curves with a cofactor

One notable cursed curve is P-224 (due to p == 1 mod 4), which is only
supported if the dedicated pcurves_secp224r1 module is included.
@coveralls
Copy link

coveralls commented Jan 15, 2025

Coverage Status

coverage: 91.231% (+0.03%) from 91.197%
when pulling ddc8521 on jack/pcurves-generic
into 1afcdf2 on master.

@randombit
Copy link
Owner Author

BTW @reneme possibly of interest here, since you mentioned using compressed points in certain situations - point decompression is ~ 10 to 30 times faster faster with pcurves_generic vs BigInt. Interestingly these are implemented exactly the same way, exponentiation to (p+1)/4 in a Montgomery representation using a fixed window modexp, so I'm not sure why the difference is that large.

Copy link
Collaborator

@reneme reneme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No full review, just a bunch of comments from a quick first read.

I could not find good ways of consolidating the code without compromising the efficiency of the _impl.h approach

That's really unfortunate, I'll keep an eye out. But in my experience its fairly hard to generalize compile-time and runtime length information without compromising performance.

src/lib/math/pcurves/pcurves_generic/pcurves_generic.cpp Outdated Show resolved Hide resolved
src/lib/math/pcurves/pcurves_generic/pcurves_generic.cpp Outdated Show resolved Hide resolved
BOTAN_ASSERT_NOMSG(n_words <= PrimeOrderCurve::StorageWords);

std::array<word, PrimeOrderCurve::StorageWords> r = {};
copy_mem(r.data(), n._data(), n_words);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pardon my kink with trying to avoid raw pointers: Perhaps add BigInt::_as_span() -> std::span<const word> (or similarly named) and then go along those lines:

copy_mem(r, n._as_span().first(n_words));

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pardon my kink with trying to avoid raw pointers

😂

This is entirely reasonable especially for if/when mp becomes more span-centric.

src/lib/math/pcurves/pcurves_generic/pcurves_generic.cpp Outdated Show resolved Hide resolved
src/lib/math/pcurves/pcurves_generic/pcurves_generic.cpp Outdated Show resolved Hide resolved
src/lib/math/pcurves/pcurves_generic/pcurves_generic.cpp Outdated Show resolved Hide resolved
@randombit
Copy link
Owner Author

Performance results on another machine (Zen2 Linux laptop, Clang 19) showing quite a nice bump here. So clearly this depends on uarch a bit.

pcurves_brainpool256r1

ECDSA-brainpool256r1 16110 sign/sec; 0.06 ms/op 147229 cycles/op (8056 ops in 500 ms)
ECDSA-brainpool256r1 7405 verify/sec; 0.14 ms/op 320125 cycles/op (3704 ops in 500 ms)
ECDH-brainpool256r1 5585 key agreements/sec; 0.18 ms/op 424426 cycles/op (2794 ops in 500 ms)

pcurves_generic

ECDSA-brainpool256r1 10457 sign/sec; 0.10 ms/op 226743 cycles/op (5229 ops in 500 ms)
ECDSA-brainpool256r1 4941 verify/sec; 0.20 ms/op 479724 cycles/op (2472 ops in 500 ms)
ECDH-brainpool256r1 3719 key agreements/sec; 0.27 ms/op 637269 cycles/op (1860 ops in 500 ms)

legacy_ec_point

ECDSA-brainpool256r1 7743 sign/sec; 0.13 ms/op 306221 cycles/op (3872 ops in 500 ms)
ECDSA-brainpool256r1 3794 verify/sec; 0.26 ms/op 624803 cycles/op (1898 ops in 500 ms)
ECDH-brainpool256r1 2537 key agreements/sec; 0.39 ms/op 934347 cycles/op (1270 ops in 501 ms)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants