Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace std::pow with integer powers to amrex::Math::powi #1426

Merged
merged 5 commits into from
Dec 31, 2023

Conversation

zhichen3
Copy link
Collaborator

With updated network to reflect changes in pynucastro/pynucastro#709.

Added a script and CI to detect cases where we can potentially use powi instead of std::pow for integer powers.

@zingale
Copy link
Member

zingale commented Dec 22, 2023

@zingale
Copy link
Member

zingale commented Dec 31, 2023

@psharda @BenWibking do you want to look at the primordial chem changes?
the idea here is that using the templated pow for integer powers should be faster than the library call.

@BenWibking
Copy link
Collaborator

@psharda @BenWibking do you want to look at the primordial chem changes? the idea here is that using the templated pow for integer powers should be faster than the library call.

Is this for debug builds?

@zingale
Copy link
Member

zingale commented Dec 31, 2023

for any build. It was pointed out to us that C++ no longer has an integer power variant of pow(). It looks like that was removed with C++11:

https://en.cppreference.com/w/cpp/numeric/math/pow

@BenWibking
Copy link
Collaborator

for any build. It was pointed out to us that C++ no longer has an integer power variant of pow(). It looks like that was removed with C++11:

https://en.cppreference.com/w/cpp/numeric/math/pow

On x86, gcc optimizes it out anyway: https://godbolt.org/z/z3nvW1vqM.

Does this happen for GPU builds? I unfortunately can't parse nvptx to figure out if it's doing this in CUDA.

@BenWibking
Copy link
Collaborator

BenWibking commented Dec 31, 2023

Ah, well. It looks like gcc only optimizes pow(x, 1) and pow(x, 2) to avoid calls to pow. For higher integer powers, it apparently doesn't optimize it out: https://godbolt.org/z/nd1rqxG8b.

For reference, this is also the case for Clang: https://godbolt.org/z/v7Yejv1M8

@maxpkatz
Copy link
Member

In general, NVIDIA will replace std::pow with the device functions powf or pow which only accept float/double for the exponent argument. This will not be optimized away for the integer case (at least in general; there may be special cases, but I don't remember there being any).

@BenWibking
Copy link
Collaborator

@psharda @BenWibking do you want to look at the primordial chem changes? the idea here is that using the templated pow for integer powers should be faster than the library call.

It looks like our unit test passes, so I'm fine with this.

We do have a lot of negative integer powers - just to be sure, does it handle those correctly?

@zingale
Copy link
Member

zingale commented Dec 31, 2023

yes:

 189 //! Return pow(x, Power), where Power is an integer known at compile time                             
 190 template <int Power, typename T,                                                                      
 191     typename = typename std::enable_if<!std::is_integral<T>() || Power>=0>::type>                     
 192 AMREX_FORCE_INLINE                                                                                    
 193 constexpr T powi (T x) noexcept                                                                       
 194 {                                                                                                     
 195     if constexpr (Power < 0) {                                                                        
 196         return T(1)/powi<-Power>(x);                                                                  
 197     } else if constexpr (Power == 0) { ```

@zingale zingale merged commit 32fa759 into AMReX-Astro:development Dec 31, 2023
26 of 27 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants