-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SapModel caches the Hessian factorization #21421
SapModel caches the Hessian factorization #21421
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This deserves standalone unit tests for SapHessianFactorization
, but I figured I could assing a reviewer to get a first pass before commiting to those tests.
+@sherm1 for feature review please.
Reviewable status: LGTM missing from assignee sherm1(platform), needs at least two assigned reviewers
a1d62c3
to
0c829e5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not that it matters, but now SapHessianFactorization
was replace by DenseSuperNodalSolver
.
Reviewable status: LGTM missing from assignee sherm1(platform), needs at least two assigned reviewers
0c829e5
to
1fc1b77
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feature review checkpoint.
Reviewed 3 of 6 files at r1, all commit messages.
Reviewable status: 2 unresolved discussions, LGTM missing from assignee sherm1(platform), needs at least two assigned reviewers
multibody/contact_solvers/sap/dense_supernodal_solver.h
line 38 at r1 (raw file):
bool DoSetWeightMatrix( const std::vector<Eigen::MatrixXd>& block_diagonal_G) final; Eigen::MatrixXd DoMakeFullMatrix() const final { return H_; }
BTW is there any guarantee H_ has been calculated when this is called? Is there a way to check with a DRAKE_DEMAND or whatever? Also I'm wondering why this returns a copy rather than a reference?
multibody/contact_solvers/sap/dense_supernodal_solver.h
line 52 at r1 (raw file):
MatrixX<double> H_; // N.B. This instantiates an in-place LDLT solver that uses the memory in H_. Eigen::LDLT<MatrixX<double>> Hldlt_{H_};
BTW Eigen's documentation suggests it won't do this in place unless the MatrixType template argument is an EigenRef. Have you checked that this is actually doing what you want it to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feature with a few minor comments.
Reviewed 3 of 6 files at r1.
Reviewable status: 13 unresolved discussions, needs at least two assigned reviewers
multibody/contact_solvers/sap/dense_supernodal_solver.h
line 24 at r1 (raw file):
// outlive this object. // @throws std::runtime_error if A or J is nullptr. // @throws std::runtime_error if a block in A is not square.
nit: should just promise std::exception here
multibody/contact_solvers/sap/dense_supernodal_solver.cc
line 42 at r1 (raw file):
const MatrixX<double> Jdense = J_->MakeDenseMatrix(); // Make dense Hessian matrix G.
minor: Hessian -> Weight
multibody/contact_solvers/sap/dense_supernodal_solver.cc
line 54 at r1 (raw file):
Gdense.block(offset, offset, ni, ni) = Gi; offset += ni; }
BTW if G doesn't have enough rows it will be left with zeroes to fill in the rest. Is that OK, or is that also a kind of incompatibility with the Jacobian?
multibody/contact_solvers/sap/sap_model.h
line 123 at r1 (raw file):
// SuperNodalSolver, with the underlying type specified by // SapHessianFactorizationType. class HessianFactorizationCache {
BTW this is a "cache entry" rather than a "cache". Probably should be renamed HessianFactorizationCacheEntry to make that clear.
multibody/contact_solvers/sap/sap_model.h
line 132 at r1 (raw file):
HessianFactorizationCache& operator=(HessianFactorizationCache&&) = default; // Creates an empty cache, with no heap allocation. This is used to delay the
nit: cache -> cache entry
multibody/contact_solvers/sap/sap_model.h
line 145 at r1 (raw file):
// // @warning This is a potentially expensive constructor, performing the // necessary symbolic analysys for the case of sparse factorizations.
typo: analysys -> analysis
multibody/contact_solvers/sap/sap_model.h
line 179 at r1 (raw file):
// avoided at all costs. Therefore this implementation always throws. We can // do this because in the well controlled lifespan of a context generated by a // SapModel, we know that a HessianFactorizationCache is default construted
typo: construted -> constructed
BTW great comment, thanks!
multibody/contact_solvers/sap/sap_model.h
line 410 at r1 (raw file):
// // @note This is an expensive computation and therefore only T = double is // supported.
BTW not clear why being expensive means only T=double (we have other expensive things that can be Autodiffed). Can you elaborate?
multibody/contact_solvers/sap/sap_model.h
line 412 at r1 (raw file):
// supported. // // @trows std::runtime_error when attempting to perform the computation on T
typo: trows -> throws
multibody/contact_solvers/sap/sap_model.cc
line 192 at r1 (raw file):
// N.B. The default constructible SapHessianFactorizationCache is empty. Since // the computation of the factorization is costly, we dealy its construction
typo: dealy -> delay
multibody/contact_solvers/sap/sap_model.cc
line 193 at r1 (raw file):
// N.B. The default constructible SapHessianFactorizationCache is empty. Since // the computation of the factorization is costly, we dealy its construction // to the very first time is computed.
typo: time is -> time it is
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+@SeanCurtis-TRI for platform review per rotation, please
Reviewable status: 13 unresolved discussions, LGTM missing from assignee SeanCurtis-TRI(platform)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @sherm1's comments and have added a few of my own.
Reviewed 6 of 6 files at r1, all commit messages.
Reviewable status: 18 unresolved discussions
multibody/contact_solvers/sap/dense_supernodal_solver.h
line 38 at r1 (raw file):
Previously, sherm1 (Michael Sherman) wrote…
BTW is there any guarantee H_ has been calculated when this is called? Is there a way to check with a DRAKE_DEMAND or whatever? Also I'm wondering why this returns a copy rather than a reference?
The documentation in supernodal_solver.h
on MakeFullMatrix()
says that it throws if SetWeightMatrix()
hasn't been called. So, the proposed DRAKE_DEMAND
is rolled into the parent class.
multibody/contact_solvers/sap/dense_supernodal_solver.h
line 43 at r1 (raw file):
int DoGetSize() const final { return size(); } const std::vector<MatrixX<double>>* A_{nullptr};
BTW Given that this isn't movable or copyable, A_
and J_
could be const references instead of pointers to const. It eliminates the need to default construct values and guarantees that they are non-null (spelled as is, the line A_ = nullptr;
is still valid.
multibody/contact_solvers/sap/sap_model.h
line 134 at r1 (raw file):
// Creates an empty cache, with no heap allocation. This is used to delay the // creation of the expensive Hessian structure to when it is strictly needed. // After instantantiations with this constructor is_empty() is `true`.
nit: typo
Suggestion:
instantiations
multibody/contact_solvers/sap/sap_model.h
line 147 at r1 (raw file):
// necessary symbolic analysys for the case of sparse factorizations. // // @pre A and J must not be nullptr.
nit: Slight grammar tweak.
Suggestion:
are not
multibody/contact_solvers/sap/sap_model.cc
line 45 at r1 (raw file):
throw std::runtime_error( "Attempting to clone an expensive Hessian factorization."); return nullptr;
nit: Presence of throw
negates the need for return
.
multibody/contact_solvers/sap/sap_model.cc
line 63 at r1 (raw file):
for (int i = 0; i < num_rhs; ++i) { auto rhs_i = rhs->col(i); rhs_i = factorization()->Solve(rhs_i);
BTW It looks ever so slightly odd that this doesn't call factorization()->SolveInPlace()
.
689c928
to
5b180d2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the quick review guys!
I added the promissed unit tests on DenseSuperNodalSolver
. I believe that also helps mitiage some of your questionns in the first round of reviews.
Reviewable status: complete! all discussions resolved, LGTM from assignees SeanCurtis-TRI(platform),sherm1(platform)
multibody/contact_solvers/sap/dense_supernodal_solver.h
line 38 at r1 (raw file):
Previously, sherm1 (Michael Sherman) wrote…
BTW is there any guarantee H_ has been calculated when this is called? Is there a way to check with a DRAKE_DEMAND or whatever? Also I'm wondering why this returns a copy rather than a reference?
added comment.
multibody/contact_solvers/sap/dense_supernodal_solver.h
line 43 at r1 (raw file):
Previously, SeanCurtis-TRI (Sean Curtis) wrote…
BTW Given that this isn't movable or copyable,
A_
andJ_
could be const references instead of pointers to const. It eliminates the need to default construct values and guarantees that they are non-null (spelled as is, the lineA_ = nullptr;
is still valid.
Thanks, I like that.
multibody/contact_solvers/sap/dense_supernodal_solver.h
line 52 at r1 (raw file):
Previously, sherm1 (Michael Sherman) wrote…
BTW Eigen's documentation suggests it won't do this in place unless the MatrixType template argument is an EigenRef. Have you checked that this is actually doing what you want it to?
Great catch Sherm! It only works if you explictly spell the Eigen::Ref
in the template argument.
multibody/contact_solvers/sap/dense_supernodal_solver.cc
line 54 at r1 (raw file):
Previously, sherm1 (Michael Sherman) wrote…
BTW if G doesn't have enough rows it will be left with zeroes to fill in the rest. Is that OK, or is that also a kind of incompatibility with the Jacobian?
nice, I added a check for this.
multibody/contact_solvers/sap/sap_model.h
line 123 at r1 (raw file):
Previously, sherm1 (Michael Sherman) wrote…
BTW this is a "cache entry" rather than a "cache". Probably should be renamed HessianFactorizationCacheEntry to make that clear.
right now I am naming it consistently with the rest of the cache structs in this file.
multibody/contact_solvers/sap/sap_model.h
line 410 at r1 (raw file):
Previously, sherm1 (Michael Sherman) wrote…
BTW not clear why being expensive means only T=double (we have other expensive things that can be Autodiffed). Can you elaborate?
added comment.
multibody/contact_solvers/sap/sap_model.cc
line 45 at r1 (raw file):
Previously, SeanCurtis-TRI (Sean Curtis) wrote…
nit: Presence of
throw
negates the need forreturn
.
I sometimes forget how smart the compiler is.
multibody/contact_solvers/sap/sap_model.cc
line 63 at r1 (raw file):
Previously, SeanCurtis-TRI (Sean Curtis) wrote…
BTW It looks ever so slightly odd that this doesn't call
factorization()->SolveInPlace()
.
I agree. Added note.
07fcb79
to
d303c75
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 6 files at r2, 2 of 2 files at r3, all commit messages.
Reviewable status: 7 unresolved discussions
multibody/contact_solvers/sap/dense_supernodal_solver.h
line 32 at r1 (raw file):
// @returns the size of the system. int size() const { return H_.rows(); }
nit: Why this method when the parent class already has GetSize()
? It really just feels like the implementation of DoGetSize()
. But adding a redundant public API when the inherited API already includes something doesn't seem like a good idea.
multibody/contact_solvers/sap/dense_supernodal_solver.h
line 48 at r3 (raw file):
int DoGetSize() const final { return size(); } static int CalcSize(const std::vector<MatrixX<double>>& A) {
nit: this function does not appear to be invoked.
multibody/contact_solvers/sap/sap_model.cc
line 63 at r3 (raw file):
auto rhs_i = rhs->col(i); // N.B. Unfortunately SolveInPlace() only accepts VectorXd*, while here the // return of col() is a block object. Therefore, we are forced
nit: Looks like your spacebar ran wild.
Suggestion:
we are forced
multibody/contact_solvers/sap/test/dense_supernodal_solver_test.cc
line 81 at r3 (raw file):
DRAKE_DEMAND(block_positions.size() == dense_positions.size()); DRAKE_DEMAND(block_positions.size() == block_sizes.size()); const int num_blocks = block_positions.size();
nit: I'm surprised you're not getting a signed-unsigned mismatch.
Suggestion:
const int num_blocks = ssize(block_positions);
multibody/contact_solvers/sap/test/dense_supernodal_solver_test.cc
line 121 at r3 (raw file):
BlockSparseMatrixBuilder<double> builder(num_row_blocks_of_J, num_col_blocks_of_J, 3); for (auto& block : Jtriplets) {
nit: You're not actually using its mutable nature, so better not to imply such.
Suggestion:
for (const auto& block
multibody/contact_solvers/sap/test/dense_supernodal_solver_test.cc
line 129 at r3 (raw file):
DenseSuperNodalSolver solver(&blocks_of_A, &J); EXPECT_EQ(solver.GetSize(), 6);
BTW As to my earlier point about the two size APIs, this part of the test makes it look particularly silly.
multibody/contact_solvers/sap/test/dense_supernodal_solver_test.cc
line 223 at r3 (raw file):
} // Calling Solve() before Factor() throws.
nit: This doesn't feel like a test of DenseSuperNodalSolver
. The dut doesn't take responsibility for this throw, that's part of the parent class's responsibilities.
Same applies to order of operations on MakeFullMatrix()
and Factor()
below.
d303c75
to
4e8a622
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! all discussions resolved, LGTM from assignees SeanCurtis-TRI(platform),sherm1(platform)
multibody/contact_solvers/sap/dense_supernodal_solver.h
line 48 at r3 (raw file):
Previously, SeanCurtis-TRI (Sean Curtis) wrote…
nit: this function does not appear to be invoked.
good catch, thanks
multibody/contact_solvers/sap/test/dense_supernodal_solver_test.cc
line 81 at r3 (raw file):
Previously, SeanCurtis-TRI (Sean Curtis) wrote…
nit: I'm surprised you're not getting a signed-unsigned mismatch.
I've sen these mismatch errors when comparing an int to the result of size().
I belive the assignment always works.
multibody/contact_solvers/sap/test/dense_supernodal_solver_test.cc
line 223 at r3 (raw file):
Previously, SeanCurtis-TRI (Sean Curtis) wrote…
nit: This doesn't feel like a test of
DenseSuperNodalSolver
. The dut doesn't take responsibility for this throw, that's part of the parent class's responsibilities.Same applies to order of operations on
MakeFullMatrix()
andFactor()
below.
I agree. But it seemed like with very little you get a nice coverage of all cases making this very self contained.
Do you think it's a defect and I should remove them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 4 files at r4, all commit messages.
Reviewable status: complete! all discussions resolved, LGTM from assignees SeanCurtis-TRI(platform),sherm1(platform)
multibody/contact_solvers/sap/test/dense_supernodal_solver_test.cc
line 81 at r3 (raw file):
Previously, amcastro-tri (Alejandro Castro) wrote…
I've sen these mismatch errors when comparing an int to the result of size().
I belive the assignment always works.
Nice to have things explicit. :) Thanks.
multibody/contact_solvers/sap/test/dense_supernodal_solver_test.cc
line 223 at r3 (raw file):
Previously, amcastro-tri (Alejandro Castro) wrote…
I agree. But it seemed like with very little you get a nice coverage of all cases making this very self contained.
Do you think it's a defect and I should remove them?
I can see the appeal behind the argument.
The derived class relies on behavior documented by the parent class to be correct. To make sure the derived class doesn't accidentally get broken if someone changes the parent class, we can put in a couple of regression tests so we know if such a thing happens.
However, I consider that to be death by a thousand cuts. I'd put my reliance on the review process -- if a documented invariant gets changed on a virtual interface in a PR, the reviewers should notice if there is no downstream accommodation or not. The likelihood and frequency of such a change is so small/low that it doesn't seem to be worth bulking up the unit tests for it, even by such a small amount.
And, in a unit test, it misleadingly implies responsibilities where they don't lie.
So, I'd argue that there's no real upside to keep these tests on the parent's implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!
Reviewed 2 of 6 files at r2, 4 of 4 files at r4, all commit messages.
Reviewable status: complete! all discussions resolved, LGTM from assignees SeanCurtis-TRI(platform),sherm1(platform)
Towards the support of gradients through SAP, see #21370.
We need the Hessian to propagate gradients. The best way to ensure it's up-to-date, is by caching it in the model's context as we do with every other quantity that depends on v.
As a quick reminder, everything SAP computes is a function of generalized velocities v. Thefore the "state" of the SAP model is generalized velocities v stored in the context. Every other quantity, say impulses, cost, gradients and even Hessian, are functions of v that we cache in the context.
This change is