My modification to the geodesic regression example fails -- but not clear why (at least to me) #427
-
My eventual goal is to introduce weights in the optimization. First, I ran the example from the tutorial:

```julia
Y = similar(x0)
@show RegressionCost(data, t)(M, x0) # got 0.14286244488043887
RegressionGradient!(data, t)(M, Y, x0)
```

Then I made a modification to handle weights, and again calculated the initial cost and gradient:

```julia
xw0 = ArrayPartition(x0, fill(1 / n, n))
M2 = M × ProbabilitySimplex(n - 1)
is_point(M2, xw0) # returns true

# Note the scaling by `n`, since the weights are in the probability simplex
Y3 = similar(xw0)
@show calc_cost2(RegressionCost(data, t), M2, xw0) * n # got 0.14286244488043887
calc_gradient2!(RegressionGradient!(data, t), M2, Y3, xw0)
norm(n * Y3[M2, 1] - Y) / norm(Y) # got 4.228695411939743e-16
```

My cost and gradient evaluations match. Again, I might eventually optimize over the weights as well, but not for now. When I then run the optimization over `M2`, it makes no progress.

Why doesn't it make any progress, when the example problem (with the same cost and gradient) works? FYI: my final optimization will not be over the entire simplex, as there would be many local optima with all weight on 2 points and the geodesic going through those 2 points. I just want to keep the weights around, either as a parameter, or by modifying the objective to penalize non-uniform weights. Thanks in advance!
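(Since `calc_cost2` and `calc_gradient2!` are not shown above, here is a hypothetical sketch of a weighted cost that is consistent with the `* n` rescaling, assuming `M` is the `TangentBundle(Sphere(2))` from the tutorial and `data`, `t` are as there; the function name and structure are my own reconstruction, not the code actually used.)

```julia
using Manifolds

# Hypothetical weighted cost on M2 = M × ProbabilitySimplex(n-1). With uniform
# weights w_i = 1/n it equals the tutorial's RegressionCost divided by n,
# which matches the `* n` rescaling above.
function weighted_regression_cost(M2, xw, data, t)
    TB = M2[1]                    # tangent-bundle component (the tutorial's M)
    S = base_manifold(TB)         # the underlying sphere
    px, w = xw[M2, 1], xw[M2, 2]  # geodesic parameters and simplex weights
    p, X = px[TB, :point], px[TB, :vector]
    return sum(wi * distance(S, exp(S, p, ti * X), di)^2 for (wi, ti, di) in zip(w, t, data)) / 2
end
```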
-
Hi, what do you mean when you say your gradient calculations are “matching”? Have you, for example, verified the gradient with https://manoptjl.org/stable/helpers/checks/#Manopt.check_gradient? I can try to find some time on the weekend to take a closer look.
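A minimal sketch of such a check, assuming the tutorial's `M`, `data`, `t`, and `x0` are in scope (the allocating wrapper `grad_f` is introduced here just for illustration):

```julia
using Manifolds, Manopt

# Allocating wrapper so the in-place gradient fits the (M, p) -> X
# convention that check_gradient expects.
function grad_f(M, p)
    Y = zero_vector(M, p)
    RegressionGradient!(data, t)(M, Y, p)
    return Y
end

# Compares the cost along a curve from x0 with the first-order model the
# gradient predicts; with Plots.jl loaded, plot=true visualizes the error decay.
check_gradient(M, RegressionCost(data, t), grad_f, x0; plot=true)
```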
-
Thanks for the help. Incidentally, I believe in `RegressionGradient!` there needs to be

and for `RegressionGradient2a!`,
-
I tried my best today to recreate your example, but since this is incremental, it is a bit hard to reproduce your results and problems. Can you maybe post a complete example? That would be great, because sure – I can spend 30+ minutes to get an example to run, but if I only have 30 minutes available, like I had just now, that leaves a negative number of minutes to check the example :/
-
Concerning the question in the linked code about why we do the PCA in a basis: on the sphere at a point $p^*$ we have a tangent space, and if we have data in this space, that is data in a 2D space. But the tangent vectors are always stored as vectors orthogonal to $p^*$, so as something 3D, while only a 2D PCA makes sense. So when we represent the tangent vectors by their coordinates in a basis, we get exactly what we need: “2D data”.

For the rest: sorry for the bug. I will probably remove that tutorial in a next release. Once someone fixes it and writes a tutorial that also includes a gradient check (the tutorial predates the gradient check implementation by several years), it should be added to ManoptExamples.
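As an illustration (a sketch, not part of the tutorial): coordinates in an orthonormal basis turn the 3D ambient representation of a tangent vector into the 2D data a PCA can act on.

```julia
using Manifolds

M = Sphere(2)
p = [0.0, 0.0, 1.0]              # the base point p*
X = [0.6, 0.8, 0.0]              # a tangent vector at p, stored with 3 ambient coordinates

B = DefaultOrthonormalBasis()
c = get_coordinates(M, p, X, B)  # 2 coordinates: the "2D data" for the PCA
X2 = get_vector(M, p, c, B)      # round-trip back to the 3D ambient representation
```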
-
Great point. Thank you!
Quoting Mateusz Baran's earlier reply:
> Yes, technically not using bases explicitly for PCA would work in many cases. It's just not particularly general, because in JuliaManifolds tangent vectors are not always represented using plain arrays, so you can't be sure standard PCA algorithms apply. For a sphere they do, though.
>
> Sometimes actually obtaining a basis is computationally expensive, and PCA doesn't really need a basis, as you demonstrated with the sphere. I thought about adding more general [frames](https://en.wikipedia.org/wiki/Frame_(linear_algebra)) to have fast PCA in cases where getting a basis is expensive, but it hasn't reached the top of my priority list yet.
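For concreteness, a sketch of the basis-free route on the sphere, where tangent vectors are plain 3-vectors orthogonal to `p`, so a standard SVD-based PCA in the ambient space works directly (illustration only, assuming nothing beyond Manifolds.jl and the standard library):

```julia
using Manifolds, LinearAlgebra

M = Sphere(2)
p = [0.0, 0.0, 1.0]
Xs = [rand(M; vector_at=p) for _ in 1:10]  # plain 3-vectors, each orthogonal to p

# Standard ambient-space PCA via the SVD; no tangent-space basis needed.
A = reduce(hcat, Xs)
U, S, _ = svd(A)
# At most two singular values are nonzero, and the corresponding columns of U
# are principal directions lying in the tangent space at p.
```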