
MOI Hessian evaluation is slow with reduced-space predictors #102

Closed
Robbybp opened this issue Sep 4, 2024 · 6 comments

Robbybp commented Sep 4, 2024

Probably due to the lack of common subexpressions; see jump-dev/JuMP.jl#3738.

For example:

```julia
using JuMP
import MathOptAI as MOAI
import MathOptInterface as MOI

function make_model(; dimen = 100, reduced_space = false)
    model = Model()

    input_dimen = 100
    hidden1_dimen = dimen
    hidden2_dimen = dimen
    output_dimen = 10

    @variable(model, x[1:input_dimen] >= 0)
    @objective(model, Min, sum(x.^2))

    A1 = ones(hidden1_dimen, input_dimen)
    b1 = ones(hidden1_dimen)
    A2 = ones(hidden2_dimen, hidden1_dimen)
    b2 = ones(hidden2_dimen)
    A3 = ones(output_dimen, hidden2_dimen)
    b3 = ones(output_dimen)

    f = MOAI.Pipeline(
        MOAI.Affine(-A1, b1),
        MOAI.Sigmoid(),
        MOAI.Affine(A2, b2),
        MOAI.Sigmoid(),
        MOAI.Affine(-A3, b3),
    )
    if reduced_space
        y = MOAI.add_predictor(model, MOAI.ReducedSpace(f), x)
    else
        y = MOAI.add_predictor(model, f, x)
    end

    @constraint(model, -75.0 .<= y .<= 75.0)

    return model
end

function time_hessian_eval(model)
    nlmod = MOI.Nonlinear.Model()
    cons = []
    for con in JuMP.all_constraints(model, include_variable_in_set_constraints = true)
        conobj = JuMP.constraint_object(con)
        MOI.Nonlinear.add_constraint(nlmod, conobj.func, conobj.set)
        push!(cons, con)
    end

    vars = JuMP.index.(JuMP.all_variables(model))
    evaluator = MOI.Nonlinear.Evaluator(nlmod, MOI.Nonlinear.SparseReverseMode(), vars)
    MOI.initialize(evaluator, [:Hess])

    hessian_structure = MOI.hessian_lagrangian_structure(evaluator)
    primals = ones(length(vars))
    duals = ones(length(cons))
    vals = zeros(length(hessian_structure))
    # Warm-up call so the timed call below excludes compilation.
    MOI.eval_hessian_lagrangian(evaluator, vals, primals, 1.0, duals)

    t0 = time()
    @time MOI.eval_hessian_lagrangian(evaluator, vals, primals, 1.0, duals)
    t_hess_vals = time() - t0

    return t_hess_vals
end

m_full = make_model(reduced_space = false)
m_reduced = make_model(reduced_space = true)

println("Full-space")
t_full_space_hess = time_hessian_eval(m_full)
println("Reduced-space")
t_reduced_space_hess = time_hessian_eval(m_reduced)
```

I get:

```
Full-space
  0.000036 seconds
Reduced-space
  1.663424 seconds (29 allocations: 464 bytes)
```

Feel free to close this, since it's a known issue; I just wanted to document that I've been hitting this bottleneck.

@Robbybp Robbybp changed the title JuMP Hessian evaluation is slow with reduced-space predictors MOI Hessian evaluation is slow with reduced-space predictors Sep 4, 2024

odow commented Sep 4, 2024

Yip. This doesn't surprise me in the slightest. Try AMPL via AmplNLWriter? (And shouldn't you be going to the airport soon?)
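
A minimal sketch of the AmplNLWriter route, assuming the AmplNLWriter and Ipopt_jll packages are installed and reusing `make_model` from the first comment (the package names and the bundled Ipopt binary are assumptions; any AMPL-compatible solver executable works):

```julia
# Sketch: route the same model through AmplNLWriter, which writes an .nl
# file so that derivative evaluation happens in the AMPL solver library
# rather than in MOI's SparseReverseMode AD. Assumes AmplNLWriter.jl and
# Ipopt_jll.jl are installed, and that `make_model` is defined as above.
using JuMP
import AmplNLWriter
import Ipopt_jll

model = make_model(reduced_space = true)
set_optimizer(model, () -> AmplNLWriter.Optimizer(Ipopt_jll.amplexe))
optimize!(model)
```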


odow commented Sep 4, 2024

I think the main problem is that we scalarize Ax + b, so dense matrices generate a lot of terms in the tape.

And in this case, the Hessian is dense.

But it's a good example for me to profile and understand, so let's leave this open.
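
To put rough numbers on that (a back-of-the-envelope estimate, not taken from the profile): scalarizing a dense d-by-d `A * x + b` produces d rows of d multiply-add terms each, so every hidden layer adds on the order of d^2 nodes to the tape, and because the reduced-space constraints couple every input with every other input, the Hessian of the Lagrangian is dense in the 100 original variables:

```julia
# Back-of-the-envelope sizes for the example above:
# d = 100 inputs, two dense 100×100 hidden affine layers, 10 outputs.
d, output_d = 100, 10

# Each scalarized row of A * x + b has d multiply-add terms, so a dense
# d×d layer contributes ~d^2 nodes to the expression tape.
terms_per_hidden_layer = d * d
terms_output_layer = output_d * d
total_affine_terms = 2 * terms_per_hidden_layer + terms_output_layer

# In the reduced space there are only the 100 original variables, and the
# composed constraints make every pair of them interact: a dense
# lower-triangular Hessian, diagonal included.
dense_hessian_nnz = d * (d + 1) ÷ 2

println((total_affine_terms, dense_hessian_nnz))
```

In the full-space formulation the same nonlinearity is spread across the extra intermediate variables, so each scalar constraint touches only a few variables and the Hessian stays sparse, which is consistent with the timing gap above.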


odow commented Sep 4, 2024

```julia
using JuMP
import MathOptAI as MOAI

function make_model(d = 100)
    model = Model()
    output_d = 10
    @variable(model, x[1:d] >= 0)
    @objective(model, Min, sum(x.^2))
    f = MOAI.Pipeline(
        MOAI.Affine(-ones(d, d), ones(d)),
        MOAI.Sigmoid(),
        MOAI.Affine(ones(d, d), ones(d)),
        MOAI.Sigmoid(),
        MOAI.Affine(-ones(output_d, d), ones(output_d)),
    )
    y = MOAI.add_predictor(model, MOAI.ReducedSpace(f), x)
    @constraint(model, -75.0 .<= y .<= 75.0)
    nl_model = MOI.Nonlinear.Model()
    n_cons = 0
    for con in JuMP.all_constraints(
        model;
        include_variable_in_set_constraints = false,
    )
        o = JuMP.constraint_object(con)
        MOI.Nonlinear.add_constraint(nl_model, o.func, o.set)
        n_cons += 1
    end
    variables = index.(all_variables(model))
    evaluator = MOI.Nonlinear.Evaluator(
        nl_model,
        MOI.Nonlinear.SparseReverseMode(),
        variables,
    )
    MOI.initialize(evaluator, [:Hess])
    hessian_structure = MOI.hessian_lagrangian_structure(evaluator)
    H = zeros(length(hessian_structure))
    x = ones(length(variables))
    σ = 1.0
    μ = ones(n_cons)
    function profiler()
        x = rand(length(variables))
        MOI.eval_hessian_lagrangian(evaluator, H, x, σ, μ)
        return H
    end
    return profiler
end

profiler = make_model()
@time profiler();
using ProfileView
@profview profiler();
```

[ProfileView flame graph of `eval_hessian_lagrangian` omitted]

Nothing immediately jumps out (if you will) as a bottleneck. So I think this is the algorithm working as expected; it just doesn't like this particular example because of the A * x terms and the dense Hessian.


Robbybp commented Sep 4, 2024

> Try AMPL via AmplNLWriter?

Good idea, will do. Do we exploit "defined variables" in the .nl writer?

> (And shouldn't you be going to the airport soon?)

Waiting for my flight :)


odow commented Sep 4, 2024

> We exploit "defined variables" in the .nl writer?

Nope. But they use a different AD algorithm, so it might help.

This is also the reason that I've added the recent GrayBox support. We can treat the full NN as a user-defined function and compute derivatives across the full model; there's no need to represent the internals explicitly at the JuMP level.


odow commented Sep 9, 2024

Closing as won't-fix for now. Performance issues are definitely on my radar, but slamming a NN at the reduced-space formulation like this is not ideal.

@odow odow closed this as completed Sep 9, 2024