diff --git a/README.md b/README.md index 60aaa2a..6924930 100644 --- a/README.md +++ b/README.md @@ -70,9 +70,10 @@ Jacobian. This is the most efficient method, because it minimizes the memory allocations. -In the following, it is assumed that have defined a function `f!(x::Vector, -fx::Vector)` computing the residual of the system at point `x` and putting it -into the `fx` argument. +In the following, it is assumed that you have defined a function +`f!(x::AbstractVector, fx::AbstractVector)` or, more generally, +`f!(x::AbstractArray, fx::AbstractArray)` computing the residual of the system +at point `x` and putting it into the `fx` argument. In turn, there 3 ways of specifying how the Jacobian should be computed: @@ -93,6 +94,15 @@ df = DifferentiableMultivariateFunction(f!) nlsolve(df, initial_x) ``` +If your function `f!` operates not on `AbstractVector` but on arbitrary +`AbstractArray` you have to specify `initial_x` in the constructor of the +`DifferentiableMultivariateFunction`: + +```jl +df = DifferentiableMultivariateFunction(f!, initial_x) +nlsolve(df, initial_x) +``` + ### Automatic differentiation Another option if you do not have a function computing the Jacobian is to use @@ -105,14 +115,18 @@ nlsolve(f!, initial_x, autodiff = true) ### Jacobian available -If, in addition to `f!`, you have a function `g!(x::Vector, gx::Array)` for -computing the Jacobian of the system, then the syntax is, as in the example -above: +If, in addition to `f!(x::AbstractVector, fx::AbstractVector)`, you have a +function `g!(x::AbstractVector, gx::AbstractMatrix)` for computing the Jacobian +of the system, then the syntax is, as in the example above: ```jl nlsolve(f!, g!, initial_x) ``` +Again it is also possible to specify two functions `f!(x::AbstractArray, +fx::AbstractArray)` and `g!(x::AbstractArray, gx::AbstractMatrix)` that work on +arbitrary arrays `x`. + Note that you should not assume that the Jacobian `gx` passed in argument is initialized to a zero matrix. You must set all the elements of the matrix in the function `g!`. @@ -125,17 +139,32 @@ df = DifferentiableMultivariateFunction(f!, g!) nlsolve(df, initial_x) ``` +If `f!` and `g!` do not operate on vectors the syntax is: + +```jl +df = DifferentiableMultivariateFunction(f!, g!, initial_x) +nlsolve(df, initial_x) +``` + ### Optimization of simultaneous residuals and Jacobian -If, in addition to `f!` and `g!`, you have a function `fg!(x::Vector, -fx::Vector, gx::Array)` that computes both the residual and the Jacobian at -the same time, you can use the following syntax: +If, in addition to `f!` and `g!`, you have a function `fg!(x::AbstractVector, +fx::AbstractVector, gx::AbstractMatrix)` or `fg!(x::AbstractArray, +fx::AbstractArray, gx::AbstractMatrix)` that computes both the residual and the +Jacobian at the same time, you can use the following syntax for vectors ```jl df = DifferentiableMultivariateFunction(f!, g!, fg!) nlsolve(df, initial_x) ``` +and similarly for arbitrary arrays + +```jl +df = DifferentiableMultivariateFunction(f!, g!, fg!, initial_x) +nlsolve(df, initial_x) +``` + If the function `fg!` uses some optimization that make it costless than calling `f!` and `g!` successively, then this syntax can possibly improve the performance. @@ -147,45 +176,89 @@ cases are not optimal in terms of memory management. If only `f!` and `fg!` are available, the helper function `only_f!_and_fg!` can be used to construct a `DifferentiableMultivariateFunction` object, that can be -used as first argument of `nlsolve`. The complete syntax is therefore: +used as first argument of `nlsolve`. The complete syntax is therefore ```jl nlsolve(only_f!_and_fg!(f!, fg!), initial_x) ``` +if `f!` and `fg!` operate on vectors, and otherwise + +```jl +nlsolve(only_f!_and_fg!(f!, fg!, initial_x), initial_x) +``` + If only `fg!` is available, the helper function `only_fg!` can be used to construct a `DifferentiableMultivariateFunction` object, that can be used as -first argument of `nlsolve`. The complete syntax is therefore: +first argument of `nlsolve`. The complete syntax is therefore ```jl nlsolve(only_fg!(fg!), initial_x) ``` +if `f!` and `fg!` operate on vectors, and otherwise + +```jl +nlsolve(only_fg!(f!, fg!, initial_x), initial_x) +``` + ## With functions returning residuals and Jacobian as output -Here it is assumed that you have a function `f(x::Vector)` that returns a -newly-allocated vector containing the residuals. The helper function -`not_in_place` can be used to construct a `DifferentiableMultivariateFunction` -object, that can be used as first argument of `nlsolve`. The complete syntax is -therefore: +Here it is assumed that you have a function `f(x::AbstractVector)` that returns +a newly-allocated vector containing the residuals. The helper function +`not_in_place` can be used to construct a function, that can be used as first +argument of `nlsolve`. The complete syntax is therefore: ```jl nlsolve(not_in_place(f), initial_x) ``` -Finite-differencing is used to compute the Jacobian in that case. +If `f` does not operate on vectors but on arbitrary arrays the syntax is -If, in addition, there is a function `g(x::Vector)` returning a newly-allocated -matrix containing the Jacobian, ```jl -nlsolve(not_in_place(f), not_in_place(g), initial_x) +nlsolve(not_in_place(f, initial_x), initial_x) +``` + +Via the `autodiff` keyword both finite-differencing and autodifferentiation can +be used to compute the Jacobian in that case. + +If, in addition to `f(x::AbstractVector)`, there is a function +`g(x::AbstractVector)` returning a newly-allocated matrix containing the +Jacobian, `not_in_place` can be used to construct an object of type +`DifferentiableMultivariateFunction` that can be used as first argument of +`nlsolve`: + +```jl +nlsolve(not_in_place(f, g), initial_x) +``` + +For functions `f` and `g` that operate on arbitrary arrays the syntax is: + +```jl +nlsolve(not_in_place(f, g, initial_x), initial_x) +``` + +If, in addition to `f` and `g`, there is a function `fg` returning a tuple of a +newly-allocated vector of residuals and a newly-allocated matrix of the +Jacobian, `not_in_place` can be used to construct an object of type +`DifferentiableMultivariateFunction`: + +```jl +nlsolve(not_in_place(f, g, fg), initial_x) +``` + +For functions `f`, `g`, and `fg` that operate on arbitrary arrays the syntax is: + +```jl +nlsolve(not_in_place(f, g, fg, initial_x), initial_x) ``` ## With functions taking several scalar arguments If you have a function `f(x::Float64, y::Float64, ...)` that takes the point of interest as several scalars and returns a vector or a tuple containing the -residuals, you can use the helper function `n_ary`. The complete syntax is therefore: +residuals, you can use the helper function `n_ary`. The complete syntax is +therefore: ```jl nlsolve(n_ary(f), initial_x) @@ -198,13 +271,20 @@ Finite-differencing is used to compute the Jacobian. If the Jacobian of your function is sparse, it is possible to ask the routines to manipulate sparse matrices instead of full ones, in order to increase performance on large systems. This can be achieved by constructing an object of -type `DifferentiableSparseMultivariateFunction`: +type `DifferentiableSparseMultivariateFunction`. The syntax is therefore ```jl df = DifferentiableSparseMultivariateFunction(f!, g!) nlsolve(df, initial_x) ``` +if `f!` and `g!` operate on vectors, and otherwise + +```jl +df = DifferentiableSparseMultivariateFunction(f!, g!, initial_x) +nlsolve(df, initial_x) +``` + It is possible to give an optional third function `fg!` to the constructor, as for the full Jacobian case. @@ -226,14 +306,24 @@ empty!(gx.rowval) empty!(gx.nzval) ``` -Another solution is to directly pass a Jacobian matrix with a given sparsity. To do so, construct an object of type `DifferentiableGivenSparseMultivariateFunction` +Another solution is to directly pass a Jacobian matrix with a given sparsity. To +do so, construct an object of type +`DifferentiableGivenSparseMultivariateFunction` by ```jl df = DifferentiableGivenSparseMultivariateFunction(f!, g!, J) nlsolve(df, initial_x) ``` -If `g!` conserves the sparsity structure of `gx`, `gx` will always have the same sparsity as `J`. This sometimes allow to write a faster version of `g!`. +if `f!` and `g!` operate on vectors, and otherwise + +```jl +df = DifferentiableGivenSparseMultivariateFunction(f!, g!, J, initial_x) +nlsolve(df, initial_x) +``` + +If `g!` conserves the sparsity structure of `gx`, `gx` will always have the same +sparsity as `J`. This sometimes allow to write a faster version of `g!`. # Fine tunings diff --git a/REQUIRE b/REQUIRE index 4cd0a56..0dabb98 100644 --- a/REQUIRE +++ b/REQUIRE @@ -2,6 +2,6 @@ julia 0.6.0-pre Calculus Distances ForwardDiff 0.5.0 -LineSearches +LineSearches 2.1.0 DiffBase NLSolversBase 2.1.0 diff --git a/src/anderson.jl b/src/anderson.jl index d4b950d..99a6b37 100644 --- a/src/anderson.jl +++ b/src/anderson.jl @@ -2,15 +2,15 @@ # Attempts to accelerate the iteration xn+1 = xn + β f(x) @views function anderson_{T}(df::AbstractDifferentiableMultivariateFunction, - x0::Vector{T}, - xtol::T, - ftol::T, - iterations::Integer, - store_trace::Bool, - show_trace::Bool, - extended_trace::Bool, - m::Integer, - β :: Real) + x0::AbstractArray{T}, + xtol::T, + ftol::T, + iterations::Integer, + store_trace::Bool, + show_trace::Bool, + extended_trace::Bool, + m::Integer, + β::Real) f_calls = 0 N = length(x0) @@ -18,8 +18,8 @@ gs = zeros(T,N,m+1) #ring buffer storing the g of the iterates, from newest to oldest residuals = zeros(T, N, m) #matrix of residuals used for the least-squares problem alphas = zeros(T, m) #coefficients obtained by least-squares - fx = similar(x0) # temp variable to store f! - xs[:,1] = x0 + fx = similar(x0, N) # temp variable to store f! + xs[:,1] = vec(x0) errs = zeros(iterations) n = 1 tr = SolverTrace() @@ -36,7 +36,7 @@ # FIXME: How should this flag be set? mayterminate = false - + if tracing dt = Dict() if extended_trace @@ -49,9 +49,9 @@ n > 1 ? sqeuclidean(xs[:,1],old_x) : convert(T,NaN), dt, store_trace, - show_trace) + show_trace) end - + if converged break end @@ -76,14 +76,16 @@ # returning gs[:,1] rather than xs[:,1] would be better here if # xn+1 = xn+beta*f(xn) is convergent, but the convergence # criterion is not guaranteed + x = similar(x0) + copy!(x, xs[:,1]) return SolverResults("Anderson m=$m β=$β", - x0, copy(xs[:,1]), maximum(abs,fx), + x0, x, maximum(abs,fx), n, x_converged, xtol, f_converged, ftol, tr, f_calls, 0) end function anderson{T}(df::AbstractDifferentiableMultivariateFunction, - initial_x::Vector{T}, + initial_x::AbstractArray{T}, xtol::Real, ftol::Real, iterations::Integer, diff --git a/src/autodiff.jl b/src/autodiff.jl index fad4477..fec4dbc 100644 --- a/src/autodiff.jl +++ b/src/autodiff.jl @@ -1,16 +1,19 @@ -function autodiff(f!, initial_x::Vector) +function autodiff(f!, initial_x::AbstractArray) - permf! = (fx, x) -> f!(x, fx) + fvec! = reshape_f(f!, initial_x) + permf! = (fx::AbstractVector, x::AbstractVector) -> fvec!(x, fx) - fx2 = copy(initial_x) - jac_cfg = ForwardDiff.JacobianConfig(nothing, initial_x, initial_x) - g! = (x, gx) -> ForwardDiff.jacobian!(gx, permf!, fx2, x, jac_cfg) + fx2 = vec(copy(initial_x)) + jac_cfg = ForwardDiff.JacobianConfig(nothing, vec(initial_x), vec(initial_x)) + function g!(x::AbstractVector, gx::AbstractMatrix) + ForwardDiff.jacobian!(gx, permf!, fx2, x, jac_cfg) + end - fg! = (x, fx, gx) -> begin + fg! = (x::AbstractVector, fx::AbstractVector, gx::AbstractMatrix) -> begin jac_res = DiffBase.DiffResult(fx, gx) ForwardDiff.jacobian!(jac_res, permf!, fx2, x, jac_cfg) DiffBase.value(jac_res) end - return DifferentiableMultivariateFunction(f!, g!, fg!) + return DifferentiableMultivariateFunction(fvec!, g!, fg!) end diff --git a/src/differentiable_functions.jl b/src/differentiable_functions.jl index 2bc1459..a176a36 100644 --- a/src/differentiable_functions.jl +++ b/src/differentiable_functions.jl @@ -1,91 +1,144 @@ abstract type AbstractDifferentiableMultivariateFunction end -struct DifferentiableMultivariateFunction <: AbstractDifferentiableMultivariateFunction - f!::Function - g!::Function - fg!::Function +struct DifferentiableMultivariateFunction{F1,F2,F3} <: AbstractDifferentiableMultivariateFunction + f!::F1 + g!::F2 + fg!::F3 end alloc_jacobian(df::DifferentiableMultivariateFunction, T::Type, n::Integer) = Array{T}(n, n) -function DifferentiableMultivariateFunction(f!::Function, g!::Function) - function fg!(x::Vector, fx::Vector, gx::Array) +function DifferentiableMultivariateFunction(f!::F1, g!::F2, + initial_x::AbstractArray) where {F1,F2} + return DifferentiableMultivariateFunction(reshape_f(f!, initial_x), + reshape_g(g!, initial_x)) +end + +function DifferentiableMultivariateFunction(f!, g!, fg!, initial_x::AbstractArray) + return DifferentiableMultivariateFunction(reshape_f(f!, initial_x), + reshape_g(g!, initial_x), + reshape_fg(fg!, initial_x)) +end + +function DifferentiableMultivariateFunction(f!, g!) + function fg!(x::AbstractVector, fx::AbstractVector, gx::AbstractMatrix) f!(x, fx) g!(x, gx) end return DifferentiableMultivariateFunction(f!, g!, fg!) end -function DifferentiableMultivariateFunction(f!::Function; dtype::Symbol=:central) - function fg!(x::Vector, fx::Vector, gx::Array) +function DifferentiableMultivariateFunction(f!; dtype::Symbol=:central) + function fg!(x::AbstractVector, fx::AbstractVector, gx::AbstractMatrix) f!(x, fx) - function f(x::Vector) + function f(x::AbstractVector) fx = similar(x) f!(x, fx) return fx end finite_difference_jacobian!(f, x, fx, gx, dtype) end - function g!(x::Vector, gx::Array) + function g!(x::AbstractVector, gx::AbstractMatrix) fx = similar(x) fg!(x, fx, gx) end return DifferentiableMultivariateFunction(f!, g!, fg!) end +function DifferentiableMultivariateFunction(f!, initial_x::AbstractArray; + dtype::Symbol=:central) + return DifferentiableMultivariateFunction(reshape_f(f!, initial_x); dtype=dtype) +end + # Helper for the case where only f! and fg! are available -function only_f!_and_fg!(f!::Function, fg!::Function) - function g!(x::Vector, gx::Array) +function only_f!_and_fg!(f!, fg!) + function g!(x::AbstractVector, gx::AbstractMatrix) fx = similar(x) fg!(x, fx, gx) end return DifferentiableMultivariateFunction(f!, g!, fg!) end +function only_f!_and_fg!(f!, fg!, initial_x::AbstractArray) + return only_f!_and_fg!(reshape_f(f!, initial_x), + reshape_fg(fg!, initial_x)) +end + # Helper for the case where only fg! is available -function only_fg!(fg!::Function) - function f!(x::Vector, fx::Vector) +function only_fg!(fg!) + function f!(x::AbstractVector, fx::AbstractVector) gx = Array{eltype(x)}(length(x), length(x)) fg!(x, fx, gx) end - function g!(x::Vector, gx::Array) + function g!(x::AbstractVector, gx::AbstractMatrix) fx = similar(x) fg!(x, fx, gx) end return DifferentiableMultivariateFunction(f!, g!, fg!) end +function only_fg!(fg!, initial_x::AbstractArray) + return only_fg!(reshape_fg(fg!, initial_x)) +end + # For sparse Jacobians -struct DifferentiableSparseMultivariateFunction <: AbstractDifferentiableMultivariateFunction - f!::Function - g!::Function - fg!::Function +struct DifferentiableSparseMultivariateFunction{F1,F2,F3} <: AbstractDifferentiableMultivariateFunction + f!::F1 + g!::F2 + fg!::F3 end alloc_jacobian(df::DifferentiableSparseMultivariateFunction, T::Type, n::Integer) = spzeros(T, n, n) -function DifferentiableSparseMultivariateFunction(f!::Function, g!::Function) - function fg!(x::Vector, fx::Vector, gx::SparseMatrixCSC) +function DifferentiableSparseMultivariateFunction(f!::F1, g!::F2, + initial_x::AbstractArray) where {F1,F2} + return DifferentiableSparseMultivariateFunction(reshape_f(f!, initial_x), + reshape_g_sparse(g!, initial_x)) +end + +function DifferentiableSparseMultivariateFunction(f!, g!, fg!, initial_x::AbstractArray) + return DifferentiableSparseMultivariateFunction(reshape_f(f!, initial_x), + reshape_g_sparse(g!, initial_x), + reshape_fg_sparse(fg!, initial_x)) +end + +function DifferentiableSparseMultivariateFunction(f!, g!) + function fg!(x::AbstractVector, fx::AbstractVector, gx::SparseMatrixCSC) f!(x, fx) g!(x, gx) end return DifferentiableSparseMultivariateFunction(f!, g!, fg!) end - -struct DifferentiableGivenSparseMultivariateFunction{Tv, Ti} <: AbstractDifferentiableMultivariateFunction - f!::Function - g!::Function - fg!::Function +struct DifferentiableGivenSparseMultivariateFunction{F1,F2,F3,Tv,Ti} <: AbstractDifferentiableMultivariateFunction + f!::F1 + g!::F2 + fg!::F3 J::SparseMatrixCSC{Tv, Ti} end alloc_jacobian(df::DifferentiableGivenSparseMultivariateFunction, args...) = deepcopy(df.J) -function DifferentiableGivenSparseMultivariateFunction(f!::Function, g!::Function, J::SparseMatrixCSC) - function fg!(x::Vector, fx::Vector, gx::SparseMatrixCSC) +function DifferentiableGivenSparseMultivariateFunction(f!, g!, fg!, J::SparseMatrixCSC, + initial_x::AbstractArray) + return DifferentiableGivenSparseMultivariateFunction(reshape_f(f!, initial_x), + reshape_g_sparse(g!, initial_x), + reshape_fg_sparse(fg!, initial_x), + J) +end + +function DifferentiableGivenSparseMultivariateFunction(f!::F1, g!::F2, + J::SparseMatrixCSC) where {F1,F2} + function fg!(x::AbstractVector, fx::AbstractVector, gx::SparseMatrixCSC) f!(x, fx) g!(x, gx) end DifferentiableGivenSparseMultivariateFunction(f!, g!, fg!, J) end + +function DifferentiableGivenSparseMultivariateFunction(f!, g!, J::SparseMatrixCSC, + initial_x::AbstractArray) + return DifferentiableGivenSparseMultivariateFunction(reshape_f(f!, initial_x), + reshape_g_sparse(g!, initial_x), + J) +end diff --git a/src/mcp.jl b/src/mcp.jl index fc31ab2..cd4f9cd 100644 --- a/src/mcp.jl +++ b/src/mcp.jl @@ -13,7 +13,7 @@ function mcp_smooth(df::AbstractDifferentiableMultivariateFunction, lower::Vector, upper::Vector) - function f!(x::Vector, fx::Vector) + function f!(x::AbstractVector, fx::AbstractVector) df.f!(x, fx) for i = 1:length(x) if isfinite.(upper[i]) @@ -25,7 +25,7 @@ function mcp_smooth(df::AbstractDifferentiableMultivariateFunction, end end - function g!(x::Vector, gx::AbstractArray) + function g!(x::AbstractVector, gx::AbstractMatrix) fx = similar(x) df.fg!(x, fx, gx) @@ -73,9 +73,11 @@ function mcp_smooth(df::AbstractDifferentiableMultivariateFunction, end end - dfT = typeof(df) - - return dfT(f!, g!) + if typeof(df) <: DifferentiableSparseMultivariateFunction + return DifferentiableSparseMultivariateFunction(f!, g!) + else + return DifferentiableMultivariateFunction(f!, g!) + end end # Generate a function whose roots are the solutions of the MCP. @@ -88,7 +90,7 @@ end function mcp_minmax(df::AbstractDifferentiableMultivariateFunction, lower::Vector, upper::Vector) - function f!(x::Vector, fx::Vector) + function f!(x::AbstractVector, fx::AbstractVector) df.f!(x, fx) for i = 1:length(x) if fx[i] < x[i]-upper[i] @@ -100,7 +102,7 @@ function mcp_minmax(df::AbstractDifferentiableMultivariateFunction, end end - function g!(x::Vector, gx::AbstractArray) + function g!(x::AbstractVector, gx::AbstractMatrix) fx = similar(x) df.fg!(x, fx, gx) for i = 1:length(x) @@ -112,7 +114,9 @@ function mcp_minmax(df::AbstractDifferentiableMultivariateFunction, end end - dfT = typeof(df) - - return dfT(f!, g!) + if typeof(df) <: DifferentiableSparseMultivariateFunction + return DifferentiableSparseMultivariateFunction(f!, g!) + else + return DifferentiableMultivariateFunction(f!, g!) + end end diff --git a/src/mcp_func_defs.jl b/src/mcp_func_defs.jl index 59323be..92666ea 100644 --- a/src/mcp_func_defs.jl +++ b/src/mcp_func_defs.jl @@ -15,7 +15,7 @@ end function mcpsolve{T}(df::AbstractDifferentiableMultivariateFunction, lower::Vector, upper::Vector, - initial_x::Vector{T}; + initial_x::AbstractArray{T}; method::Symbol = :trust_region, reformulation::Symbol = :smooth, xtol::Real = zero(T), @@ -24,7 +24,7 @@ function mcpsolve{T}(df::AbstractDifferentiableMultivariateFunction, store_trace::Bool = false, show_trace::Bool = false, extended_trace::Bool = false, - linesearch!::Function = LineSearches.backtracking!, + linesearch! = LineSearches.BackTracking(), factor::Real = one(T), autoscale::Bool = true) @reformulate df @@ -35,11 +35,11 @@ function mcpsolve{T}(df::AbstractDifferentiableMultivariateFunction, linesearch! = linesearch!, factor = factor, autoscale = autoscale) end -function mcpsolve{T}(f!::Function, - g!::Function, +function mcpsolve{T}(f!, + g!, lower::Vector, upper::Vector, - initial_x::Vector{T}; + initial_x::AbstractArray{T}; method::Symbol = :trust_region, reformulation::Symbol = :smooth, xtol::Real = zero(T), @@ -48,10 +48,10 @@ function mcpsolve{T}(f!::Function, store_trace::Bool = false, show_trace::Bool = false, extended_trace::Bool = false, - linesearch!::Function = LineSearches.backtracking!, + linesearch! = LineSearches.BackTracking(), factor::Real = one(T), autoscale::Bool = true) - @reformulate DifferentiableMultivariateFunction(f!, g!) + @reformulate DifferentiableMultivariateFunction(f!, g!, initial_x) nlsolve(rf, initial_x, method = method, xtol = xtol, ftol = ftol, iterations = iterations, store_trace = store_trace, @@ -59,10 +59,10 @@ function mcpsolve{T}(f!::Function, linesearch! = linesearch!, factor = factor, autoscale = autoscale) end -function mcpsolve{T}(f!::Function, +function mcpsolve{T}(f!, lower::Vector, upper::Vector, - initial_x::Vector{T}; + initial_x::AbstractArray{T}; method::Symbol = :trust_region, reformulation::Symbol = :smooth, xtol::Real = zero(T), @@ -71,12 +71,12 @@ function mcpsolve{T}(f!::Function, store_trace::Bool = false, show_trace::Bool = false, extended_trace::Bool = false, - linesearch!::Function = LineSearches.backtracking!, + linesearch! = LineSearches.BackTracking(), factor::Real = one(T), autoscale::Bool = true, autodiff::Bool = false) if !autodiff - df = DifferentiableMultivariateFunction(f!) + df = DifferentiableMultivariateFunction(f!, initial_x) else df = NLsolve.autodiff(f!, initial_x) end diff --git a/src/newton.jl b/src/newton.jl index cf8ed89..1cad6cc 100644 --- a/src/newton.jl +++ b/src/newton.jl @@ -27,16 +27,16 @@ macro newtontrace(stepnorm) end function newton_{T}(df::AbstractDifferentiableMultivariateFunction, - initial_x::Vector{T}, - xtol::T, - ftol::T, - iterations::Integer, - store_trace::Bool, - show_trace::Bool, - extended_trace::Bool, - linesearch!::Function) - - x = copy(initial_x) + initial_x::AbstractArray{T}, + xtol::T, + ftol::T, + iterations::Integer, + store_trace::Bool, + show_trace::Bool, + extended_trace::Bool, + linesearch!) + + x = vec(copy(initial_x)) nn = length(x) xold = fill(convert(T, NaN), nn) fvec = Array{T}(nn) @@ -71,7 +71,7 @@ function newton_{T}(df::AbstractDifferentiableMultivariateFunction, # Create objective function for the linesearch. # This function is defined as fo(x) = 0.5 * f(x) ⋅ f(x) and thus # has the gradient ∇fo(x) = ∇f(x) ⋅ f(x) - function fo(xlin::Vector) + function fo(xlin::AbstractVector) if xlin != xold df.f!(xlin, fvec) f_calls += 1 @@ -84,7 +84,7 @@ function newton_{T}(df::AbstractDifferentiableMultivariateFunction, # is expensive to recompute. # We solve this using the already computed ∇f(xₖ) # in case of the line search asking us for the gradient at xₖ. - function go!(storage::Vector, xlin::Vector) + function go!(storage::AbstractVector, xlin::AbstractVector) if xlin == xold At_mul_B!(storage, fjac, fvec) # Else we need to recompute it. @@ -95,12 +95,12 @@ function newton_{T}(df::AbstractDifferentiableMultivariateFunction, At_mul_B!(storage, fjac, fvec) end end - function fgo!(storage::Vector, xlin::Vector) + function fgo!(storage::AbstractVector, xlin::AbstractVector) go!(storage, xlin) dot(fvec, fvec) / 2 end - dfo = OnceDifferentiable(fo, go!, fgo!, initial_x) + dfo = OnceDifferentiable(fo, go!, fgo!, x) while !converged && it < iterations @@ -142,19 +142,19 @@ function newton_{T}(df::AbstractDifferentiableMultivariateFunction, end return SolverResults("Newton with line-search", - initial_x, x, norm(fvec, Inf), + initial_x, reshape(x, size(initial_x)...), norm(fvec, Inf), it, x_converged, xtol, f_converged, ftol, tr, f_calls, g_calls) end function newton{T}(df::AbstractDifferentiableMultivariateFunction, - initial_x::Vector{T}, + initial_x::AbstractArray{T}, xtol::Real, ftol::Real, iterations::Integer, store_trace::Bool, show_trace::Bool, extended_trace::Bool, - linesearch!::Function) + linesearch!) newton_(df, initial_x, convert(T, xtol), convert(T, ftol), iterations, store_trace, show_trace, extended_trace, linesearch!) end diff --git a/src/nlsolve_func_defs.jl b/src/nlsolve_func_defs.jl index cfbf18c..1d35069 100644 --- a/src/nlsolve_func_defs.jl +++ b/src/nlsolve_func_defs.jl @@ -1,5 +1,5 @@ function nlsolve{T}(df::AbstractDifferentiableMultivariateFunction, - initial_x::Vector{T}; + initial_x::AbstractArray{T}; method::Symbol = :trust_region, xtol::Real = zero(T), ftol::Real = convert(T,1e-8), @@ -7,7 +7,7 @@ function nlsolve{T}(df::AbstractDifferentiableMultivariateFunction, store_trace::Bool = false, show_trace::Bool = false, extended_trace::Bool = false, - linesearch!::Function = no_linesearch!, + linesearch! = no_linesearch!, factor::Real = one(T), autoscale::Bool = true, m::Integer = 0, @@ -34,9 +34,9 @@ function nlsolve{T}(df::AbstractDifferentiableMultivariateFunction, end end -function nlsolve{T}(f!::Function, - g!::Function, - initial_x::Vector{T}; +function nlsolve{T}(f!, + g!, + initial_x::AbstractArray{T}; method::Symbol = :trust_region, xtol::Real = zero(T), ftol::Real = convert(T, 1e-8), @@ -44,12 +44,12 @@ function nlsolve{T}(f!::Function, store_trace::Bool = false, show_trace::Bool = false, extended_trace::Bool = false, - linesearch!::Function = no_linesearch!, + linesearch! = no_linesearch!, factor::Real = one(T), autoscale::Bool = true, m::Integer = 0, beta::Real = 1.0) - nlsolve(DifferentiableMultivariateFunction(f!, g!), + nlsolve(DifferentiableMultivariateFunction(f!, g!, initial_x), initial_x, method = method, xtol = xtol, ftol = ftol, iterations = iterations, store_trace = store_trace, show_trace = show_trace, extended_trace = extended_trace, @@ -57,8 +57,8 @@ function nlsolve{T}(f!::Function, m = m, beta = beta) end -function nlsolve{T}(f!::Function, - initial_x::Vector{T}; +function nlsolve{T}(f!, + initial_x::AbstractArray{T}; method::Symbol = :trust_region, xtol::Real = zero(T), ftol::Real = convert(T,1e-8), @@ -66,14 +66,14 @@ function nlsolve{T}(f!::Function, store_trace::Bool = false, show_trace::Bool = false, extended_trace::Bool = false, - linesearch!::Function = no_linesearch!, + linesearch! = no_linesearch!, factor::Real = one(T), autoscale::Bool = true, m::Integer = 0, beta::Real = 1.0, autodiff::Bool = false) if !autodiff - df = DifferentiableMultivariateFunction(f!) + df = DifferentiableMultivariateFunction(f!, initial_x) else df = NLsolve.autodiff(f!, initial_x) end diff --git a/src/solver_state_results.jl b/src/solver_state_results.jl index 0a3a54b..7578d97 100644 --- a/src/solver_state_results.jl +++ b/src/solver_state_results.jl @@ -65,10 +65,10 @@ function update!(tr::SolverTrace, return end -mutable struct SolverResults{T} +mutable struct SolverResults{T,I<:AbstractArray{T},Z<:AbstractArray{T}} method::String - initial_x::Vector{T} - zero::Vector{T} + initial_x::I + zero::Z residual_norm::T iterations::Int x_converged::Bool diff --git a/src/trust_region.jl b/src/trust_region.jl index 4762f29..2eddf4d 100644 --- a/src/trust_region.jl +++ b/src/trust_region.jl @@ -20,7 +20,8 @@ macro trustregiontrace(stepnorm) end) end -function dogleg!{T}(p::Vector{T}, r::Vector{T}, d::Vector{T}, J::AbstractMatrix{T}, delta::Real) +function dogleg!{T}(p::AbstractVector{T}, r::AbstractVector{T}, d::AbstractVector{T}, + J::AbstractMatrix{T}, delta::Real) local p_i try p_i = J \ r # Gauss-Newton step @@ -79,17 +80,17 @@ function dogleg!{T}(p::Vector{T}, r::Vector{T}, d::Vector{T}, J::AbstractMatrix{ end function trust_region_{T}(df::AbstractDifferentiableMultivariateFunction, - initial_x::Vector{T}, - xtol::T, - ftol::T, - iterations::Integer, - store_trace::Bool, - show_trace::Bool, - extended_trace::Bool, - factor::T, - autoscale::Bool) - - x = copy(initial_x) # Current point + initial_x::AbstractArray{T}, + xtol::T, + ftol::T, + iterations::Integer, + store_trace::Bool, + show_trace::Bool, + extended_trace::Bool, + factor::T, + autoscale::Bool) + + x = vec(copy(initial_x)) # Current point nn = length(x) xold = fill(convert(T, NaN), nn) # Old point r = similar(x) # Current residual @@ -188,13 +189,13 @@ function trust_region_{T}(df::AbstractDifferentiableMultivariateFunction, name *= " and autoscaling" end return SolverResults(name, - initial_x, x, norm(r, Inf), + initial_x, reshape(x, size(initial_x)...), norm(r, Inf), it, x_converged, xtol, f_converged, ftol, tr, f_calls, g_calls) end function trust_region{T}(df::AbstractDifferentiableMultivariateFunction, - initial_x::Vector{T}, + initial_x::AbstractArray{T}, xtol::Real, ftol::Real, iterations::Integer, diff --git a/src/utils.jl b/src/utils.jl index 96e5497..5ea31be 100644 --- a/src/utils.jl +++ b/src/utils.jl @@ -29,7 +29,7 @@ function assess_convergence(x::AbstractArray, return x_converged, f_converged, converged end -function check_isfinite(x::Vector) +function check_isfinite(x::AbstractVector) i = find((!).(isfinite.(x))) if !isempty(i) throw(IsFiniteException(i)) @@ -37,24 +37,97 @@ function check_isfinite(x::Vector) end # Helpers for functions that do not modify arguments in place but return -function not_in_place(f::Function) - f!(x::Vector, y::AbstractArray) = copy!(y, f(x)) +function not_in_place(f) + function f!(x::AbstractVector, fx::AbstractVector) + copy!(fx, f(x)) + end +end + +function not_in_place(f, initial_x::AbstractArray) + function fvec!(x::AbstractVector, fx::AbstractVector) + copy!(reshape(fx, size(initial_x)...), f(reshape(x, size(initial_x)...))) + end +end +function not_in_place(f, g) + DifferentiableMultivariateFunction(not_in_place(f), not_in_place_g(g)) +end + +function not_in_place(f, g, initial_x::AbstractArray) + DifferentiableMultivariateFunction(not_in_place(f, initial_x), + not_in_place_g(g, initial_x)) +end + +function not_in_place(f, g, fg) + DifferentiableMultivariateFunction(not_in_place(f), not_in_place_g(g), + not_in_place_fg(fg)) end -function not_in_place(f::Function, g::Function) - DifferentiableMultivariateFunction(not_in_place(f), not_in_place(g)) +function not_in_place(f, g, fg, initial_x::AbstractArray) + DifferentiableMultivariateFunction(not_in_place(f, initial_x), + not_in_place_g(g, initial_x), + not_in_place_fg(fg, initial_x)) end -function not_in_place(f::Function, g::Function, fg::Function) - function fg!(x::Vector, fx::Vector, gx::Array) +function not_in_place_g(g) + function g!(x::AbstractVector, gx::AbstractMatrix) + copy!(gx, g(x)) + end +end + +function not_in_place_g(g, initial_x::AbstractArray) + function g!(x::AbstractVector, gx::AbstractMatrix) + copy!(gx, g(reshape(x, size(initial_x)...))) + end +end + +function not_in_place_fg(fg) + function fg!(x::AbstractVector, fx::AbstractVector, gx::AbstractMatrix) (fvec, fjac) = fg(x) copy!(fx, fvec) copy!(gx, fjac) end - DifferentiableMultivariateFunction(not_in_place(f), not_in_place(g), fg!) +end + +function not_in_place_fg(fg, initial_x::AbstractArray) + function fg!(x::AbstractVector, fx::AbstractVector, gx::AbstractMatrix) + (fvec, fjac) = fg(reshape(x, size(initial_x)...)) + copy!(reshape(fx, size(initial_x)...), fvec) + copy!(gx, fjac) + end end # Helper for functions that take several scalar arguments and return a tuple -function n_ary(f::Function) +function n_ary(f) f!(x::Vector, fx::AbstractArray) = copy!(fx, [f(x...)... ]) end + +# Helpers for reshaping functions on arbitrary arrays to functions on vectors +function reshape_f(f!, initial_x::AbstractArray) + function fvec!(x::AbstractVector, fx::AbstractVector) + f!(reshape(x, size(initial_x)...), reshape(fx, size(initial_x)...)) + end +end + +function reshape_g(g!, initial_x::AbstractArray) + function gvec!(x::AbstractVector, gx::AbstractMatrix) + g!(reshape(x, size(initial_x)...), gx) + end +end + +function reshape_g_sparse(g!, initial_x::AbstractArray) + function gvec!(x::AbstractVector, gx::SparseMatrixCSC) + g!(reshape(x, size(initial_x)...), gx) + end +end + +function reshape_fg(fg!, initial_x::AbstractArray) + function fgvec!(x::AbstractVector, fx::AbstractVector, gx::AbstractMatrix) + fg!(reshape(x, size(initial_x)...), reshape(fx, size(initial_x)...), gx) + end +end + +function reshape_fg_sparse(fg!, initial_x::AbstractArray) + function fgvec!(x::AbstractVector, fx::AbstractVector, gx::SparseMatrixCSC) + fg!(reshape(x, size(initial_x)...), reshape(fx, size(initial_x)...), gx) + end +end diff --git a/test/2by2.jl b/test/2by2.jl index 5259cf7..c1ab1c5 100644 --- a/test/2by2.jl +++ b/test/2by2.jl @@ -36,13 +36,19 @@ r = nlsolve(df, [ -0.5f0; 1.4f0], method = :trust_region, autoscale = false) @test norm(r.zero - [ 0; 1]) < 1e-7 # Test Newton -r = nlsolve(df, [ -0.5; 1.4], method = :newton, linesearch! = LineSearches.backtracking!, ftol = 1e-6) +r = nlsolve(df, [ -0.5; 1.4], method = :newton, linesearch! = LineSearches.BackTracking(), ftol = 1e-6) @test converged(r) @test norm(r.zero - [ 0; 1]) < 1e-6 -r = nlsolve(df, [ -0.5f0; 1.4f0], method = :newton, linesearch! = LineSearches.backtracking!, ftol = 1e-3) +r = nlsolve(df, [ -0.5f0; 1.4f0], method = :newton, linesearch! = LineSearches.BackTracking(), ftol = 1e-3) @test eltype(r.zero) == Float32 @test converged(r) @test norm(r.zero - [ 0; 1]) < 1e-6 +r = nlsolve(df, [ -0.5; 1.4], method = :newton, linesearch! = LineSearches.HagerZhang(), ftol = 1e-6) +@test converged(r) +@test norm(r.zero - [ 0; 1]) < 1e-6 +r = nlsolve(df, [ -0.5; 1.4], method = :newton, linesearch! = LineSearches.StrongWolfe(), ftol = 1e-6) +@test converged(r) +@test norm(r.zero - [ 0; 1]) < 1e-6 # test local convergence of Anderson: close to a fixed-point and with # a small beta, f should be almost affine, in which case Anderson is @@ -50,12 +56,4 @@ r = nlsolve(df, [ -0.5f0; 1.4f0], method = :newton, linesearch! = LineSearches.b r = nlsolve(df, [ 0.01; .99], method = :anderson, m = 10, beta=.01) @test converged(r) @test norm(r.zero - [ 0; 1]) < 1e-8 - -# Tests of other lineasearches are disabled, they are not stable across runs -#r = nlsolve(df, [ -0.5; 1.4], method = :newton, linesearch! = LineSearches.hz_linesearch!, ftol = 1e-6) -#@test converged(r) -#@test norm(r.zero - [ 0; 1]) < 1e-6 -#r = nlsolve(df, [ -0.5; 1.4], method = :newton, linesearch! = LineSearches.interpolating_linesearch!, ftol = 1e-6) -#@test converged(r) -#@test norm(r.zero - [ 0; 1]) < 1e-6 end diff --git a/test/REQUIRE b/test/REQUIRE new file mode 100644 index 0000000..4b0d310 --- /dev/null +++ b/test/REQUIRE @@ -0,0 +1 @@ +DiffEqBase 1.23.0 diff --git a/test/abstractarray.jl b/test/abstractarray.jl new file mode 100644 index 0000000..28e400f --- /dev/null +++ b/test/abstractarray.jl @@ -0,0 +1,65 @@ +# test use of arbitrary arrays with Wood example from MINPACK + +@testset "abstractarray" begin + +const c3 = 2e2 +const c4 = 2.02e1 +const c5 = 1.98e1 +const c6 = 1.8e2 + +function f!(x, fvec) + temp1 = x[2] - x[1]^2 + temp2 = x[4] - x[3]^2 + fvec[1] = -c3*x[1]*temp1 - (1 - x[1]) + fvec[2] = c3*temp1 + c4*(x[2] - 1) + c5*(x[4] - 1) + fvec[3] = -c6*x[3]*temp2 - (1 - x[3]) + fvec[4] = c6*temp2 + c4*(x[4] - 1) + c5*(x[2] - 1) +end + +function g!(x, fjac) + fill!(fjac, 0) + temp1 = x[2] - 3x[1]^2 + temp2 = x[4] - 3x[3]^2 + fjac[1,1] = -c3*temp1 + 1 + fjac[1,2] = -c3*x[1] + fjac[2,1] = -2*c3*x[1] + fjac[2,2] = c3 + c4 + fjac[2,4] = c5 + fjac[3,3] = -c6*temp2 + 1 + fjac[3,4] = -c6*x[3] + fjac[4,2] = c5 + fjac[4,3] = -2*c6*x[3] + fjac[4,4] = c6 + c4 +end + +initial_x_matrix = [-3. -3; -1 -1] +initial_x = vec(initial_x_matrix) +initial_x_wrapped = WrappedArray(initial_x_matrix) + +for method in (:trust_region, :newton, :anderson) + r = nlsolve(f!, g!, initial_x, method = method) + r_matrix = nlsolve(f!, g!, initial_x_matrix, method = method) + r_wrapped = nlsolve(f!, g!, initial_x_wrapped, method = method) + + @test r.zero == vec(r_matrix.zero) + @test r_matrix.zero == r_wrapped.zero + @test r.residual_norm == r_matrix.residual_norm + @test r.residual_norm == r_wrapped.residual_norm + @test typeof(r.zero) == typeof(initial_x) + @test typeof(r_matrix.zero) == typeof(initial_x_matrix) + @test typeof(r_wrapped.zero) == typeof(initial_x_wrapped) + + r_AD = nlsolve(f!, initial_x, method = method, autodiff = true) + r_matrix_AD = nlsolve(f!, initial_x_matrix, method = method, autodiff = true) + r_wrapped_AD = nlsolve(f!, initial_x_wrapped, method = method, autodiff = true) + + @test r_AD.zero == vec(r_matrix_AD.zero) + @test r_matrix_AD.zero == r_wrapped_AD.zero + @test r_AD.residual_norm == r_matrix_AD.residual_norm + @test r_AD.residual_norm == r_wrapped_AD.residual_norm + @test typeof(r_AD.zero) == typeof(initial_x) + @test typeof(r_matrix_AD.zero) == typeof(initial_x_matrix) + @test typeof(r_wrapped_AD.zero) == typeof(initial_x_wrapped) +end + +end diff --git a/test/iface.jl b/test/iface.jl index 45f35ea..b1a7c46 100644 --- a/test/iface.jl +++ b/test/iface.jl @@ -87,7 +87,7 @@ r = nlsolve(not_in_place(f), [ -0.5; 1.4]) r = nlsolve(not_in_place(f), [ -0.5; 1.4], autodiff = true) @test converged(r) -r = nlsolve(not_in_place(f), not_in_place(g), [ -0.5; 1.4]) +r = nlsolve(not_in_place(f, g), [ -0.5; 1.4]) @test converged(r) r = nlsolve(not_in_place(f, g, fg), [ -0.5; 1.4]) diff --git a/test/runtests.jl b/test/runtests.jl index cece335..b48d2ba 100644 --- a/test/runtests.jl +++ b/test/runtests.jl @@ -1,8 +1,15 @@ using NLsolve +using DiffEqBase using Base.Test add_jl(x) = endswith(x, ".jl") ? x : x*".jl" +type WrappedArray{T,N} <: DEDataArray{T,N} + x::Array{T,N} +end + +Base.reshape(A::WrappedArray, dims::Int...) = WrappedArray(reshape(A.x, dims...)) + if length(ARGS) > 0 tests = map(add_jl, ARGS) else @@ -18,7 +25,8 @@ else "sparse.jl", "throws.jl", "f_g_counts.jl", - "no_linesearch.jl"] + "no_linesearch.jl", + "abstractarray.jl"] end println("Running tests:") diff --git a/test/sparse.jl b/test/sparse.jl index efd22d9..ad206a2 100644 --- a/test/sparse.jl +++ b/test/sparse.jl @@ -24,7 +24,7 @@ r = nlsolve(df, [ -0.5; 1.4], method = :trust_region, autoscale = true) @test norm(r.zero - [ 0; 1]) < 1e-8 # Test Newton -r = nlsolve(df, [ -0.5; 1.4], method = :newton, linesearch! = LineSearches.backtracking!, ftol = 1e-6) +r = nlsolve(df, [ -0.5; 1.4], method = :newton, linesearch! = LineSearches.BackTracking(), ftol = 1e-6) @test converged(r) @test norm(r.zero - [ 0; 1]) < 1e-6