Levenberg-Marquardt is a popular alternative to the Gauss-Newton method of finding the minimum of a function $F(x)$ that is a sum of squares of nonlinear functions,

$$F(x) = \frac{1}{2}\sum_{i=1}^{m} \left[ f_i(x) \right]^2.$$

Let the Jacobian of $f_i(x)$ be denoted $J_i(x)$; then the Levenberg-Marquardt method searches in the direction given by the solution $p_k$ to the equations

$$\left( J_k^T J_k + \lambda_k I \right) p_k = -J_k^T f_k,$$

where $\lambda_k$ are nonnegative scalars and $I$ is the identity matrix. The method has the nice property that, for some scalar $\Delta$ related to $\lambda_k$, the vector $p_k$ is the solution of the constrained subproblem of minimizing $\left\| J_k p + f_k \right\|_2^2 / 2$ subject to $\left\| p \right\|_2 \le \Delta$ (Gill et al. 1981, p. 136). The method is used by the command FindMinimum[f, {x, x0}] when given the Method -> LevenbergMarquardt option.

SEE ALSO: Minimum, Optimization

REFERENCES:
Bates, D. M. and Watts, D. G. Nonlinear Regression Analysis and Its Applications. New York: Wiley, 1988.
Gill, P. E.; Murray, W.; and Wright, M. H. "The Levenberg-Marquardt Method." §4.7.3 in Practical Optimization. London: Academic Press, pp. 136-137, 1981.
Levenberg, K. "A Method for the Solution of Certain Problems in Least Squares." Quart. Appl. Math. 2, 164-168, 1944.
Marquardt, D. "An Algorithm for Least-Squares Estimation of Nonlinear Parameters." SIAM J. Appl. Math. 11, 431-441, 1963.
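As a concrete illustration of the direction equations in the entry above, the following NumPy sketch forms and solves $(J_k^T J_k + \lambda_k I) p_k = -J_k^T f_k$ for a single search direction. The Jacobian, residual vector, and damping value used here are made-up illustrative numbers, not taken from the entry.

import numpy as np

def lm_direction(J, f, lam):
    # Solve (J^T J + lam * I) p = -J^T f for the Levenberg-Marquardt direction p.
    n = J.shape[1]
    return np.linalg.solve(J.T @ J + lam * np.eye(n), -J.T @ f)

# Illustrative values: three residuals, two parameters, damping lam = 0.1.
J = np.array([[1.0, 2.0],
              [0.5, 1.5],
              [2.0, 0.3]])
f = np.array([0.2, -0.1, 0.4])
print(lm_direction(J, f, 0.1))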
Levenberg–Marquardt algorithm
From Wikipedia, the free encyclopedia
In mathematics and computing, the Levenberg–Marquardt algorithm (LMA)[1] provides a numerical solution to the problem of minimizing a function, generally nonlinear, over a space of parameters of the function. These minimization problems arise especially in least squares curve fitting and nonlinear programming.
The LMA interpolates between the Gauss–Newton algorithm (GNA) and the method of gradient descent. The LMA is more robust than the GNA, which means that in many cases it finds a solution even if it starts very far off the final minimum. For well-behaved functions and reasonable starting parameters, the LMA tends to be a bit slower than the GNA. LMA can also be viewed as Gauss–Newton using a trust region approach.
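The interpolation can be seen directly in the damped equations introduced below: with a very small damping parameter the step is close to the Gauss–Newton step, while with a very large damping parameter it is a short step along the negative gradient. A minimal sketch, using the same kind of made-up Jacobian and residual values as in the sketch above:

import numpy as np

J = np.array([[1.0, 2.0],
              [0.5, 1.5],
              [2.0, 0.3]])      # illustrative Jacobian
f = np.array([0.2, -0.1, 0.4])  # illustrative residuals

def lm_step(lam):
    # Damped step: (J^T J + lam * I) delta = -J^T f.
    return np.linalg.solve(J.T @ J + lam * np.eye(2), -J.T @ f)

print(lm_step(1e-6))        # nearly the Gauss-Newton step
print(lm_step(1e6) * 1e6)   # nearly the steepest-descent direction -J^T f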
The LMA is a very popular curve-fitting algorithm used in many software applications for solving generic curve-fitting problems. However, the LMA finds only a local minimum, not a global minimum.
Caveat Emptor
One important limitation that is very often overlooked is that the algorithm optimises only for residual errors in the dependent variable (y). It thereby implicitly assumes that any errors in the independent variable are zero, or at least that the ratio between the two is small enough to be negligible. This is not a defect; it is intentional, but it must be taken into account when deciding whether to use this technique for a fit. While the assumption may be suitable in the context of a controlled experiment, there are many situations where it cannot be made. In such situations either non-least-squares methods should be used, or the least-squares fit should weight the residuals in proportion to the relative errors in the two variables rather than using only the vertical "y" error. Failing to recognise this can lead to a fit which is significantly incorrect; it will usually underestimate the slope. This may or may not be obvious to the eye.
Microsoft Excel's chart trendline fit has this limitation, and the limitation is undocumented; users often fall into this trap by assuming the fit is correctly calculated for all situations. The OpenOffice spreadsheet copied this feature and presents the same problem.
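The effect described in this section can be demonstrated with a small simulation. The sketch below is illustrative only: the true line, noise levels, random seed, and the use of an ordinary least-squares polynomial fit are all assumptions. Because only the vertical residuals are minimised while the independent variable also carries noise, the recovered slope comes out below the true value.

import numpy as np

rng = np.random.default_rng(0)

# True line y = 2x + 1, but the *observed* x values are noisy.
true_x = np.linspace(0, 10, 200)
true_y = 2.0 * true_x + 1.0
observed_x = true_x + rng.normal(scale=1.0, size=true_x.size)   # error in x
observed_y = true_y + rng.normal(scale=0.5, size=true_y.size)   # error in y

# Ordinary least squares minimises only the vertical (y) residuals.
slope, intercept = np.polyfit(observed_x, observed_y, 1)
print(slope)   # typically noticeably below the true slope of 2.0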
The problem
The primary application of the Levenberg–Marquardt algorithm is in the least squares curve fitting problem: given a set of m empirical datum pairs of independent and dependent variables, (x_i, y_i), optimize the parameters β of the model curve f(x, β) so that the sum of the squares of the deviations

$$S(\beta) = \sum_{i=1}^{m} \left[ y_i - f(x_i, \beta) \right]^2$$

becomes minimal.
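As a minimal sketch of this objective, assume a hypothetical exponential model f(x, β) = β₀ exp(β₁ x) and some made-up data; the sum of squared deviations S(β) can then be evaluated as follows.

import numpy as np

def sum_of_squares(beta, x, y, model):
    # S(beta) = sum_i [y_i - f(x_i, beta)]^2
    residuals = y - model(x, beta)
    return residuals @ residuals

# Hypothetical model and data (assumptions for illustration only).
model = lambda x, beta: beta[0] * np.exp(beta[1] * x)
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.10, 0.62, 0.36, 0.22])
print(sum_of_squares(np.array([1.0, -0.5]), x, y, model))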
The solution
Like other numeric minimization algorithms, the Levenberg–Marquardt algorithm is an iterative procedure. To start a minimization, the user has to provide an initial guess for the parameter vector, β. In many cases, an uninformed standard guess like βT=(1,1,...,1) will work fine; in other cases, the algorithm converges only if the initial guess is already somewhat close to the final solution.
In each iteration step, the parameter vector, β, is replaced by a new estimate, β + δ. To determine δ, the functions f(x_i, β + δ) are approximated by their linearizations

$$f(x_i, \beta + \delta) \approx f(x_i, \beta) + J_i \delta,$$

where

$$J_i = \frac{\partial f(x_i, \beta)}{\partial \beta}$$

is the gradient (row-vector in this case) of f with respect to β.

At the minimum of the sum of squares, S(β), the gradient of S with respect to δ will be zero. The above first-order approximation of f(x_i, β + δ) gives

$$S(\beta + \delta) \approx \sum_{i=1}^{m} \left[ y_i - f(x_i, \beta) - J_i \delta \right]^2.$$

Or in vector notation,

$$S(\beta + \delta) \approx \left\| \mathbf{y} - \mathbf{f}(\beta) - J\delta \right\|^2.$$

Taking the derivative with respect to δ and setting the result to zero gives

$$\left( J^T J \right) \delta = J^T \left[ \mathbf{y} - \mathbf{f}(\beta) \right],$$

where J is the Jacobian matrix whose ith row equals J_i, and where $\mathbf{f}(\beta)$ and $\mathbf{y}$ are vectors with ith component f(x_i, β) and y_i, respectively. This is a set of linear equations, which can be solved for δ.

Levenberg's contribution is to replace this equation by a "damped version",

$$\left( J^T J + \lambda I \right) \delta = J^T \left[ \mathbf{y} - \mathbf{f}(\beta) \right],$$

where I is the identity matrix and the solution δ is the increment added to the estimated parameter vector β.
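The full iteration can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the forward-difference Jacobian and the simple rule of dividing λ by 10 after an accepted step and multiplying it by 10 after a rejected step are choices made here for illustration and are not prescribed by the text above.

import numpy as np

def levenberg_marquardt(model, x, y, beta, lam=1e-3, iters=50):
    # Minimal sketch of the damped iteration (J^T J + lam * I) delta = J^T [y - f(beta)].
    def residuals(b):
        return y - model(x, b)

    def jacobian(b, eps=1e-7):
        # Forward-difference approximation of the Jacobian of the model w.r.t. beta.
        J = np.empty((x.size, b.size))
        f0 = model(x, b)
        for j in range(b.size):
            bp = b.copy()
            bp[j] += eps
            J[:, j] = (model(x, bp) - f0) / eps
        return J

    S = residuals(beta) @ residuals(beta)
    for _ in range(iters):
        r = residuals(beta)
        J = jacobian(beta)
        delta = np.linalg.solve(J.T @ J + lam * np.eye(beta.size), J.T @ r)
        candidate = beta + delta
        S_new = residuals(candidate) @ residuals(candidate)
        if S_new < S:            # step accepted: keep it and relax the damping
            beta, S, lam = candidate, S_new, lam / 10
        else:                    # step rejected: increase the damping and retry
            lam *= 10
    return beta

# Usage: fit a hypothetical exponential decay to noisy made-up data.
model = lambda x, b: b[0] * np.exp(b[1] * x)
x = np.linspace(0.0, 4.0, 30)
rng = np.random.default_rng(1)
y = 2.5 * np.exp(-1.3 * x) + rng.normal(scale=0.02, size=x.size)
print(levenberg_marquardt(model, x, y, np.array([1.0, -1.0])))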