Quasi-Newton methods are a family of algorithms built on the Newton-Raphson method and line search. The search direction is determined by a matrix $\mathbf{S}_k$, which serves the same purpose as the inverse Hessian matrix in the Newton-Raphson method, so quasi-Newton methods are yet another framework of algorithms that do not require the Hessian. The quasi-Newton iteration can be written as

$$\mathbf{x}_{k+1} = \mathbf{x}_k - \alpha_k \mathbf{S}_k \mathbf{g}_k$$

When $\mathbf{S}_k = \mathbf{I}$, this is the algorithm for the steepest-descent method. When $\mathbf{S}_k = \mathbf{H}_k^{-1}$, the algorithm is the Newton-Raphson method. The iteration above converges fastest when $\mathbf{S}_k = \mathbf{H}_k^{-1}$. Quasi-Newton methods combine the advantages of the steepest-descent and the conjugate-direction methods. They are among the most efficient methods and are used extensively in many applications.
Assume that the difference in successive $\mathbf{x}_k$ is $\boldsymbol{\delta}_k = \mathbf{x}_{k+1} - \mathbf{x}_k$ and the difference in successive gradients is $\boldsymbol{\gamma}_k = \mathbf{g}_{k+1} - \mathbf{g}_k$; then we have $\boldsymbol{\gamma}_k \approx \mathbf{H}_k \boldsymbol{\delta}_k$. The Newton-Raphson method determines the direction vector as $\mathbf{d}_k = -\mathbf{H}_k^{-1}\mathbf{g}_k$. The magnitude $\alpha_k$ can be determined by a line search minimizing the function $f(\mathbf{x}_k + \alpha\mathbf{d}_k)$. Here we want to avoid inverting a matrix, that is, computing $\mathbf{H}_k^{-1}$ and checking that $\mathbf{H}_k$ is positive definite in each iteration; rather we simply want an updating rule for $\mathbf{S}_k$ such as

$$\mathbf{S}_{k+1} = \mathbf{S}_k + \mathbf{C}_k$$
Here $\mathbf{C}_k$ is a correction matrix that can be computed from available data. In any definition of $\mathbf{C}_k$, the updated matrix $\mathbf{S}_{k+1}$ must satisfy these properties:
1. $\mathbf{S}_{k+1}$ must satisfy the quasi-Newton condition $\mathbf{S}_{k+1}\boldsymbol{\gamma}_k = \boldsymbol{\delta}_k$.
2. The direction vectors $\mathbf{d}_0, \mathbf{d}_1, \ldots, \mathbf{d}_k$ must be conjugate direction vectors.
3. A positive definite $\mathbf{S}_k$ must produce a positive definite $\mathbf{S}_{k+1}$.
The second property means that the advantages of the conjugate-direction methods also apply to quasi-Newton methods. The third property ensures that the direction vector is a valid descent direction in each iteration. There are many ways to define $\mathbf{C}_k$, and they give rise to different variants of the quasi-Newton method.
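To make the framework concrete, the following is a minimal plain-Kotlin sketch of the iteration $\mathbf{x}_{k+1} = \mathbf{x}_k - \alpha_k\mathbf{S}_k\mathbf{g}_k$, not NM Dev library code: the helper matVec, the crude halving line search, and the quadratic test function are all illustrative assumptions. With the correction $\mathbf{C}_k$ left at zero, $\mathbf{S}_k$ stays the identity and the loop reduces to steepest descent; the sections below supply the actual corrections.

// A minimal sketch of the quasi-Newton framework on f(x) = x1^2 + 2*x2^2,
// using plain arrays instead of the library's matrix types.
fun matVec(S: Array<DoubleArray>, v: DoubleArray): DoubleArray =
    DoubleArray(v.size) { i -> v.indices.sumOf { j -> S[i][j] * v[j] } }

fun main() {
    val f = { x: DoubleArray -> x[0] * x[0] + 2 * x[1] * x[1] }
    val grad = { x: DoubleArray -> doubleArrayOf(2 * x[0], 4 * x[1]) }

    var x = doubleArrayOf(4.0, -2.0)
    val S = arrayOf(doubleArrayOf(1.0, 0.0), doubleArrayOf(0.0, 1.0)) // S_0 = I

    repeat(20) {
        val g = grad(x)
        if (g[0] * g[0] + g[1] * g[1] < 1e-16) return@repeat       // already at a stationary point
        val d = matVec(S, g).map { -it }.toDoubleArray()            // d_k = -S_k g_k
        var alpha = 1.0                                             // crude halving line search for alpha_k
        while (f(DoubleArray(2) { i -> x[i] + alpha * d[i] }) >= f(x)) alpha /= 2.0
        x = DoubleArray(2) { i -> x[i] + alpha * d[i] }             // x_{k+1} = x_k - alpha_k S_k g_k
        // delta_k = x_{k+1} - x_k and gamma_k = g_{k+1} - g_k would feed the correction C_k here
    }
    println("x = ${x.toList()}, f(x) = ${f(x)}")                    // converges toward the minimum at (0, 0)
}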
Rank-One Method
The rank-one method is characterized by a correction matrix $\mathbf{C}_k$ of unity rank. Now we assume that the correction has the form

$$\mathbf{C}_k = \beta_k\mathbf{z}_k\mathbf{z}_k^T$$

Let $\mathbf{S}_{k+1} = \mathbf{S}_k + \beta_k\mathbf{z}_k\mathbf{z}_k^T$ and impose the quasi-Newton condition $\mathbf{S}_{k+1}\boldsymbol{\gamma}_k = \boldsymbol{\delta}_k$. Then we have

$$\beta_k(\mathbf{z}_k^T\boldsymbol{\gamma}_k)\mathbf{z}_k = \boldsymbol{\delta}_k - \mathbf{S}_k\boldsymbol{\gamma}_k$$

so $\mathbf{z}_k$ must be proportional to $\boldsymbol{\delta}_k - \mathbf{S}_k\boldsymbol{\gamma}_k$. Therefore,

$$\mathbf{S}_{k+1} = \mathbf{S}_k + \frac{(\boldsymbol{\delta}_k - \mathbf{S}_k\boldsymbol{\gamma}_k)(\boldsymbol{\delta}_k - \mathbf{S}_k\boldsymbol{\gamma}_k)^T}{\boldsymbol{\gamma}_k^T(\boldsymbol{\delta}_k - \mathbf{S}_k\boldsymbol{\gamma}_k)}.$$
There are two problems with this method. First, a positive definite $\mathbf{S}_k$ may not give a positive definite $\mathbf{S}_{k+1}$. In such a case, the next direction is not a descent direction. Second, the denominator $\boldsymbol{\gamma}_k^T(\boldsymbol{\delta}_k - \mathbf{S}_k\boldsymbol{\gamma}_k)$ in the correction formula may approach or become equal to zero. If that happens, the method breaks down because $\mathbf{S}_{k+1}$ is undefined.
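As a concrete illustration, here is a small plain-Kotlin sketch of the rank-one correction. It is not the library's RankOneMinimizer; the function name rankOneUpdate and the eps tolerance are assumptions made for this example. The guard simply skips the update when the denominator $\boldsymbol{\gamma}_k^T(\boldsymbol{\delta}_k - \mathbf{S}_k\boldsymbol{\gamma}_k)$ is nearly zero, the breakdown case just described.

import kotlin.math.abs

// Sketch of the rank-one correction: S_{k+1} = S_k + u u^T / (gamma^T u), where u = delta - S*gamma.
fun rankOneUpdate(
    S: Array<DoubleArray>,
    delta: DoubleArray,
    gamma: DoubleArray,
    eps: Double = 1e-12
): Array<DoubleArray> {
    val n = delta.size
    val sGamma = DoubleArray(n) { i -> (0 until n).sumOf { j -> S[i][j] * gamma[j] } } // S*gamma
    val u = DoubleArray(n) { i -> delta[i] - sGamma[i] }                               // delta - S*gamma
    val denom = (0 until n).sumOf { i -> gamma[i] * u[i] }                             // gamma^T u
    if (abs(denom) < eps) return S          // denominator near zero: skip the update to avoid breakdown
    return Array(n) { i -> DoubleArray(n) { j -> S[i][j] + u[i] * u[j] / denom } }
}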
Davidon-Fletcher-Powell Method
The Davidon-Fletcher-Powell (DFP) method is similar to the rank-one method, but it has one important advantage: if the initial matrix $\mathbf{S}_0$ is positive definite, then the subsequent matrices $\mathbf{S}_k$ are always positive definite. Unlike the rank-one method, every new direction is therefore a descent direction.
The DFP formula is

$$\mathbf{S}_{k+1} = \mathbf{S}_k + \frac{\boldsymbol{\delta}_k\boldsymbol{\delta}_k^T}{\boldsymbol{\delta}_k^T\boldsymbol{\gamma}_k} - \frac{\mathbf{S}_k\boldsymbol{\gamma}_k\boldsymbol{\gamma}_k^T\mathbf{S}_k}{\boldsymbol{\gamma}_k^T\mathbf{S}_k\boldsymbol{\gamma}_k}$$

The correction is a symmetric matrix of rank two.
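For comparison, here is a plain-Kotlin sketch of the DFP update; it is not the library's DFPMinimizer, and dfpUpdate is an illustrative name. It assumes a symmetric $\mathbf{S}_k$ and the same $\boldsymbol{\delta}_k$, $\boldsymbol{\gamma}_k$ conventions as above.

// Sketch of the DFP update: S_{k+1} = S_k + delta*delta^T / (delta^T gamma)
//                                         - (S*gamma)(S*gamma)^T / (gamma^T S gamma).
fun dfpUpdate(S: Array<DoubleArray>, delta: DoubleArray, gamma: DoubleArray): Array<DoubleArray> {
    val n = delta.size
    val sGamma = DoubleArray(n) { i -> (0 until n).sumOf { j -> S[i][j] * gamma[j] } } // S*gamma
    val deltaGamma = (0 until n).sumOf { i -> delta[i] * gamma[i] }                    // delta^T gamma
    val gammaSGamma = (0 until n).sumOf { i -> gamma[i] * sGamma[i] }                  // gamma^T S gamma
    return Array(n) { i ->
        DoubleArray(n) { j ->
            S[i][j] + delta[i] * delta[j] / deltaGamma - sGamma[i] * sGamma[j] / gammaSGamma
        }
    }
}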
Huang Family Method
The Huang family is a general updating formula that includes the rank-one, DFP, BFGS, Pearson, and McCormick methods as special cases. It has the following form:

$$\mathbf{S}_{k+1} = \mathbf{S}_k + \frac{\boldsymbol{\delta}_k(\theta\boldsymbol{\delta}_k + \phi\mathbf{S}_k^T\boldsymbol{\gamma}_k)^T}{(\theta\boldsymbol{\delta}_k + \phi\mathbf{S}_k^T\boldsymbol{\gamma}_k)^T\boldsymbol{\gamma}_k} - \frac{\mathbf{S}_k\boldsymbol{\gamma}_k(\psi\boldsymbol{\delta}_k + \omega\mathbf{S}_k^T\boldsymbol{\gamma}_k)^T}{(\psi\boldsymbol{\delta}_k + \omega\mathbf{S}_k^T\boldsymbol{\gamma}_k)^T\boldsymbol{\gamma}_k}$$

where $\theta$, $\phi$, $\psi$ and $\omega$ are independent parameters.
By letting $\theta = 1$, $\phi = -1$, $\psi = 1$, $\omega = -1$, we have the rank-one formula.
By letting $\theta = 1$, $\phi = 0$, $\psi = 0$, $\omega = 1$, we have the DFP formula.
By letting $\theta = 1$, $\phi = 0$, $\psi = 1$, $\omega = 0$, we have the McCormick formula:

$$\mathbf{S}_{k+1} = \mathbf{S}_k + \frac{(\boldsymbol{\delta}_k - \mathbf{S}_k\boldsymbol{\gamma}_k)\boldsymbol{\delta}_k^T}{\boldsymbol{\delta}_k^T\boldsymbol{\gamma}_k}$$

By letting $\theta = 0$, $\phi = 1$, $\psi = 0$, $\omega = 1$, we have the Pearson formula:

$$\mathbf{S}_{k+1} = \mathbf{S}_k + \frac{(\boldsymbol{\delta}_k - \mathbf{S}_k\boldsymbol{\gamma}_k)\boldsymbol{\gamma}_k^T\mathbf{S}_k}{\boldsymbol{\gamma}_k^T\mathbf{S}_k\boldsymbol{\gamma}_k}$$

A further choice of the parameters recovers the BFGS formula.
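The sketch below implements the four-parameter update exactly as written above, on plain arrays and assuming a symmetric $\mathbf{S}_k$. The parameter names theta, phi, psi and omega follow the text; they are not claimed to match the argument order of the HuangMinimizer constructor used in the example below.

// Sketch of the Huang family update: S_{k+1} = S_k + delta*u^T/(u^T gamma) - (S*gamma)*v^T/(v^T gamma),
// where u = theta*delta + phi*S*gamma and v = psi*delta + omega*S*gamma (S assumed symmetric).
fun huangUpdate(
    S: Array<DoubleArray>, delta: DoubleArray, gamma: DoubleArray,
    theta: Double, phi: Double, psi: Double, omega: Double
): Array<DoubleArray> {
    val n = delta.size
    val sGamma = DoubleArray(n) { i -> (0 until n).sumOf { j -> S[i][j] * gamma[j] } } // S*gamma
    val u = DoubleArray(n) { i -> theta * delta[i] + phi * sGamma[i] }
    val v = DoubleArray(n) { i -> psi * delta[i] + omega * sGamma[i] }
    val uGamma = (0 until n).sumOf { i -> u[i] * gamma[i] }   // u^T gamma
    val vGamma = (0 until n).sumOf { i -> v[i] * gamma[i] }   // v^T gamma
    return Array(n) { i ->
        DoubleArray(n) { j -> S[i][j] + delta[i] * u[j] / uGamma - sGamma[i] * v[j] / vGamma }
    }
}
// e.g. huangUpdate(S, delta, gamma, 1.0, 0.0, 0.0, 1.0) reproduces the DFP update above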
The following code minimizes the Himmelblau function using different Quasi-Newton methods.
// The Himmelblau function:
// f(x) = (x1^2 + x2 - 11)^2 + (x1 + x2^2 - 7)^2
val f: RealScalarFunction = object : RealScalarFunction {
    override fun evaluate(x: Vector): Double {
        val x1 = x.get(1)
        val x2 = x.get(2)
        var result = Math.pow(x1 * x1 + x2 - 11, 2.0)
        result += Math.pow(x1 + x2 * x2 - 7, 2.0)
        return result
    }

    override fun dimensionOfDomain() = 2
    override fun dimensionOfRange() = 1
}

// the gradient of f
val g: RealVectorFunction = object : RealVectorFunction {
    override fun evaluate(x: Vector): Vector {
        val x1 = x.get(1)
        val x2 = x.get(2)
        val w1 = x1 * x1 + x2 - 11
        val w2 = x1 + x2 * x2 - 7
        val result = doubleArrayOf(
            4 * w1 * x1 + 2 * w2, // df/dx1
            2 * w1 + 4 * w2 * x2  // df/dx2
        )
        return DenseVector(result)
    }

    override fun dimensionOfDomain() = 2
    override fun dimensionOfRange() = 2
}

val problem: C2OptimProblem = C2OptimProblemImpl(f, g)

// rank-one method
val solver1: QuasiNewtonMinimizer = RankOneMinimizer(
    1e-16, // precision
    15)    // maximum number of iterations
val soln1 = solver1.solve(problem)
val xmin1 = soln1.search(DenseVector(doubleArrayOf(6.0, 6.0)))
val fmin1 = f.evaluate(xmin1)
println(String.format("f(%s) = %.16f", xmin1, fmin1))

// DFP method
val solver2: QuasiNewtonMinimizer = DFPMinimizer(
    1e-16,
    15)
val soln2 = solver2.solve(problem)
val xmin2 = soln2.search(DenseVector(doubleArrayOf(6.0, 6.0)))
val fmin2 = f.evaluate(xmin2)
println(String.format("f(%s) = %.16f", xmin2, fmin2))

// BFGS method
val solver3: QuasiNewtonMinimizer = BFGSMinimizer(
    false,
    1e-16,
    15)
val soln3 = solver3.solve(problem)
val xmin3 = soln3.search(DenseVector(doubleArrayOf(6.0, 6.0)))
val fmin3 = f.evaluate(xmin3)
println(String.format("f(%s) = %.16f", xmin3, fmin3))

// Huang family method
val solver4: QuasiNewtonMinimizer = HuangMinimizer(
    0.0,
    1.0,
    0.0,
    1.0,
    1e-16,
    15)
val soln4 = solver4.solve(problem)
val xmin4 = soln4.search(DenseVector(doubleArrayOf(6.0, 6.0)))
val fmin4 = f.evaluate(xmin4)
println(String.format("f(%s) = %.16f", xmin4, fmin4))

// Pearson method
val solver5: QuasiNewtonMinimizer = PearsonMinimizer(
    1e-16,
    15)
val soln5 = solver5.solve(problem)
val xmin5 = soln5.search(DenseVector(doubleArrayOf(6.0, 6.0)))
val fmin5 = f.evaluate(xmin5)
println(String.format("f(%s) = %.16f", xmin5, fmin5))
The output is:
f([3.000000, 2.000000]) = 0.000000000000000
f([3.000000, 2.000000]) = 0.000000000000000
f([3.000000, 2.000000]) = 0.000000000000000
f([3.000000, 2.000000]) = 0.000000000000000
f([3.000000, 2.000000]) = 0.000000000000000
Quasi-Newton methods are much more efficient than any of the conjugate-direction methods. They take fewer iterations than the conjugate-direction methods and give better accuracy and precision.