Lagrange Duality
If $(x_1,\ldots,x_n)$ denotes a point in $n$-dimensional space, then a general constrained optimization problem is of the form

$$\begin{aligned}
\text{Optimize}\quad & f(x_1,\ldots,x_n) \\
\text{subject to}\quad & g_1(x_1,\ldots,x_n) = c_1 \\
& g_2(x_1,\ldots,x_n) = c_2 \\
& \qquad\vdots \\
& g_m(x_1,\ldots,x_n) = c_m
\end{aligned}$$
A Lagrange multiplier term is added for each constraint, so that the Lagrangian of the general problem is
$$L(x_1,\ldots,x_n,\lambda_1,\ldots,\lambda_m) \;=\; f(x_1,\ldots,x_n) \;-\; \sum_{i=1}^{m}\lambda_i\bigl(g_i(x_1,\ldots,x_n) - c_i\bigr)$$
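Notice that setting all of the partial derivatives of $L$ to zero recovers the familiar Lagrange multiplier conditions:

$$\frac{\partial L}{\partial x_j} = 0 \;\Longrightarrow\; \nabla f = \sum_{i=1}^{m}\lambda_i\,\nabla g_i, \qquad \frac{\partial L}{\partial \lambda_i} = 0 \;\Longrightarrow\; g_i(x_1,\ldots,x_n) = c_i.$$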
It is also possible to use inequalities as constraints, although we will not see an example of that here.
Let's look at an example:
Example 4
In Example 4 (finding the point closest to the origin subject to the constraints $3x+4y+6 = z$ and $x^2+y^2 = z$, so that $f = x^2+y^2+z^2$), we are actually taking a leap of faith in saying that the minimum of $f$ occurs at that point, since it is only a critical point and thus not necessarily an actual minimum. We might be tempted to use the second derivative test to show that the overall Lagrangian has a minimum at the solution to the Lagrange multiplier problem. However, we would find that every solution to the Lagrange multiplier problem is a saddle point of the Lagrangian.
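To see why concretely, consider a toy problem not taken from the worksheet: minimize $x^2$ subject to $x = 1$. Its Lagrangian and its Hessian in the variables $(x,\mu)$ are

$$L(x,\mu) = x^2 - \mu(x-1), \qquad \nabla^2 L = \begin{pmatrix} 2 & -1 \\ -1 & 0 \end{pmatrix},$$

and the determinant of the Hessian is $-1 < 0$. So the unique critical point $(x,\mu) = (1,2)$ is a saddle point of $L$, even though $x = 1$ is clearly the constrained minimum.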
However, showing that a minimum does occur is possible if we use some very important theoretical tools related to the Lagrange multiplier problem. In particular, Lagrange Duality says that if we do the following:
1. Minimize the Lagrangian in terms of $(x_1,\ldots,x_n)$, treating the Lagrange multipliers as constants.
2. Substitute the solutions into the Lagrangian to yield a function of the Lagrange multipliers only (this is called the dual of the original Lagrange multiplier problem).
3. Maximize the dual problem over the Lagrange multipliers.
Then the result solves the original constrained minimization problem.
In particular, the Lagrange saddle point theorem says that the Lagrangian has a saddle point at a given point only if that point solves both the minimization of the original problem and the maximization of its dual.
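Before returning to Example 4, here is a minimal Maple sketch of the three steps applied to the toy problem above (minimize $x^2$ subject to $x = 1$); the names toyL, x_star, toy_dual, and mu_star are ours, not part of the worksheet:

> | toyL:=x^2-mu*(x-1): x_star:=solve(diff(toyL,x)=0,x); toy_dual:=subs(x=x_star,toyL); mu_star:=solve(diff(toy_dual,mu)=0,mu); subs(mu=mu_star,x_star); |

Step 1 gives $x = \mu/2$, step 2 gives the dual $-\mu^2/4 + \mu$, and step 3 gives $\mu = 2$, so the final substitution returns $x = 1$: maximizing the dual recovers the constrained minimizer.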
Let's apply this to example 4. First, recall that its Lagrangian is as shown below:
> | Lagrangian:=x^2+y^2+z^2+lambda[1]*(3*x+4*y+6-z)+lambda[2]*(x^2+y^2-z); |
In essence, we simply minimize the Lagrangian in x, y, and z while treating $\lambda_1$ and $\lambda_2$ as constants.
> | with(linalg): grad(Lagrangian,[x,y,z]); convert(%,'set'); sols_xyz:=solve(%,{x,y,z}); |
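Working the three stationarity equations out by hand as a check (this computation is ours, not output from the worksheet), the critical point is

$$x = \frac{-3\lambda_1}{2(1+\lambda_2)}, \qquad y = \frac{-2\lambda_1}{1+\lambda_2}, \qquad z = \frac{\lambda_1+\lambda_2}{2},$$

valid as long as $\lambda_2 \neq -1$.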
There is only one critical point. To see whether we obtain a minimum at that critical point, let's compute the Hessian in x, y, and z at that point:
> | hessian(Lagrangian,[x,y,z]); matsubs([sols_xyz[1],sols_xyz[2],sols_xyz[3]],%); |
Since the only nonzero entries are on the diagonal, and since those entries are positive, the Hessian at the critical point must be positive definite, thus implying a minimum. (That is, the second derivative is positive in every direction, so the Lagrangian is "concave up" in every direction, which implies a minimum.)
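For reference, since the Lagrangian is quadratic in x, y, and z, its Hessian in those variables is constant; written out by hand (our computation), it is

$$\nabla^2_{x,y,z}\,L = \begin{pmatrix} 2+2\lambda_2 & 0 & 0 \\ 0 & 2+2\lambda_2 & 0 \\ 0 & 0 & 2 \end{pmatrix},$$

which is positive definite exactly when $\lambda_2 > -1$.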
Let's now substitute the values of x, y, and z into the Lagrangian. The result can be plotted as a function of $\lambda_1$ and $\lambda_2$.
> | Lagrange_dual:=subs(sols_xyz,Lagrangian); plot3d(subs(lambda[1]=u,lambda[2]=v,Lagrange_dual),u=0..2,v=(1-sqrt(2*u-u^2))..(1+sqrt(2*u-u^2)),axes=normal); |
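Carrying out that substitution by hand (our computation, offered as a check on the Maple output), and writing $L^*$ for Lagrange_dual, the dual simplifies to

$$L^*(\lambda_1,\lambda_2) \;=\; 6\lambda_1 \;-\; \frac{25\lambda_1^2}{4(1+\lambda_2)} \;-\; \frac{(\lambda_1+\lambda_2)^2}{4}.$$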
It does appear that Lagrange_dual has a maximum. Let's use the second derivative test to locate it.
> | grad(Lagrange_dual,[lambda[1],lambda[2]]); convert(%,'set'): sols:=solve(%,{lambda[1],lambda[2]}); |
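Assuming the closed form above, the equations being solved here are

$$\frac{\partial L^*}{\partial \lambda_1} = 6 - \frac{25\lambda_1}{2(1+\lambda_2)} - \frac{\lambda_1+\lambda_2}{2} = 0, \qquad \frac{\partial L^*}{\partial \lambda_2} = \frac{25\lambda_1^2}{4(1+\lambda_2)^2} - \frac{\lambda_1+\lambda_2}{2} = 0.$$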
Now let's apply the second derivative test to each of these solutions. We begin by using the fact that in the 2-dimensional case, the discriminant

$$\text{discriminant} \;=\; \frac{\partial^2 L^*}{\partial \lambda_1^{\,2}}\,\frac{\partial^2 L^*}{\partial \lambda_2^{\,2}} \;-\; \left(\frac{\partial^2 L^*}{\partial \lambda_1\,\partial \lambda_2}\right)^{\!2}$$

is the determinant of the Hessian.
> | discriminant:=det(hessian(Lagrange_dual,[lambda[1],lambda[2]])); |
Let's begin with the first solution:
> | print("At the solution ",sols[1]); print("Discriminant = ",subs(sols[1],discriminant)); Diff(Lagrange_dual,lambda[1],lambda[1])=subs(sols[1],diff(Lagrange_dual,lambda[1],lambda[1])); |
The second derivative test implies that there is a maximum at this first solution. The Lagrange saddle point theorem then says that substituting these multiplier values into the expressions for x, y, and z used to produce the dual yields the point with the minimum square distance to the origin:
> | print("recall that ",sols_xyz); print("the substitution",sols[1],"leads to"); subs(sols[1],sols_xyz); |
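As a final sanity check (a sketch of ours, not part of the original worksheet), the recovered point can be substituted back into the two constraint expressions, both of which should simplify to zero:

> | simplify(subs(subs(sols[1],sols_xyz),[3*x+4*y+6-z,x^2+y^2-z])); |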