Maximize subject to constraints


Setting Up the Initial Simplex Tableau

We found in the previous section that the graphical method of solving linear programming problems, while time-consuming, enables us to see solution regions and identify corner points. This, however, is not possible when there are more than two variables. We can visualize problems in up to three dimensions, but even this can be difficult when there are numerous constraints.

To handle linear programming problems that contain more than two variables, mathematicians developed what is now known as the simplex method. It is an efficient algorithm (a set of mechanical steps) that “toggles” through corner points until it has located the one that maximizes the objective function. Although it is tempting to jump right in, there are a few things we need to look out for prior to using it.

Prior to providing the mathematical details, let’s see an example of a linear programming problem that
would qualify for the simplex method:

Example 1

The following system can be solved by using the simplex method:

Objective Function: P = 2x + 3y + z

Subject to Constraints:

3x + 2y ≤ 5

2x + y + z ≤ 13

z ≤ 4


Standard Maximization Problem

Mathematically speaking, in order to use the simplex method to solve a linear programming problem, we need a standard maximization problem, which consists of:

  • an objective function to be maximized, and
  • one or more constraints of the form a1x1 + a2x2 + … + anxn ≤ V, where
    • each coefficient ai is a real number,
    • each xi is one of the variables, and
    • V is a non-negative real number (V ≥ 0).

Having constraints that have upper limits should make sense, since when maximizing a quantity, we probably have caps on what we can do. If we had no caps, then we could continue to increase, say profit, infinitely! This contradicts what we know about the real world.

In order to use the simplex method, either by technology or by hand, we must set up an
initial simplex tableau, which is a matrix containing information about the linear programming problem we wish to solve.

First off, matrices don’t do well with inequalities: a matrix has no simple way of keeping track of the direction of an inequality. How, then, do we avoid putting inequalities into a matrix?

Consider the following linear programming problem:


Maximize: P = 7x + 12y

Subject to:

2x + 3y ≤ 6

3x + 7y ≤ 12

Because we know that the left sides of both inequalities will be quantities that are smaller than the corresponding values on the right, we can be sure that adding “something” to the left-hand side will make them exactly equal. That is:

2x + 3y + s1 = 6

3x + 7y + s2 = 12

For instance, suppose that x = 1 and y = 1. Then

2 + 3 + s1 = 6, or s1 = 1

3 + 7 + s2 = 12, or s2 = 2

It is important to note that these two variables, s1 and s2, are not necessarily the same. They simply act on the inequality by picking up the “slack” that keeps the left side from looking like the right side. Hence, we call them slack variables. This takes care of the inequalities for us. Since augmented matrices contain all variables on the left and constants on the right, we will rewrite the objective function to match this format:
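The slack computation can be sketched in a few lines of Python (a minimal illustration of the idea above; the helper name `slack` is ours, not part of the original text):

```python
# Compute the slack value that turns a "<=" constraint into an equality,
# for the sample point x = 1, y = 1 from the text.
def slack(lhs_value, rhs_value):
    """Return the slack s such that lhs_value + s == rhs_value."""
    return rhs_value - lhs_value

x, y = 1, 1
s1 = slack(2 * x + 3 * y, 6)   # constraint 2x + 3y <= 6
s2 = slack(3 * x + 7 * y, 12)  # constraint 3x + 7y <= 12
print(s1, s2)  # 1 2, matching the worked values in the text
```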

–7x – 12y + P = 0

We can now write an initial system of equations:

2x + 3y + s1 = 6
3x + 7y + s2 = 12
–7x – 12y + P = 0

This will require us to have a matrix that can handle x, y, s1, s2, and P. We will put the columns in this order. Finally, the simplex method requires that the objective function be listed as the bottom row of the matrix, so that we have:

        x    y   s1  s2  P  | const
    [   2    3    1   0  0  |   6  ]
    [   3    7    0   1  0  |  12  ]
    [  -7  -12    0   0  1  |   0  ]

We have established the initial simplex tableau. Note that the horizontal and vertical lines are used simply to separate constraint coefficients from constants and objective function coefficients. Also notice that the slack variable columns, along with the objective function output, form the identity matrix.
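As a sketch (assuming the column order x, y, s1, s2, P | constant described above), the tableau can be assembled with NumPy, which also makes the identity-matrix observation easy to check:

```python
import numpy as np

# Initial simplex tableau for:
#   maximize P = 7x + 12y
#   subject to 2x + 3y <= 6 and 3x + 7y <= 12
# Columns: x, y, s1, s2, P | constant; last row is the objective row.
tableau = np.array([
    [ 2.0,   3.0, 1.0, 0.0, 0.0,  6.0],
    [ 3.0,   7.0, 0.0, 1.0, 0.0, 12.0],
    [-7.0, -12.0, 0.0, 0.0, 1.0,  0.0],
])

# The slack-variable columns plus the P column form the 3x3 identity.
identity_part = tableau[:, 2:5]
print(identity_part)
```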

We will present the algorithm for solving, however, note that it is not entirely intuitive. We will focus on this method for one example, and will then proceed to use technology to run through the process for us.

1. Select a Pivot Column

We first select a pivot column, which will be the column that contains the most negative coefficient in the row containing the objective function. Note that the most negative number belongs to the term that contributes most to the objective function. This is intentional, since we want to focus on values that make the output as large as possible.

Our pivot column is thus the y column.

        x    y   s1  s2  P  | const
    [   2    3    1   0  0  |   6  ]
    [   3    7    0   1  0  |  12  ]
    [  -7  -12    0   0  1  |   0  ]
          ^
       pivot column (y)

2. Select a Pivot Row

Do this by computing the ratio of each constraint constant to its respective coefficient in the pivot column—this is called the test ratio. Select the row with the smallest test ratio.

We first calculate the test ratios:

        x    y   s1  s2  P  | const     test ratio
    [   2    3    1   0  0  |   6  ]    6/3 = 2
    [   3    7    0   1  0  |  12  ]    12/7 ≈ 1.71  ← pivot row
    [  -7  -12    0   0  1  |   0  ]


Since the test ratio is smaller for row 2, we select it as the pivot row. The entry 7, where the pivot row and pivot column meet, is now called our pivot. To justify why we do this, observe that 2 and 1.71 are simply the vertical (y-) intercepts of the two inequalities. We select the smaller one to ensure we have a corner point that is in our feasible region:
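Steps 1 and 2 can be sketched as two small helper functions (hypothetical names, assuming the tableau layout above: last row is the objective, last column holds the constants):

```python
def pivot_column(tableau):
    """Index of the most negative entry in the objective row, or None."""
    obj = tableau[-1][:-1]
    j = min(range(len(obj)), key=lambda k: obj[k])
    return j if obj[j] < 0 else None

def pivot_row(tableau, j):
    """Row with the smallest test ratio constant / entry (positive entries only)."""
    best, best_ratio = None, None
    for i in range(len(tableau) - 1):      # skip the objective row
        if tableau[i][j] > 0:              # never divide by zero or a negative
            ratio = tableau[i][-1] / tableau[i][j]
            if best_ratio is None or ratio < best_ratio:
                best, best_ratio = i, ratio
    return best

T = [[2, 3, 1, 0, 0, 6],
     [3, 7, 0, 1, 0, 12],
     [-7, -12, 0, 0, 1, 0]]
j = pivot_column(T)   # column 1, the y column
i = pivot_row(T, j)   # row 1, since 12/7 < 6/3
print(j, i)  # 1 1
```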

3. Using Gaussian Elimination, Clear the Pivot Column in Rows 1 and 3

Multiply R2 by (1/7) to convert 7 to 1.

Then use the 1 to eliminate the 3 in R1:  -3R2+R1→R1

And use the 1 to eliminate the -12 in R3: 12R2+R3→R3

We get the following matrix, with entries rounded to two decimal places (you might prefer it as fractions):

        x      y   s1    s2    P  | const
    [   0.71   0    1  -0.43   0  |  0.86 ]
    [   0.43   1    0   0.14   0  |  1.71 ]
    [  -1.86   0    0   1.71   1  | 20.57 ]

What have we done? For one, we have maxed out the contribution of the y variable to the objective function. Have we optimized the function? Not quite: there is still a negative value, -1.86, in the first column. This tells us that x can still contribute to the objective function. To eliminate this, we first find the new pivot row by obtaining test ratios:

R1: 0.86/0.71 ≈ 1.2

R2: 1.71/0.43 ≈ 4

Since 1.2 < 4, R1 is our new pivot row.

Interestingly, this test ratio corresponds to the input value of the intersection of the two lines!

Similarly, we proceed to eliminate all non-pivot values.

Multiply R1 by (1/0.71) to convert 0.71 to 1.

Then use the 1 to eliminate the 0.43 in R2:  -0.43R1+R2→R2

And use the 1 to eliminate the -1.86 in R3: 1.86R1+R3→R3

We get the following matrix:

        x   y   s1    s2   P  | const
    [   1   0   1.4  -0.6  0  |  1.2 ]
    [   0   1  -0.6   0.4  0  |  1.2 ]
    [   0   0   2.6   0.6  1  | 22.8 ]
No negative entries remain in the objective function row, so we are prepared to read off the solution.

4. Identify the Solution Set

To identify the solution set, we focus only on the columns with exactly one non-zero entry; these correspond to the basic variables. (Variables whose columns contain more than one non-zero entry are called nonbasic variables.)

We notice that both the x and y columns are basic. We really don’t care about the slack variables, much like we ignore inequalities when we are finding intersections. Reading across the rows of the final matrix, we now see that

x + 1.4s1 − 0.6s2 = 1.2

y − 0.6s1 + 0.4s2 = 1.2

2.6s1 + 0.6s2 + P = 22.8

Setting the slack variables to 0 gives:

x = 1.2

y = 1.2

P = 22.8

Thus, x = 1.2, y = 1.2, P = 22.8 is the solution to the linear programming problem. That is, inputs of x = 1.2 and y = 1.2 will yield a maximum objective function value of 22.8.
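The full loop described above (pick a pivot column, pick a pivot row, pivot, repeat until no negatives remain in the objective row) can be sketched in Python using exact fractions. This is a minimal sketch that assumes a bounded standard maximization problem; it is not the calculator program discussed below:

```python
from fractions import Fraction

def simplex(tableau):
    """Pivot a standard-max tableau (last row = objective, last column =
    constants) until no negative remains in the objective row.
    Returns the final tableau. Assumes the problem is bounded."""
    T = [[Fraction(v) for v in row] for row in tableau]
    while True:
        obj = T[-1][:-1]
        j = min(range(len(obj)), key=lambda k: obj[k])
        if obj[j] >= 0:
            return T
        # Pivot row: smallest test ratio among rows with a positive entry.
        rows = [i for i in range(len(T) - 1) if T[i][j] > 0]
        i = min(rows, key=lambda r: T[r][-1] / T[r][j])
        piv = T[i][j]
        T[i] = [v / piv for v in T[i]]          # scale pivot row to get a 1
        for r in range(len(T)):                 # clear the rest of the column
            if r != i and T[r][j] != 0:
                factor = T[r][j]
                T[r] = [a - factor * b for a, b in zip(T[r], T[i])]

T = simplex([[2, 3, 1, 0, 0, 6],
             [3, 7, 0, 1, 0, 12],
             [-7, -12, 0, 0, 1, 0]])
print(T[-1][-1])   # objective value P = 114/5 = 22.8
```

Running this on the tableau from the worked example reproduces x = 6/5, y = 6/5, and P = 114/5, matching the decimal answers above.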

While somewhat intuitive, this process has more behind it than we are letting on. And, rather than going through these grueling steps ad nauseam, we will allow our technology to follow these steps. For this, we need a special program, which will be distributed in class.

To perform the simplex method with a graphing calculator, the following programs are needed:

  • Pivot,
  • Pivot1, and
  • Simplex

Pivot and Pivot1 are not used directly. Instead, the Simplex program calls these two programs to handle some rather long and tedious code. How does the code work? It finds pivot columns, finds pivot rows, performs Gaussian elimination, checks for negatives in the objective function row, and repeats this process as necessary until all negatives have been removed.

Now that we have illustrated that, in fact, the simplex method works for even two input variables, let’s show a situation where this method is particularly useful.

Example 2

A new airline has decided to join the market. It is considering offering flights out of Phoenix, AZ, and would initially like to travel to three different locations: San Diego, San Francisco, and Las Vegas. The distances of each round-trip flight going out of Phoenix are (approximately): 720 miles, 1500 miles, and 1140 miles, respectively. The company would like to use the slogan, “the average price per flight is never more than $200.” As for costs, it anticipates flights to San Diego will run about 10% of airfare. Similarly, San Francisco will run 12% and Las Vegas will run 14% of airfare. The company wants to ensure that the overall average cost is no more than 10% of earned airfare. Recent market research allows the company to conclude that it could probably sell about 1900 San Diego tickets, 700 San Francisco tickets, and 1000 Las Vegas tickets. Under these conditions and assuming that all tickets sold are round-trip flights, how much should the company charge per ticket in order to maximize its total revenue?


We want to know the price for airfare to each destination. Let,

x = price per round-trip ticket to San Diego

y = price per round-trip ticket to San Francisco

z = price per round-trip ticket to Las Vegas

The company wants to maximize total revenue. This is based on the sum of number of tickets sold multiplied by the price per ticket, which is:

R = 1900x + 700y + 1000z

Subject to the constraints:

  • Average price per flight is less than or equal to $200
  • Average cost from airfare is no more than 10% of total


  • Average price: add the three prices and divide by 3, giving (x + y + z)/3 ≤ 200,
  • or x + y + z ≤ 600.
  • Revenue from San Diego tickets will total 1900x, and 10% of this amount is estimated to be cost. That is, Cost = .10(1900x) = 190x. Similarly, we have .12(700y) = 84y and .14(1000z) = 140z. We want the sum of these costs to be less than or equal to 10% of total revenue, which is .10(1900x + 700y + 1000z) = 190x + 70y + 100z.
  • 190x + 84y + 140z ≤ 190x + 70y + 100z,
  • or 14y + 40z ≤ 0.

Since both constraints are of the correct form, we can proceed to set up the initial simplex tableau. As a note, be very cautious about when you use the simplex method, as unmet requirements invalidate the results obtained.

The rewritten objective function is:

–1900x – 700y – 1000z + R = 0

And simplified constraints are:

x + y + z ≤ 600 (from multiplying both sides of (x + y + z)/3 ≤ 200 by 3)

14y + 40z ≤ 0


Each of the two constraints will require a slack variable, and so we rewrite them as follows:

x + y + z + s1 = 600

14y + 40z + s2 = 0

We will have the following variable columns:
x, y, z, s1, s2, R, and the constant column, for a total of 7 columns. We have two constraints and one objective function, for a total of three rows. We now write the initial simplex tableau:

         x     y      z   s1  s2  R  | const
    [    1     1      1    1   0  0  |  600 ]
    [    0    14     40    0   1  0  |    0 ]
    [ -1900  -700  -1000   0   0  1  |    0 ]

The tableau is now ready to be solved using Simplex.

Pivot on the 1st column and 1st row. (The x column holds the most negative objective entry, -1900. Its entry in row 2 is 0, and we are not allowed to divide by 0 to get a test ratio, so row 1, with ratio 600/1 = 600, is the pivot row.) Pivoting yields:

        x    y     z    s1   s2  R  | const
    [   1    1     1     1    0  0  |     600 ]
    [   0   14    40     0    1  0  |       0 ]
    [   0  1200  900  1900    0  1  | 1140000 ]
Since only the x column is basic among the decision variables, we can see that x = 600 is a solution. Because y and z are nonbasic variables, we set y = z = 0; that is, they do not contribute to maximizing revenue. Additionally, R is a basic variable, and so we see that R = $1,140,000 is the maximum revenue the company can earn, given the constraints. The company should sell tickets to San Diego for $600 and market no flights to the other cities. As you might guess, the company is probably a bit stunned. We will explore this in the next example.
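A quick check of this result (our own sketch, not part of the original): with y, z ≥ 0, the cost constraint 14y + 40z ≤ 0 forces y = z = 0, so revenue reduces to R = 1900x with x ≤ 600.

```python
# Sanity check of the airline solution (x, y, z) = (600, 0, 0).
def feasible(x, y, z):
    return (min(x, y, z) >= 0
            and x + y + z <= 600       # average-price constraint (times 3)
            and 14 * y + 40 * z <= 0)  # cost constraint

def revenue(x, y, z):
    return 1900 * x + 700 * y + 1000 * z

assert feasible(600, 0, 0)
assert not feasible(500, 100, 0)   # selling any SF tickets violates the cost cap
print(revenue(600, 0, 0))  # 1140000
```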

It is interesting that San Diego trips alone produce the highest revenue, based on the constraints given. Why is this? If we look at the constraints, we see that the company is fairly certain it can sell 1900 flights to San Diego. The company is also somewhat puzzled that it is expected to sell tickets at $600 each. At this point, it might decide to add some additional constraints to the model.

Example 3

Suppose the airline in Example 2 decides that it can charge no more than $150 per ticket to San Diego in order to be competitive with other airlines that fly to the same destination. Assuming all other constraints will still be used, how are ticket prices and maximum revenue affected?


We use the same initial tableau, but we must deal with the following new constraint:

x ≤ 150

Adding a third slack variable, we have

x + s3 = 150

This adds one column and one row to our tableau:

         x     y      z   s1  s2  s3  R  | const
    [    1     1      1    1   0   0  0  |  600 ]
    [    0    14     40    0   1   0  0  |    0 ]
    [    1     0      0    0   0   1  0  |  150 ]
    [ -1900  -700  -1000   0   0   0  1  |    0 ]

Solving this by Simplex yields:
Solution: x = 150, y = 0, z = 0, and R = $285,000. (The cost constraint 14y + 40z ≤ 0 with y, z ≥ 0 still forces y = z = 0; the slack s1 = 450 absorbs the unused room in the first constraint, and R = 1900(150) = $285,000.)
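A quick feasibility check of the Example 3 result (our own sketch, using y = z = 0, since 14y + 40z ≤ 0 with y, z ≥ 0 forces both to zero):

```python
# Check the Example 3 solution against all three constraints.
x, y, z = 150, 0, 0
assert x + y + z <= 600          # average-price constraint (times 3)
assert 14 * y + 40 * z <= 0      # cost constraint
assert x <= 150                  # new competitive-price cap

R = 1900 * x + 700 * y + 1000 * z
print(R)  # 285000
```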

Example 4

A catering company is to make lunch for a business meeting. It will serve ham sandwiches, light ham sandwiches, and vegetarian sandwiches. A ham sandwich has 1 serving of vegetables, 4 slices of ham, 1 slice of cheese, and 2 slices of bread. A light ham sandwich has 2 servings of vegetables, 2 slices of ham, 1 slice of cheese, and 2 slices of bread. A vegetarian sandwich has 3 servings of vegetables, 2 slices of cheese, and 2 slices of bread. A total of 10 bags of ham are available, each of which has 40 slices; 18 loaves of bread are available, each with 14 slices; 200 servings of vegetables are available; and 15 bags of cheese, each with 60 slices, are available. Given the resources, how many of each sandwich can be produced if the goal is to maximize the number of sandwiches?


We wish to maximize the number of sandwiches, so let:

x = # of ham sandwiches

y = # of light ham sandwiches

z = # of vegetarian sandwiches

The total number of sandwiches is given by

S = x + y + z

The constraints will be given by considering the total amount of ingredients available. That is, the company has a limited amount of ham, vegetables, cheese, and bread.

In total, the company has
10(40) = 400 slices of ham, 18(14) = 252 slices of bread, 200 servings of vegetables, and 15(60) = 900 slices of cheese available. At most, the company can use the above amounts.

There are two sandwiches that use ham—the first requires 4 slices of ham and the second requires only 2, per sandwich. That is, 4x + 2y ≤ 400

That is, the total number of slices of ham cannot exceed 400.

Each sandwich requires 2 slices of bread so 2x + 2y + 2z ≤ 252

The ham sandwiches have 1 and 2 servings of vegetables, respectively, and the vegetarian sandwich has 3 servings of vegetables. So, 1x + 2y + 3z ≤ 200

Both ham sandwiches require one slice of cheese, and the vegetarian sandwich requires two slices of cheese, so x + y + 2z ≤ 900.

Below is the completed linear programming model for this example.

Maximize: S = x + y + z
Subject To: 4x + 2y ≤ 400
            2x + 2y + 2z ≤ 252
            x + 2y + 3z ≤ 200
            x + y + 2z ≤ 900
            x, y, z ≥ 0

These constraints satisfy the requirements for the simplex method, so we proceed.

Incorporating slack variables, we get:

4x + 2y + 0z + s1 = 400

2x + 2y + 2z + s2 = 252

x + 2y + 3z + s3 = 200

x + y + 2z + s4 = 900

–x – y – z + S = 0

The initial simplex tableau is:

        x   y   z  s1  s2  s3  s4  S  | const
    [   4   2   0   1   0   0   0  0  | 400 ]
    [   2   2   2   0   1   0   0  0  | 252 ]
    [   1   2   3   0   0   1   0  0  | 200 ]
    [   1   1   2   0   0   0   1  0  | 900 ]
    [  -1  -1  -1   0   0   0   0  1  |   0 ]
Since the most negative number on the bottom row (-1) is the same for the 3 columns, we can use any of them; we might as well use the first column. The smallest quotient occurs by dividing 4 into 400, so row 1 is the pivot row. Pivoting on the “4” in R1C1 yields:

        x    y    z    s1   s2  s3  s4  S  | const
    [   1   1/2   0   1/4    0   0   0  0  | 100 ]
    [   0    1    2  -1/2    1   0   0  0  |  52 ]
    [   0   3/2   3  -1/4    0   1   0  0  | 100 ]
    [   0   1/2   2  -1/4    0   0   1  0  | 800 ]
    [   0  -1/2  -1   1/4    0   0   0  1  | 100 ]
Note: We have increased S from 0 to 100, but we still have negatives in the bottom row. Since “-1” is more negative than “-1/2”, we will pivot on column 3. After dividing the positive numbers above “-1” into the constants (52/2 = 26, 100/3 ≈ 33.3, 800/2 = 400), we get the smallest quotient in row 2. Pivoting on the “2” in R2C3 yields:

        x    y   z   s1    s2   s3  s4  S  | const
    [   1   1/2  0   1/4    0    0   0  0  | 100 ]
    [   0   1/2  1  -1/4   1/2   0   0  0  |  26 ]
    [   0    0   0   1/2  -3/2   1   0  0  |  22 ]
    [   0  -1/2  0   1/4   -1    0   1  0  | 748 ]
    [   0    0   0    0    1/2   0   0  1  | 126 ]
We now have the optimal solution

  • x=100 (basic variable in row 1)
  • y=0 (nonbasic variable)
  • z=26 (basic variable row 2)
  • s1=0 (nonbasic variable)
  • s2=0 (nonbasic variable)
  • s3=22 (basic variable row 3)
  • s4=748 (basic variable row 4)
  • S=126 (basic variable row 5)

Of course we are really just interested in: x=100, y=0, z=26, S=126

We find that 100 ham sandwiches, 26 vegetarian sandwiches, and 0 light ham sandwiches should be made to maximize the total number of sandwiches made.
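A quick feasibility check of this solution against the resource limits (our own sketch):

```python
# Verify the catering solution x = 100, y = 0, z = 26 against the resources.
x, y, z = 100, 0, 26
assert 4 * x + 2 * y <= 400            # ham slices
assert 2 * x + 2 * y + 2 * z <= 252    # bread slices
assert x + 2 * y + 3 * z <= 200        # vegetable servings
assert x + y + 2 * z <= 900            # cheese slices

S = x + y + z
print(S)  # 126
```

Note that the bread constraint is tight (252 slices exactly), while 200 − 178 = 22 vegetable servings and 900 − 152 = 748 cheese slices are left over, matching the slack values s3 = 22 and s4 = 748 read from the final tableau.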

The slack variables are not important in the solution itself, only in reaching it.

See next section for more examples. Many problems use subscripted variables such as x1, x2, x3, etc.  This is especially helpful if you have several variables. You will see this in later examples.

More resources are listed below:

Milos Podmanik, By the Numbers, “Solving Standard Maximization Problems using the Simplex Method,” licensed under a CC BY-NC-SA 3.0 license.

MathIsGreatFun, “MAT217 HW 2.2 #1,” licensed under a Standard YouTube license.

MathIsGreatFun, “MAT217 2.2 #2,” licensed under a Standard YouTube license.

MathIsGreatFun, “MAT217 2.2 #3,” licensed under a Standard YouTube license.

MathIsGreatFun, “MATH217 2.2 #4,” licensed under a Standard YouTube license.


Maximizing a function subject to a constraint

I need help to find the maximum ($x^∗, y^∗, z^∗$) of the production function $Q(x, y, z) = x^{1/4}y^{1/4}z^{1/4}$

subject to the budget constraint $h(x, y, z) = ax+by +cz −d = 0$, (where $a, b, c, d$ are positive constants), in terms of these constants. And from this, I must find an expression for the maximum value $Q^∗$ of the production function in terms of $a, b, c, d$ and the corresponding value $λ^∗$ of the Lagrange multiplier.

So far in this question I have set up a system of linear equations from the partial derivatives of the Lagrangian form:

$L_x = \dfrac{1}{4}x^{-3/4}y^{1/4}z^{1/4} - a\lambda = 0$

$L_y = \dfrac{1}{4}x^{1/4}y^{-3/4}z^{1/4} - b\lambda = 0$

$L_z = \dfrac{1}{4}x^{1/4}y^{1/4}z^{-3/4} - c\lambda = 0$

$-L_\lambda = ax + by +cz -d = 0$

From here I have trouble solving these equations and taking the question to the next step.
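One way to make progress (an added numerical sanity check, not part of the original question): dividing the stationarity equations pairwise gives $ax = by = cz$, and substituting into the budget constraint then forces $ax = by = cz = d/3$. The script below checks this candidate for sample constants:

```python
from math import isclose

# Numerical check of the symmetric candidate a*x = b*y = c*z = d/3.
a, b, c, d = 2.0, 3.0, 5.0, 30.0   # arbitrary positive test values

x = d / (3 * a)
y = d / (3 * b)
z = d / (3 * c)
Q = (x * y * z) ** 0.25

# Lambda from each stationarity condition; all three must agree.
lam   = 0.25 * x ** (-0.75) * y ** 0.25 * z ** 0.25 / a
lam_y = 0.25 * x ** 0.25 * y ** (-0.75) * z ** 0.25 / b
lam_z = 0.25 * x ** 0.25 * y ** 0.25 * z ** (-0.75) / c
assert isclose(lam, lam_y) and isclose(lam, lam_z)

# Budget constraint holds exactly, and lambda = 3Q/(4d) = dQ*/dd.
assert isclose(a * x + b * y + c * z, d)
assert isclose(lam, 3 * Q / (4 * d))
```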

asked Nov 27 '19 at 12:59





Lagrange multipliers and constrained optimization

Video transcript

1 would be achievable. And in fact, you know, if we kind of go back to that and we look at 0.1: if I upped that value and, you know, changed it to the line where instead what you're looking at is 0.2, that's also possible, because it intersects with the circle. In fact, you know, you could play around with it and increase it a little bit more. And if I go to 0.3 instead, and I go over here and I say 0.3, that's also possible. And what we're basically trying to do is find the maximum value that we can put here, the maximum value, so that if we look at the line that represents f of x, y equals that value, it still intersects with the circle. And the key here, the key observation, is that that maximum value happens when these guys are tangent. And in the next video I'll start going into the details of how we can use that observation, this notion of tangency, to solve the problem, to find the actual value of x and y that maximizes this subject to the constraint. But in the interim I kind of want you to mull on that and think a little bit about how you might use that. What does tangency mean here? How can you take advantage of certain other notions that we've learned about in multivariable calculus, like, hint hint, the gradient, to actually solve something like this? So, with that, I will see you next video.

2.7: Constrained Optimization - Lagrange Multipliers


In Sections 2.5 and 2.6 we were concerned with finding maxima and minima of functions without any constraints on the variables (other than being in the domain of the function). What would we do if there were constraints on the variables? The following example illustrates a simple case of this type of problem.

Example 2.24

For a rectangle whose perimeter is 20 m, find the dimensions that will maximize the area.


The area \(A\) of a rectangle with width \(x\) and height \(y\) is \(A = x y\). The perimeter \(P\) of the rectangle is then given by the formula \(P = 2x+2y\). Since we are given that the perimeter \(P = 20\), this problem can be stated as:

\[\nonumber \begin{align}\text{Maximize : }&f (x, y) = x y \\[4pt] \nonumber \text{given : }&2x+2y = 20 \end{align}\]

The reader is probably familiar with a simple method, using single-variable calculus, for solving this problem. Since we must have \(2x + 2y = 20\), then we can solve for, say, \(y\) in terms of \(x\) using that equation. This gives \(y = 10− x\), which we then substitute into \(f\) to get \(f (x, y) = x y = x(10 − x) = 10x − x^2\). This is now a function of \(x\) alone, so we now just have to maximize the function \(f (x) = 10x− x^2\) on the interval [0,10]. Since \(f ′ (x) = 10−2x = 0 \Rightarrow x = 5 \text{ and }f ′′(5) = −2 < 0\), then the Second Derivative Test tells us that \(x = 5\) is a local maximum for \(f\), and hence \(x = 5\) must be the global maximum on the interval [0,10] (since \(f = 0\) at the endpoints of the interval). So since \(y = 10 − x = 5\), then the maximum area occurs for a rectangle whose width and height both are 5 m.

Notice in the above example that the ease of the solution depended on being able to solve for one variable in terms of the other in the equation \(2x+2y = 20\). But what if that were not possible (which is often the case)? In this section we will use a general method, called the Lagrange multiplier method, for solving constrained optimization problems:

\[\nonumber \begin{align} \text{Maximize (or minimize) : }&f (x, y)\quad (\text{or }f (x, y, z)) \\[4pt] \nonumber \text{given : }&g(x, y) = c \quad (\text{or }g(x, y, z) = c) \text{ for some constant } c \end{align}\]

The equation \(g(x, y) = c\) is called the constraint equation, and we say that \(x\) and \(y\) are constrained by \(g(x, y) = c\). Points \((x, y)\) which are maxima or minima of \(f (x, y)\) with the condition that they satisfy the constraint equation \(g(x, y) = c\) are called constrained maximum or constrained minimum points, respectively. Similar definitions hold for functions of three variables.

The Lagrange multiplier method for solving such problems can now be stated:

Theorem 2.7: The Lagrange Multiplier Method

Let \(f (x, y)\text{ and }g(x, y)\) be smooth functions, and suppose that \(c\) is a scalar constant such that \(\nabla g(x, y) \neq \textbf{0}\) for all \((x, y)\) that satisfy the equation \(g(x, y) = c\). Then to solve the constrained optimization problem

\[\nonumber \begin{align} \text{Maximize (or minimize) : }&f (x, y) \\[4pt] \nonumber \text{given : }&g(x, y) = c ,\end{align}\]

find the points \((x, y)\) that solve the equation \(\nabla f (x, y) = \lambda \nabla g(x, y)\) for some constant \(\lambda\) (the number \(\lambda\) is called the Lagrange multiplier). If there is a constrained maximum or minimum, then it must be such a point.

A rigorous proof of the above theorem requires use of the Implicit Function Theorem, which is beyond the scope of this text. Note that the theorem only gives a necessary condition for a point to be a constrained maximum or minimum. Whether a point \((x, y)\) that satisfies \(\nabla f (x, y) = \lambda \nabla g(x, y)\) for some \(\lambda\) actually is a constrained maximum or minimum can sometimes be determined by the nature of the problem itself. For instance, in Example 2.24 it was clear that there had to be a global maximum.

So how can you tell when a point that satisfies the condition in Theorem 2.7 really is a constrained maximum or minimum? The answer is that it depends on the constraint function \(g(x, y)\), together with any implicit constraints. It can be shown that if the constraint equation \(g(x, y) = c\) (plus any hidden constraints) describes a bounded set \(B\) in \(\mathbb{R}^2\), then the constrained maximum or minimum of \(f (x, y)\) will occur either at a point \((x, y)\) satisfying \(\nabla f (x, y) = \lambda \nabla g(x, y)\) or at a “boundary” point of the set \(B\).

In Example 2.24 the constraint equation \(2x+2y = 20\) describes a line in \(\mathbb{R}^2\), which by itself is not bounded. However, there are “hidden” constraints, due to the nature of the problem, namely \(0 ≤ x, y ≤ 10\), which cause that line to be restricted to a line segment in \(\mathbb{R}^2\) (including the endpoints of that line segment), which is bounded.

Example 2.25

For a rectangle whose perimeter is 20 m, use the Lagrange multiplier method to find the dimensions that will maximize the area.


As we saw in Example 2.24, with \(x\) and \(y\) representing the width and height, respectively, of the rectangle, this problem can be stated as:

\[\nonumber \begin{align} \text{Maximize : }&f (x, y) = x y \\[4pt] \nonumber \text{given : }&g(x, y) = 2x+2y = 20 \end{align}\]

Then solving the equation \(\nabla f (x, y) = \lambda \nabla g(x, y)\) for some \(\lambda\) means solving the equations \(\dfrac{∂f}{∂x} = \lambda \dfrac{∂g}{∂x}\text{ and }\dfrac{∂f}{∂y} = \lambda \dfrac{∂g}{∂y}\), namely:

\[\nonumber \begin{align} y &=2\lambda ,\\[4pt] \nonumber x &=2\lambda \end{align}\]

The general idea is to solve for \(\lambda\) in both equations, then set those expressions equal (since they both equal \(\lambda\)) to solve for \(x \text{ and }y\). Doing this we get

\[\nonumber \dfrac{y}{2} = \lambda = \dfrac{x}{2} \Rightarrow x = y ,\]

so now substitute either of the expressions for \(x \text{ or }y\) into the constraint equation to solve for \(x \text{ and }y\):

\[\nonumber 20 = g(x, y) = 2x+2y = 2x+2x = 4x \quad \Rightarrow \quad x = 5 \quad \Rightarrow \quad y = 5\]

There must be a maximum area, since the minimum area is 0 and \(f (5,5) = 25 > 0\), so the point \((5,5)\) that we found (called a constrained critical point) must be the constrained maximum.

\(\therefore\) The maximum area occurs for a rectangle whose width and height both are 5 m.

Example 2.26

Find the points on the circle \(x^2 + y^2 = 80\) which are closest to and farthest from the point \((1,2)\).


The distance \(d\) from any point \((x, y)\) to the point \((1,2)\) is

\[\nonumber d = \sqrt{ (x−1)^2 +(y−2)^2} ,\]

and minimizing the distance is equivalent to minimizing the square of the distance. Thus the problem can be stated as:

\[\nonumber \begin{align}\text{Maximize (and minimize) : }&f (x, y) = (x−1)^2 +(y−2)^2 \\[4pt] \nonumber \text{given : }&g(x, y) = x^2 + y^2 = 80 \end{align} \]

Solving \(\nabla f (x, y) = \lambda \nabla g(x, y)\) means solving the following equations:

\[\nonumber \begin{align}2(x−1) &= 2\lambda x , \\[4pt] \nonumber 2(y−2) &= 2\lambda y \end{align} \]

Note that \(x \neq 0\) since otherwise we would get −2 = 0 in the first equation. Similarly, \(y \neq 0\). So we can solve both equations for \(\lambda\) as follows:

\[\nonumber \dfrac{x−1}{x} = \lambda = \dfrac{y−2}{y} \Rightarrow x y− y = x y−2x \quad \Rightarrow \quad y = 2x\]

Substituting this into \(g(x, y) = x^2 + y^2 = 80\) yields \(5x^2 = 80\), so \(x = \pm 4\). So the two constrained critical points are \((4,8)\text{ and }(−4,−8)\). Since \(f (4,8) = 45 \text{ and }f (−4,−8) = 125\), and since there must be points on the circle closest to and farthest from \((1,2)\), then it must be the case that \((4,8)\) is the point on the circle closest to \((1,2)\text{ and }(−4,−8)\) is the farthest from \((1,2)\) (see Figure 2.7.1).
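These values are easy to verify numerically (our own check): both candidate points lie on the circle, and their squared distances to \((1,2)\) are 45 and 125.

```python
# Verify the two constrained critical points of Example 2.26.
def f(x, y):                       # squared distance to (1, 2)
    return (x - 1) ** 2 + (y - 2) ** 2

def on_circle(x, y):
    return x ** 2 + y ** 2 == 80

assert on_circle(4, 8) and on_circle(-4, -8)
assert f(4, 8) == 45               # closest point
assert f(-4, -8) == 125            # farthest point
```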


Notice that since the constraint equation \(x^2+y^2 = 80\) describes a circle, which is a bounded set in \(\mathbb{R}^2\), then we were guaranteed that the constrained critical points we found were indeed the constrained maximum and minimum.

The Lagrange multiplier method can be extended to functions of three variables.

Example 2.27

\[\nonumber \begin{align} \text{Maximize (and minimize) : }&f (x, y, z) = x+ z \\[4pt] \nonumber \text{given : }&g(x, y, z) = x^2 + y^2 + z^2 = 1 \end{align}\]


Solve the equation \(\nabla f (x, y, z) = \lambda \nabla g(x, y, z)\):

\[\nonumber \begin{align} 1 &= 2\lambda x \\[4pt] 0 &= 2\lambda y \\[4pt] \nonumber 1 &= 2\lambda z \end{align}\]

The first equation implies \(\lambda \neq 0\) (otherwise we would have 1 = 0), so we can divide by \(\lambda\) in the second equation to get \(y = 0\) and we can divide by \(\lambda\) in the first and third equations to get \(x = \dfrac{1}{2\lambda} = z\). Substituting these expressions into the constraint equation \(g(x, y, z) = x^2 + y^2 + z^2 = 1\) yields the constrained critical points \(\left (\dfrac{1}{\sqrt{2}},0,\dfrac{1}{\sqrt{2}} \right )\) and \(\left ( \dfrac{−1}{\sqrt{2}} ,0,\dfrac{ −1}{\sqrt{2}}\right )\). Since \(f \left ( \dfrac{1}{\sqrt{2}} ,0,\dfrac{ 1}{\sqrt{2}}\right ) > f \left ( \dfrac{−1}{\sqrt{2}} ,0,\dfrac{ −1}{\sqrt{2}}\right )\), and since the constraint equation \(x^2 + y^2 + z^2 = 1\) describes a sphere (which is bounded) in \(\mathbb{R}^ 3\), then \(\left ( \dfrac{1}{\sqrt{2}} ,0,\dfrac{ 1}{\sqrt{2}}\right )\) is the constrained maximum point and \(\left ( \dfrac{−1}{\sqrt{2}} ,0,\dfrac{ −1}{\sqrt{2}}\right )\) is the constrained minimum point.

So far we have not attached any significance to the value of the Lagrange multiplier \(\lambda\). We needed \(\lambda\) only to find the constrained critical points, but made no use of its value. It turns out that \(\lambda\) gives an approximation of the change in the value of the function \(f (x, y)\) that we wish to maximize or minimize, when the constant c in the constraint equation \(g(x, y) = c\) is changed by 1.

For example, in Example 2.25 we showed that the constrained optimization problem

\[\nonumber \begin{align}\text{Maximize : }&f (x, y) = x y \\[4pt] \nonumber \text{given : }&g(x, y) = 2x+2y = 20 \end{align}\]

had the solution \((x, y) = (5,5)\), and that \(\lambda = \dfrac{x}{2} = \dfrac{y}{2}\). Thus, \(\lambda = 2.5\). In a similar fashion we could show that the constrained optimization problem

\[\nonumber \begin{align} \text{Maximize : }&f (x, y) = x y \\[4pt] \nonumber \text{given : }&g(x, y) = 2x+2y = 21 \end{align}\]

has the solution \((x, y) = (5.25,5.25)\). So we see that the value of \(f (x, y)\) at the constrained maximum increased from \(f (5,5) = 25 \text{ to }f (5.25,5.25) = 27.5625\), i.e. it increased by 2.5625 when we increased the value of \(c\) in the constraint equation \(g(x, y) = c \text{ from }c = 20 \text{ to }c = 21\). Notice that \(\lambda = 2.5\) is close to 2.5625, that is,

\[\nonumber \lambda \approx \Delta f = f (\text{new max. pt}) - f (\text{old max. pt})\]
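A small script (ours, using the closed form \((c/4)^2\) for the constrained maximum, which follows from \(x = y = c/4\)) illustrates this sensitivity interpretation:

```python
# Maximizing f(x, y) = x*y subject to 2x + 2y = c gives x = y = c/4,
# so the constrained maximum value is (c/4)**2.
def constrained_max(c):
    return (c / 4) ** 2

old, new = constrained_max(20), constrained_max(21)
lam = 2.5                      # multiplier found at c = 20
print(old, new, new - old)     # 25.0 27.5625 2.5625

# The change in the maximum is approximately lambda.
assert abs((new - old) - lam) < 0.1
```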

Finally, note that solving the equation \(\nabla f (x, y) = \lambda \nabla g(x, y)\) means having to solve a system of two (possibly nonlinear) equations in three unknowns, which as we have seen before, may not be possible to do. And the 3-variable case can get even more complicated. All of this somewhat restricts the usefulness of Lagrange’s method to relatively simple functions. Luckily there are many numerical methods for solving constrained optimization problems, though we will not discuss them here.


Introduction to Constrained Optimization in the Wolfram Language

Optimization Problems

Constrained optimization problems are problems for which a function \(f(x)\) is to be minimized or maximized subject to constraints \(cons(x)\). Here \(f\) is called the objective function and \(cons\) is a Boolean-valued formula. In the Wolfram Language the constraints can be an arbitrary Boolean combination of equations \(g(x) = 0\), weak inequalities \(g(x) \geq 0\), strict inequalities \(g(x) > 0\), and \(x \in \mathbb{Z}\) statements. The following notation will be used.

\(\min f(x) \text{ s.t. } cons(x)\) stands for "minimize \(f(x)\) subject to constraints \(cons(x)\)", and

\(\max f(x) \text{ s.t. } cons(x)\) stands for "maximize \(f(x)\) subject to constraints \(cons(x)\)".

We say a point \(x_0\) satisfies the constraints if \(cons(x_0)\) is true.

The following describes constrained optimization problems more precisely, restricting the discussion to minimization problems for brevity.

Global Optimization

A point \(x^*\) is said to be a global minimum of \(f\) subject to constraints \(cons\) if \(x^*\) satisfies the constraints and, for any point \(x\) that satisfies the constraints, \(f(x^*) \leq f(x)\).

A value \(v^*\) is said to be the global minimum value of \(f\) subject to constraints \(cons\) if, for any point \(x\) that satisfies the constraints, \(v^* \leq f(x)\).

The global minimum value \(v^*\) exists for any \(f\) and \(cons\). The global minimum value is attained if there exists a point \(x^*\) such that \(cons(x^*)\) is true and \(f(x^*) = v^*\). Such a point \(x^*\) is necessarily a global minimum.

If \(f\) is a continuous function and the set of points satisfying the constraints is compact (closed and bounded) and nonempty, then a global minimum exists. Otherwise a global minimum may or may not exist.

For example, the minimum value may fail to be attained when the set of points satisfying the constraints is not closed.

It may likewise fail to be attained when the set of points satisfying the constraints is closed but unbounded.

Conversely, the minimum value may be attained even if the set of points satisfying the constraints is neither closed nor bounded.

Local Optimization

A point \(x^*\) is said to be a local minimum of \(f\) subject to constraints \(cons\) if \(x^*\) satisfies the constraints and, for some \(r > 0\), \(f(x^*) \leq f(x)\) for every point \(x\) that satisfies the constraints with \(\lVert x - x^* \rVert < r\).

A local minimum may not be a global minimum. A global minimum is always a local minimum.

FindMinimum, for example, can return a local minimum that is not a global minimum.

Solving Optimization Problems

The methods used to solve local and global optimization problems depend on specific problem types. Optimization problems can be categorized according to several criteria. Depending on the type of functions involved, there are linear and nonlinear (polynomial, algebraic, transcendental, ...) optimization problems. If the constraints involve integrality conditions such as \(x \in \mathbb{Z}\), you have integer and mixed integer-real optimization problems. Additionally, optimization algorithms can be divided into numeric and symbolic (exact) algorithms.

Wolfram Language functions for constrained optimization include Minimize, Maximize, NMinimize, and NMaximize for global constrained optimization, FindMinimum for local constrained optimization, and LinearProgramming for efficient and direct access to linear programming methods. The following table briefly summarizes each of the functions.

FindMinimum, FindMaximum: numeric local optimization. Methods: linear programming methods, nonlinear interior point algorithms; can utilize second derivatives.
NMinimize, NMaximize: numeric global optimization. Methods: linear programming methods, Nelder-Mead, differential evolution, simulated annealing, random search.
Minimize, Maximize: exact global optimization. Methods: linear programming methods, cylindrical algebraic decomposition, Lagrange multipliers and other analytic methods, integer linear programming.
LinearProgramming: linear optimization. Methods: linear programming methods (simplex, revised simplex, interior point).

Summary of constrained optimization functions.




Constrained optimization

In mathematical optimization, constrained optimization (in some contexts called constraint optimization) is the process of optimizing an objective function with respect to some variables in the presence of constraints on those variables. The objective function is either a cost function or energy function, which is to be minimized, or a reward function or utility function, which is to be maximized. Constraints can be either hard constraints, which set conditions for the variables that are required to be satisfied, or soft constraints, which have some variable values that are penalized in the objective function if, and based on the extent that, the conditions on the variables are not satisfied.

Relation to constraint-satisfaction problems

The constrained-optimization problem (COP) is a significant generalization of the classic constraint-satisfaction problem (CSP) model.[1] COP is a CSP that includes an objective function to be optimized. Many algorithms are used to handle the optimization part.

General form

A general constrained minimization problem may be written as follows:[2]

\[\begin{array}{rcll}
\min & & f(\mathbf{x}) & \\
\text{subject to} & & g_i(\mathbf{x}) = c_i & \text{for } i = 1, \ldots, n \quad \text{(equality constraints)} \\
& & h_j(\mathbf{x}) \geq d_j & \text{for } j = 1, \ldots, m \quad \text{(inequality constraints)}
\end{array}\]

where \(g_i(\mathbf{x}) = c_i\) for \(i = 1, \ldots, n\) and \(h_j(\mathbf{x}) \geq d_j\) for \(j = 1, \ldots, m\) are constraints that are required to be satisfied (these are called hard constraints), and \(f(\mathbf{x})\) is the objective function that needs to be optimized subject to the constraints.

In some problems, often called constraint optimization problems, the objective function is actually the sum of cost functions, each of which penalizes the extent (if any) to which a soft constraint (a constraint which is preferred but not required to be satisfied) is violated.

Solution methods

Many constrained optimization algorithms can be adapted to the unconstrained case, often via the use of a penalty method. However, search steps taken by the unconstrained method may be unacceptable for the constrained problem, leading to a lack of convergence. This is referred to as the Maratos effect.[3]

Equality constraints

Substitution method

For very simple problems, say a function of two variables subject to a single equality constraint, it is most practical to apply the method of substitution.[4] The idea is to substitute the constraint into the objective function to create a composite function that incorporates the effect of the constraint. For example, assume the objective is to maximize \(f(x,y) = x \cdot y\) subject to \(x + y = 10\). The constraint implies \(y = 10 - x\), which can be substituted into the objective function to create \(p(x) = x(10 - x) = 10x - x^2\). The first-order necessary condition gives \(\frac{\partial p}{\partial x} = 10 - 2x = 0\), which can be solved for \(x = 5\) and, consequently, \(y = 10 - 5 = 5\).
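As a sanity check, the same substitution can be carried out numerically. This Python sketch maximizes the composite function p(x) = x(10 - x) by a coarse grid search in place of the first-order condition:

```python
# Sketch: verify the substitution-method result for maximizing x*y with x + y = 10.
def p(x):
    # composite objective after substituting y = 10 - x into f(x, y) = x*y
    return x * (10 - x)

# coarse grid search over x in [0, 10]
best_x = max((i / 100 for i in range(1001)), key=p)
best_y = 10 - best_x
# best_x == 5.0 and best_y == 5.0, matching the condition 10 - 2x = 0
```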

Lagrange multiplier

Main article: Lagrange multipliers

If the constrained problem has only equality constraints, the method of Lagrange multipliers can be used to convert it into an unconstrained problem whose number of variables is the original number of variables plus the original number of equality constraints. Alternatively, if the constraints are all equality constraints and are all linear, they can be solved for some of the variables in terms of the others, and the former can be substituted out of the objective function, leaving an unconstrained problem in a smaller number of variables.

Inequality constraints

With inequality constraints, the problem can be characterized in terms of the geometric optimality conditions, Fritz John conditions and Karush–Kuhn–Tucker conditions, under which simple problems may be solvable.

Linear programming

If the objective function and all of the hard constraints are linear and some hard constraints are inequalities, then the problem is a linear programming problem. This can be solved by the simplex method, which usually works in polynomial time in the problem size but is not guaranteed to, or by interior point methods which are guaranteed to work in polynomial time.
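Because an attained optimum of a linear program lies at a corner point of the feasible region, very small problems can even be solved by enumerating the corner points directly, which is what the following Python sketch does (the constraints are invented for illustration; practical solvers such as the simplex method pivot between corner points rather than enumerating them all):

```python
from itertools import combinations

# Sketch: solve a tiny LP by corner-point (vertex) enumeration.
# maximize 2x + 3y subject to x + y <= 4, x <= 3, y <= 2, x >= 0, y >= 0.
# Each constraint is stored as (a, b, v), meaning a*x + b*y <= v.
cons = [(1, 1, 4), (1, 0, 3), (0, 1, 2), (-1, 0, 0), (0, -1, 0)]

def intersect(c1, c2):
    # intersection of the two boundary lines a*x + b*y = v, or None if parallel
    (a1, b1, v1), (a2, b2, v2) = c1, c2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        return None
    return ((v1 * b2 - v2 * b1) / det, (a1 * v2 - a2 * v1) / det)

def feasible(p):
    return all(a * p[0] + b * p[1] <= v + 1e-9 for a, b, v in cons)

# candidate corner points: feasible intersections of constraint boundaries
vertices = [p for c1, c2 in combinations(cons, 2)
            if (p := intersect(c1, c2)) is not None and feasible(p)]
best = max(vertices, key=lambda p: 2 * p[0] + 3 * p[1])
# best is (2.0, 2.0) with objective value 10
```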

Nonlinear programming

If the objective function or some of the constraints are nonlinear, and some constraints are inequalities, then the problem is a nonlinear programming problem.

Quadratic programming

If all the hard constraints are linear and some are inequalities, but the objective function is quadratic, the problem is a quadratic programming problem. It is one type of nonlinear programming. It can still be solved in polynomial time by the ellipsoid method if the objective function is convex; otherwise the problem may be NP hard.

KKT conditions

Allowing inequality constraints, the KKT approach to nonlinear programming generalizes the method of Lagrange multipliers. It can be applied under differentiability and convexity conditions.

Branch and bound

Constraint optimization can be solved by branch-and-bound algorithms. These are backtracking algorithms storing the cost of the best solution found during execution and using it to avoid part of the search. More precisely, whenever the algorithm encounters a partial solution that cannot be extended to form a solution of better cost than the stored best cost, the algorithm backtracks, instead of trying to extend this solution.

Assuming that cost is to be maximized, the efficiency of these algorithms depends on how the cost that can be obtained from extending a partial solution is evaluated. Indeed, if the algorithm can backtrack from a partial solution, part of the search is skipped. The lower the estimated cost, the better the algorithm, as a lower estimated cost is more likely to be lower than the best cost of solution found so far.

On the other hand, this estimated cost cannot be lower than the effective cost that can be obtained by extending the solution, as otherwise the algorithm could backtrack while a solution better than the best found so far exists. As a result, the algorithm requires an upper bound on the cost that can be obtained from extending a partial solution, and this upper bound should be as small as possible.
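As a hedged illustration of this scheme, the following Python sketch maximizes the total weight of satisfied soft constraints over three binary variables, pruning whenever an optimistic upper bound (weight already secured plus the full weight of every not-yet-decided constraint) cannot beat the best complete assignment found so far. The constraints themselves are invented for the example:

```python
# Sketch: branch and bound over binary variables with weighted soft constraints.
# soft constraints: (indices of variables involved, weight, predicate)
SOFT = [
    ((0, 1), 3, lambda a: a[0] != a[1]),
    ((1, 2), 2, lambda a: a[1] == a[2]),
    ((0, 2), 4, lambda a: a[0] == 1 or a[2] == 0),
]
N = 3  # number of binary variables

def solve():
    best = {"value": -1, "assignment": None}

    def score(assign):
        return sum(w for _, w, pred in SOFT if pred(assign))

    def bound(assign, k):
        # weight of constraints fully decided by the first k variables ...
        fixed = sum(w for vars_, w, pred in SOFT if max(vars_) < k and pred(assign))
        # ... plus, optimistically, the full weight of all undecided constraints
        undecided = sum(w for vars_, w, _ in SOFT if max(vars_) >= k)
        return fixed + undecided

    def search(assign):
        k = len(assign)
        if k == N:
            value = score(assign)
            if value > best["value"]:
                best["value"], best["assignment"] = value, tuple(assign)
            return
        if bound(assign, k) <= best["value"]:
            return  # prune: even the optimistic bound cannot beat the incumbent
        for bit in (0, 1):
            search(assign + [bit])

    search([])
    return best["value"], best["assignment"]
```

The bound never underestimates what an extension can achieve, so pruning cannot discard the optimum.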

A variation of this approach called Hansen's method uses interval methods.[5] It inherently implements rectangular constraints.

First-choice bounding functions

One way of evaluating this upper bound for a partial solution is to consider each soft constraint separately. For each soft constraint, the maximal possible value over any assignment to the unassigned variables is assumed. The sum of these values is an upper bound because the soft constraints cannot assume a higher value. The bound is not necessarily tight, because the maximal values of different soft constraints may be attained at different assignments: one soft constraint may be maximal for \(x = a\) while another constraint is maximal for \(x = b\).

Russian doll search

This method[6] runs a branch-and-bound algorithm on \(n\) problems, where \(n\) is the number of variables. Each such problem is the subproblem obtained by dropping a sequence of variables \(x_1, \ldots, x_i\) from the original problem, along with the constraints containing them. After the problem on variables \(x_{i+1}, \ldots, x_n\) is solved, its optimal cost can be used as an upper bound while solving the other problems.

In particular, the cost estimate of a solution having \(x_{i+1}, \ldots, x_n\) as unassigned variables is added to the cost that derives from the evaluated variables. Virtually, this corresponds to ignoring the evaluated variables and solving the problem on the unassigned ones, except that the latter problem has already been solved. More precisely, the cost of soft constraints containing both assigned and unassigned variables is estimated as above (or using an arbitrary other method); the cost of soft constraints containing only unassigned variables is instead estimated using the optimal solution of the corresponding problem, which is already known at this point.

There is similarity between the Russian doll search method and dynamic programming. Like dynamic programming, Russian doll search solves subproblems in order to solve the whole problem. But whereas dynamic programming directly combines the results obtained on subproblems to get the result of the whole problem, Russian doll search only uses them as bounds during its search.

Bucket elimination

The bucket elimination algorithm can be adapted for constraint optimization. A given variable can indeed be removed from the problem by replacing all soft constraints containing it with a new soft constraint. The cost of this new constraint is computed assuming a maximal value for every value of the removed variable. Formally, if \(x\) is the variable to be removed, \(C_1, \ldots, C_n\) are the soft constraints containing it, and \(y_1, \ldots, y_m\) are their variables except \(x\), the new soft constraint is defined by:

\[C(y_1 = a_1, \ldots, y_m = a_m) = \max_a \sum_i C_i(x = a, y_1 = a_1, \ldots, y_m = a_m).\]

Bucket elimination works with an (arbitrary) ordering of the variables. Every variable is associated with a bucket of constraints; the bucket of a variable contains all constraints in which the variable is the highest in the ordering. Bucket elimination proceeds from the last variable to the first. For each variable, all constraints of the bucket are replaced as above to remove the variable. The resulting constraint is then placed in the appropriate bucket.
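A single elimination step (removing one binary variable x by maximizing over it) can be sketched in Python; the constraints and costs here are invented for illustration and this is not a full bucket-elimination implementation:

```python
# Sketch: one bucket-elimination step. All soft constraints mentioning x are
# combined into a new constraint over y, taking the maximal value of x for
# each value of y.
def eliminate_x(constraints, x_domain, y_domain):
    """constraints: list of functions (x, y) -> cost; returns dict y -> max over x."""
    return {y: max(sum(c(x, y) for c in constraints) for x in x_domain)
            for y in y_domain}

# made-up example constraints over x, y in {0, 1}
c1 = lambda x, y: 2 if x == y else 0
c2 = lambda x, y: 1 if x == 1 else 0
new_c = eliminate_x([c1, c2], (0, 1), (0, 1))
# new_c == {0: 2, 1: 3}
```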

References


  1. Rossi, Francesca; van Beek, Peter; Walsh, Toby (2006). "Chapter 1 – Introduction". Handbook of Constraint Programming. Foundations of Artificial Intelligence, vol. 2. Elsevier. pp. 3–12. doi:10.1016/s1574-6526(06)80005-2.
  2. Martins, J. R. R. A.; Ning, A. (2021). Engineering Design Optimization. Cambridge University Press.
  3. Sun, Wenyu; Yuan, Ya-Xiang (2010). Optimization Theory and Methods: Nonlinear Programming. Springer. ISBN 978-1441937650. p. 541.
  4. Prosser, Mike (1993). "Constrained Optimization by Substitution". Basic Mathematics for Economists. New York: Routledge. pp. 338–346.
  5. Leader, Jeffery J. (2004). Numerical Analysis and Scientific Computation. Addison Wesley.
  6. Verfaillie, Gérard; Lemaître, Michel; Schiex, Thomas (1996). "Russian doll search for solving constraint optimization problems". AAAI/IAAI, Vol. 1.
