next up previous contents
Next: Nonserial Dynamic Programming Up: Hunters and Gatherers Previous: Other Strategies for

Using Linear Programming for Constraint Satisfaction Problems

Linear programming (LP) techniques have been around since the 1940's when G.B. Dantzig designed the so-called ``simplex method'' for solving linear planning problems for the U.S. Air Force. When such techniques work, they provide extremely fast answers to optimization problems. It is tempting to try and apply such methods to computational semantics. Before discussing why LP cannot provide reliable answers for computational semantics (and similar problems), it will be instructive to give a short, elementary overview of the techniques used. (Chvátal, 1983) is an excellent introduction to linear programming.

The central idea used in LP stems from the algebra technique known as ``Gaussian elimination,'' used for solving systems of equations. Given the following two equations:

we can easily solve for the value of x in the second equation and then substitute it back into the first equation, which will then allow us to solve for y. By substituting this value for y back into one of the original equations, we can then solve for x.

The simplex method for optimizing an equation subject to constraints extends the basic gaussian elimination method. A short example follows. We make no attempt to explain the rationale behind the methodology here; the interested reader may refer to (Chvátal, 1983). Given the following problem:

The simplex method starts by creating ``slack'' variables for each of the inequalities such that the slack variable must be . The variable z is then set to the equation to be maximized:

   

The simplex method works by starting out with a feasible solution, and then successively improving that solution. A feasible first solution in this case is . Substituting this in for and gives:

The key to the simplex method, then, is to pick one of the variables in equation 3 to increase so that z will increase. In this case, increasing either or will increase z, so we will pick . Since we will keep , equations 1 and 2 can give us an upper bound on how much we can increase . Equation 1 gives us , since , and equation 2 gives us , since . The latter is more constraining, so we will adopt it as the starting point for the next intermediate solution. Substituting into equations 1, 2 and 3 gives us:

The fact that the value of z increased confirms that this solution is better than the last one. We now wish to find an even better solution. The ``trick'' of the simplex method is to, after each intermediate solution, express the equations for variables with positive values in terms of those variables with 0 values. We can express in terms of and (the variables with 0 values) by solving equation 2 for . We can then express and z in terms of and by substituting this equation for into equations 1 and 3:

   

Again, we try to pick one of the variables in 6, which always will have values of 0 in the current solution, to increase so that the value of z will increase. In this case, increasing decreases z, so we don't want to try that. Increasing , though, will increase z. Again, by holding , we can read off the possible values for from equations 4 and 5. Equation 4 gives us and equation 5 gives . The former is more restrictive, so we substitute it into the equations giving:

To start another iteration, we again need to express the non-zero variables in terms of the zero-valued variables. in terms of and can be obtained from equation 4, and then we can substitute this into equation 6:

 

At this point, we again try to pick a variable from equation 8 that, when increased, will increase the value of z. In this case, however, increasing either or will decrease the value of z. This is when we know that we are finished. The correct solution to the original problem, then, is:

The simplex method for optimizing systems of linear equations is extremely powerful. There are, however, several problems.

  1. Theory-internal problems.
    1. Initialization. The process described above assumed a feasible initialization (). in practice, the initialization values may be difficult or impossible to find.
    2. Looping. It is possible to create a set of iterations that loop.
    3. Termination. Because of the looping problem, the simplex method may not terminate for some problems. In addition, some problems, although the answer might be found eventually, may require an exponential number of iterations.
  2. Theory-external problems.
    1. Non-linear. The simplex method requires linear mathematical formulas. Many interesting real-world problems, including problems in computational semantics, are not linear.
    2. Non-decomposable. For complex problems, it is advantageous to break the original problem into small subproblems. Typically, only a subset of these subproblems has will have excessive complexity. An attractive paradigm for solving such problems would be to have a heuristic problem-solver work on the subproblems that are too complex for non-heuristic methods, and then integrate those solutions into the overall problem solution. LP methods do not fit well into this paradigm.
    3. Non-dynamic. If, after spending 30 minutes optimizing a complex series of linear equations, a single coefficient on one of the equations was changed, the whole process would have to be repeated. LP methods are not dynamic, and minimum perturbation replanning is not possible.

The theory-internal problems will not be expounded upon further here. Various techniques have been created to minimize, or in some cases, eliminate their effects (Chvátal, 1983).

The problem of non-linear inputs is severe. Many real-world problems cannot be expressed by a system of linear equations. Many problems cannot be expressed with mathematical formulas at all. Semantic selectional constraints (see section 4) fit into this category. Figure 32, the example semantic analysis problem solved by Hunter-Gatherer in section 5 is previewed here in Figure 9.

  
Figure 9: Constraint Dependencies in Sample Sentence

The constraints between adquirir and Dr-Andreu, for this simplified example, can be represented as Table 1.

  
Table 1: Semantic constraints

The function, , which maps from word sense combinations into an evaluation score is not linear. In the case of semantic selection restrictions, the scores correspond to a numerical rating of the ability of the one sense to fill a semantic role of the other (see section 4 for more details). The example in Figure 9 is very simplified. Typically, the function table contains two variables, each of which may contain up to 8 possible values (word senses). Furthermore, semantic constraints do not have to be binary. Tables with three or more variables might be used in certain cases. LP methods simply cannot handle these types of input functions.

The Hunter-Gatherer control architecture is based, in part, on decomposing the input problem into simpler subproblems which can be solved separately and then ``synthesized'' together to obtain the answer for the whole problem. As discussed in section 9, this methodology takes advantage of the topology of the input problem, separating ``easier'' sections of the problem from potentially ``harder'' ones. One of the great strengths of HG is its ability to accept solutions to any subproblem from any source. If the particular subproblem has a complexity that is too high for non-heuristic methods such as HG to solve, it can be ``subcontracted'' to other problem solvers. Furthermore, HG can identify these ``hard'' areas ahead of time, enabling a collaborative approach between heuristic and non-heuristic methods. Unfortunately, this problem solving approach does not even make sense for LP. It is possible for LP techniques to be used as one of the subcontractors for a particular subpart of the problem, but LP cannot divide up a problem for itself. Neither can it decide ahead of time whether a given problem can be solved in a given amount of time. Assuming a problem is even of the type that can be solved using LP, these are the greatest drawbacks of the approach: one cannot determine ahead of time if the problem can be solved, nor can LP avail itself of other problem solvers to help with difficult parts of the problem.

The other serious limitation of LP is that it is a static problem solver. The methods solve a given system of equations. Change any of the inputs and the whole problem needs to be solved again. Hunter-Gatherer, on the other hand, is a constraint-based control architecture. Interactions between variables are tracked, and problem decomposition is based on these interactions. If any inputs are changed, the extent of their impact can determined. Minimal perturbation replanning can then be accomplished by re-examining only those parts of the problem that are affected. In fact, future research is aimed at making Hunter-Gatherer functionally equivalent to a blackboard architecture. HG will be able to dynamically add values to the domain of a variable (for semantic analysis this is equivalent to adding new word senses on the fly in order to, for example, process figurative language), add in new constraint links between two variables (for instance, if two words are found to be coreferent, their meanings will be constrained to be equivalent) and/or change constraint valuations between two variables.



next up previous contents
Next: Nonserial Dynamic Programming Up: Hunters and Gatherers Previous: Other Strategies for



Steve Beale
Wed Mar 26 09:27:50 MST 1997