next up previous contents
Next: Subgraph Construction Up: The Hunter-Gatherer Control Previous: The Hunter-Gatherer Control

The Hunter-Gatherer Algorithm

A simple algorithm for HG is shown below. It accepts a list of subgraphs, ordered from smallest to largest, so that all input subgraphs are guaranteed to have been processed when needed in PROCESS-SUBGRAPH. For example, for Figure 15, HG would be input the subgraphs in the order (1,2,3,4,5). Each subgraph is identified by a list of input vertices, a list of input subgraphs, and a list of vertices that are constrained outside the subgraph. Subgraph 1 would have three input-vertices, no input subgraphs, and a single vertex constrained outside the subgraph. Subgraph 4, on the other hand, has subgraphs 2 and 3 as input subgraphs, a single input vertex, and one vertex constrained outside the subgraph. SS-HG simply calls PROCESS-SUBGRAPH for each of the input subgraphs. The last input subgraph is the graph containing the whole problem. The answer returned by PROCESS-SUBGRAPH for that input will be the answer to the whole problem.

  
Figure 16: HG Algorithm

The COMBINE-INPUTS procedure (line 7) simply produces all combinations of value assignments for the input vertices and subgraphs. The following three cases are possiblegif:

  1. The input subgraph contains no lower-level input-subgraphs. In this case, COMBINE-INPUTS returns all of the possible combinations of values for each of the In-Vertices. For instance, if two variables, A and B, each had domains 0,1, COMBINE-INPUTS would return (0,0),(0,1),(1,0),(1,1), where each combination is in the form (A,B).
  2. The input subgraph contains one lower-level subgraph and one or more In-Vertices. This, in practice, is the most common way to extend a subgraph - simply by adding on extra vertices. COMBINE-INPUTS adds on the possible combination of In-Vertices to the combinations present in the lower-level subgraph. It must be stressed that this lower-level subgraph has already been optimized by a previous call to PROCESS-SUBGRAPH. For example, if the lower-level subgraph had two variables, C and D, each with the domain 2,3, and had been reduced to the following combinations: (2,2), (2,3), where each combination is in the form (C,D), and two In-Vertices, A and B (as above) were being added, then COMBINE-INPUTS would return the following combinations: (0,0,2,2),(0,0,2,3),(0,1,2,2),(0,1,2,3),(1,0,2,2),(1,0,2,3), (1,1,2,2),(1,1,2,3), where each combination is in the form (A,B,C,D).
  3. There are two lower-level subgraphs and no In-Vertices. For this case, there are two separate sub-cases:
    1. The two lower level subgraphs do not have any vertices in common. In this case, COMBINE-INPUTS simply produces all of the combinations. If the first lower level subgraph had the two variables C and D as shown above, and the second lower level subgraph had two variables E and F, with domains 4,5, and had been reduced to the following combinations: (4,5),(5,5), then COMBINE-INPUTS would return the following combinations: (2,2,4,5),(2,2,5,5),(2,3,4,5),(2,3,5,5), where each combination is in the form (C,D,E,F).
    2. The two lower level subgraphs share one or more vertices. This is where the synthesis techniques come in. In this case, only compatible combinations are combined. For instance, if the first lower level subgraph had the variables C and D as shown above, and the second lower level subgraph had the variables C and G, each with the domain 2,3, and had been reduced to (2,2),(3,2), with each combination in the form (C,G), then COMBINE-INPUTS would return only the following two combinations: (2,2,2),(2,3,2), where each combination is of the form (C,D,G). Notice that the output combinations only contain three variables. Also notice that none of the combination (3,2) of the second lower level subgraph, where C is assigned to 3, were used, since none of the input combinations of the first lower level subgraph used that assignment.

The COMBINE-INPUTS procedure has complexity O(), where x is the number of vertices in In-Vertices, a is the maximum number of values in the domain of a vertex, and is the number of combinations in the lower level subgraph (which was already processed and reduced to a minimum by a previous call to PROCESS-SUBGRAPH). In the worst case, x will be n, the number of vertices; this is the case when the initial subgraph contains all the vertices and no subgraphs. Of course, this is simply an exhaustive search, not Solution Synthesis. In practice, In-Vertices contains no more than two or three vertices. In short, the complexity of Step 1 is the product of the complexity of the reduced outputs for the input subgraphs times the number of exhaustive combinations of input vertices. This complexity dominates the algorithm and will be what we seek to minimize below in the discussion of creating the input subgraphs.

Lines 8-10 will obviously be executed the same number of times as the complexity for line 7. An arc consistency (line 9) routine similar to AC-4 is used. It has complexity O(), where e is the number of edges in the graph. In the worst case, when the graph is a clique, e equals n! because every vertex affects every other vertex. Fortunately, SS-HG is aimed at problems of lesser dimensionality. Cliques are going to have exponential complexity no matter how they are processed. For us, the only vertices that will have edges outside the subgraph are those in Constrained-Vertices. Propagation of constraint failures by the AC-4 algorithm is limited to these vertices, and, indirectly, beyond them by the degree of inter-connectedness of the graph. It should be stressed that this arc-consistency mechanism is not responsible for the bulk of the search space pruning, and, for certain types of problems with ``fuzzy'' constraints (such as semantic analysis), it is not even used. The pruning associated with the branch-and-bound optimization typically overwhelms any contribution by the constraint satisfaction techniques. See section 8.1 for data comparing the efficiency of HG with and without arc-consistency.

The REDUCE-COMBOS procedure in line 11 simply goes through each Combos that passed arc consistency, and keeps track of the best one for each combination of values of Constrained-Vertices. The ``best'' is defined as the combination with the highest overall score. We discuss scoring, and how scores are tracked, in sections 4 and 5. If there are two Constrained-Vertices, each of which has four possible values, REDUCE-COMBOS should return 16 combinations (unless one or more combinations are impossible due to constraints), one for each of the possible combination of values of Constrained-Vertices, each optimized with respect to every other vertex in the subgraph. This process is pictured in Figure 17. The complexity of line 11 is therefore the same as the complexity of COMBINE-INPUTS. We refer to this complexity as the ``input complexity'' for the subgraph, whereas the ``output complexity'' is the number of combinations returned by REDUCE-COMBOS, which is equal to O(), where c is the number of vertices in Constrained-Vertices.

  
Figure 17: REDUCE-COMBOS

With this in mind, we can refigure the complexity of COMBINE-INPUTS. The output complexity of a subgraph is O(), which becomes a factor in the input complexity of the next higher subgraph. The input complexity of COMBINE-INPUTS is the product of O() times the complexity of the input lower-level subgraphs. Taken together, the complete input complexity of COMBINE-INPUTS, and thus the overall complexity of PROCESS-SUBGRAPH, is O(), where is the total number of Constrained-Vertices in all of the input lower-level subgraphs. In simple terms, the exponent is the number of vertices in In-Vertices plus the total number of Constrained-Vertices in each of the input subgraphs. To simplify matters later, we will refer to input complexity as the exponent value and the output complexity as the number of vertices in Constrained-Vertices.

The complexity of SS-HG will be dominated by the PROCESS-SUBGRAPH procedure call with the highest input complexity. Thus, when creating the progression of subgraphs (details below) input to SS-HG, we will seek to minimize the highest input complexity, which we will refer to as Maximum-Input-Complexity. To be precise, Maximum-Input-Complexity = for the subgraph with highest input complexity. As will be shown, some fairly simple heuristics can simplify this subgraph creation process.

For completion, it should be noted that PROCESS-SUBGRAPH is called X times in SS-HG, where X is the number of subgraphs in Subgraphs. A Subgraph with input complexity less than Maximum-Input-Complexity will run in time negligible compared to those with Maximum-Input-Complexity. Thus, Number-At-Maximum-Input-Complexity is an important measure, and the overall complexity of SS-HG will be:
O(Number-At-Maximum-Input-Complexity * )
Theoretically, one could create Subgraphs in such a way that there would be an exponential number of subgraphs. However, our algorithm below guarantees that there will be n or less subgraphs. Typically, only a portion of these have Maximum-Input-Complexity, thus the overall complexity is less than
O(n * ). We will demonstrate below that, for problems in computational semantics, is ``nearly'' a constant, and thus the complexity of HG will be linear with respect to the number of inputs, n.

Space complexity is actually more limiting than time complexity for solution synthesis approaches. Although one can theoretically wait ten years for an answer, no answer is possible if the computer runs out of memory. The space complexity of SS-HG is dominated by the need to store results for subgraphs that will be used as inputs later. This need is minimized by carefully deleting all information once it is no longer needed; however, the storage requirements can still be quite high. The significant measurement in this area is the maximum output complexity, because it determines the amount of output combinations stored by the subgraph with the highest output complexity. The highest output complexity is bounded by the highest input complexity, and thus is proportional to O(n * ). Again, it must be noted that this approach attempts to directly minimize the space complexity. Previous solution synthesis methods only reduced space (and time) complexity as an accidental by-product of the constraint satisfaction process. As will be shown below, their space complexity becomes unmanageable even for relatively small problems.

Please refer to section 5 for a detailed example of applying the HG algorithm to a semantic analysis problem.



next up previous contents
Next: Subgraph Construction Up: The Hunter-Gatherer Control Previous: The Hunter-Gatherer Control



Steve Beale
Wed Mar 26 09:27:50 MST 1997