The algorithm described below was introduced in (Beale, et al, 1996). We summarized the approach as ``Hunter-Gatherer'' (HG):
1 PROCEDURE SS-HG(Subgraphs)2 FOR each Subgraph in Subgraph
3 PROCESS-SUB-GRAPH(Subgraph)
4 RETURN last value returnd by 3
5 PROCEDURE PROCESS-SUB-GRAPH(Subgraph)
;; assume Subgraph in form
;; (In-Vertices In-Subgraphs Constrained-Vertices)
6 Output-Combos <-- nil
;; STEP 1
7 Combos <--
COMBINE-INPUTS(In-Vertices In-Subgraphs)
8 FOR each Combo in Combos
;; STEP 2
9 IF ARC-CONSISTENT(Combo) THEN
10 Output-Combos <--
Output-Combos + Combo
;; STEP 3
11 REDUCE-COMBOS(Output-Combos Constr-Verts)
12 RECORD-COMBOS(Subgraph Output-Combos)
13 RETURN Output-Combos
A simple algorithm for HG is shown below. It accepts a list of subgraphs, ordered from smallest to largest, so that all input subgraphs are guaranteed to have been processed when needed in PROCESS-SUB-GRAPH. For example, for Figure 5, HG would be input the subgraphs in the order (1,2,3,4,5). Each subgraph is identified by a list of input vertices, a list of input subgraphs, and a list of vertices that are constrained outside the subgraph. Subgraph 1 would have three input-vertices, no input subgraphs, and a single vertex constrained outside the subgraph. Subgraph 4, on the other hand, has subgraphs 2 and 3 as input subgraphs, a single input vertex, and one vertex constrained outside the subgraph.
The COMBINE-INPUTS procedure (line 7) simply produces all combinations of
value assignments for the input vertices and subgraphs. This
procedure has complexity
O(
), where x is the number of
vertices in In-Vertices, a is the maximum
number of values for a vertex, and
is the complexity of the
input subgraph (which was already processed and reduced to a minimum
by a previous call to PROCESS-SUBGRAPH). In the worst case, x
will be n, the number of vertices; this is
the case when the initial subgraph contains all the vertices in the
graph. Of course, this is simply an exhaustive search, not
solution synthesis. In practice, each subgraph usually contains no more
than two or three input vertices. In
short, the complexity of Step 1 is the product of the complexity of
the reduced outputs for the input subgraphs times the number
of exhaustive combinations of input vertices. This complexity
dominates the algorithm and will be what we seek to minimize below in
the discussion of creating the input sub-graphs.
Lines 8-10 will obviously be executed the same number of times as the
complexity for line 7. An arc consistency (line 9) routine similar to
AC-4 (Mohr & Hendersen, 1986) is used. It has complexity
O(
), where e is the number of edges in the graph.
In the worst case, when the graph is a clique, e equals
n! because every vertex affects every other vertex.
Fortunately, SS-HG is aimed at problems of lesser dimensionality.
Cliques are going to have exponential complexity no matter how they
are processed. For us, the only vertices that will have edges outside
the subgraph are those in Constrained-Vertices. Propagation of
constraint failures by the AC-4 algorithm is limited to these
vertices, and, indirectly, beyond them by the degree of
inter-connectedness of the graph.
It should be stressed that this
arc-consistency mechanism is not responsible for the bulk of the
search space pruning, and, for certain types of problems with
``fuzzy'' constraints (such as semantic analysis), it is not even
used. The pruning associated with the branch-and-bound optimization
typically overwhelms any contribution by the constraint satisfaction
techniques. See the Graph Coloring section for data comparing the
efficiency of HG with and without arc-consistency.
The REDUCE-COMBOS procedure in line 11 simply goes through each combination,
and keeps track of the best one for each
combination of values of Constrained-Vertices. If there are two
Constrained-Vertices, each of which has four possible values,
REDUCE-COMBOS should return 16 combinations (unless one or more
are impossible due to constraints), one for each of the possible
combination of values of Constrained-Vertices, each optimized with respect to
every other vertex in the subgraph. The complexity of line 12 is
therefore the same as the complexity of COMBINE-INPUTS. We refer to
this complexity as the ``input complexity'' for the subgraph, whereas
the ``output complexity'' is the number of combinations returned by
REDUCE-COMBOS, which is equal to O(
), where c is the
number of Constrained-Vertices.
With this in mind, we can re-figure the complexity of COMBINE-INPUTS.
The output complexity of a subgraph is O(
), which becomes a
factor in the input complexity of the next higher subgraph. The input
complexity of COMBINE-INPUTS is the product of O(
) times
the complexity of the input subgraphs. Taken together, the complete
input complexity of COMBINE-INPUTS, and thus the overall complexity of
PROCESS-SUBGRAPH, is O(
), where
is the
total number of Constrained-Vertices in all of the input subgraphs.
In simple terms, the exponent is the number of vertices in In-Vertices
plus the total number of Constrained-Vertices in each of the input
subgraphs. To simplify matters later, we will refer to input
complexity as the exponent value
and the output
complexity as the number of vertices in Constrained-Vertices.
The complexity of SS-HG will be dominated by the PROCESS-SUBGRAPH procedure call with the highest input complexity. Thus, when creating the progression of subgraphs (see below) input to SS-HG, we will seek to minimize the highest input complexity, which we will refer to as Maximum-Input-Complexity. As will be shown, some fairly simple heuristics can simplify this subgraph creation process.
Space complexity is actually more limiting than time complexity for solution synthesis approaches. Although one can theoretically wait ten years for an answer, no answer is possible if the computer runs out of memory. The space complexity of SS-HG is dominated by the need to store results for subgraphs that will be used as inputs later. This need is minimized by carefully deleting all information once it is no longer needed; however, the storage requirements can still be quite high. The significant measurement in this area is, again, the maximum input complexity, because it determines the amount of combinations stored in line 7 of the algorithm. Again, it must be noted that this approach attempts to directly minimize the space complexity. Previous solution synthesis methods only reduced space (and time) complexity as an accidental by-product of the constraint satisfaction process. As will be shown below, their space complexity becomes unmanageable even for relatively small problems.