A simple algorithm for HG is shown below. It accepts a list of subgraphs, ordered from smallest to largest, so that all input subgraphs are guaranteed to have been processed when needed in PROCESS-SUBGRAPH. For example, for Figure 15, HG would be input the subgraphs in the order (1,2,3,4,5). Each subgraph is identified by a list of input vertices, a list of input subgraphs, and a list of vertices that are constrained outside the subgraph. Subgraph 1 would have three input-vertices, no input subgraphs, and a single vertex constrained outside the subgraph. Subgraph 4, on the other hand, has subgraphs 2 and 3 as input subgraphs, a single input vertex, and one vertex constrained outside the subgraph. SS-HG simply calls PROCESS-SUBGRAPH for each of the input subgraphs. The last input subgraph is the graph containing the whole problem. The answer returned by PROCESS-SUBGRAPH for that input will be the answer to the whole problem.
The COMBINE-INPUTS procedure (line 7) simply produces all combinations of
value assignments for the input vertices and subgraphs. The following
three cases are possible
:
The COMBINE-INPUTS procedure has complexity
O(
), where x is the number of
vertices in In-Vertices, a is the maximum
number of values in the domain of a vertex, and
is the
number of combinations in the lower level
subgraph (which was already processed and reduced to a minimum
by a previous call to PROCESS-SUBGRAPH).
In the worst case, x
will be n, the number of vertices; this is
the case when the initial subgraph contains all the vertices and no
subgraphs. Of course, this is simply an exhaustive search, not
Solution Synthesis. In practice, In-Vertices contains no more
than two or three vertices. In
short, the complexity of Step 1 is the product of the complexity of
the reduced outputs for the input subgraphs times the number
of exhaustive combinations of input vertices. This complexity
dominates the algorithm and will be what we seek to minimize below in
the discussion of creating the input subgraphs.
Lines 8-10 will obviously be executed the same number of times as the
complexity for line 7. An arc consistency (line 9) routine similar to
AC-4 is used. It has complexity
O(
), where e is the number of edges in the graph.
In the worst case, when the graph is a clique, e equals
n! because every vertex affects every other vertex.
Fortunately, SS-HG is aimed at problems of lesser dimensionality.
Cliques are going to have exponential complexity no matter how they
are processed. For us, the only vertices that will have edges outside
the subgraph are those in Constrained-Vertices. Propagation of
constraint failures by the AC-4 algorithm is limited to these
vertices, and, indirectly, beyond them by the degree of
inter-connectedness of the graph. It should be stressed that this
arc-consistency mechanism is not responsible for the bulk of the
search space pruning, and, for certain types of problems with
``fuzzy'' constraints (such as semantic analysis), it is not even
used. The pruning associated with the branch-and-bound optimization
typically overwhelms any contribution by the constraint satisfaction
techniques. See section 8.1 for data comparing the efficiency of HG with
and without arc-consistency.
The REDUCE-COMBOS procedure in line 11 simply goes through each Combos
that passed arc consistency, and keeps track of the best one for each
combination of values of Constrained-Vertices. The ``best'' is
defined as the combination with the highest overall score. We discuss
scoring, and how scores are tracked, in sections 4 and 5. If there are two
Constrained-Vertices, each of which has four possible values,
REDUCE-COMBOS should return 16 combinations (unless one or more combinations
are impossible due to constraints), one for each of the possible
combination of values of Constrained-Vertices, each optimized with respect to
every other vertex in the subgraph. This process is pictured in
Figure 17.
The complexity of line 11 is
therefore the same as the complexity of COMBINE-INPUTS. We refer to
this complexity as the ``input complexity'' for the subgraph, whereas
the ``output complexity'' is the number of combinations returned by
REDUCE-COMBOS, which is equal to O(
), where c is the
number of vertices in Constrained-Vertices.
With this in mind, we can refigure the complexity of COMBINE-INPUTS.
The output complexity of a subgraph is O(
), which becomes a
factor in the input complexity of the next higher subgraph. The input
complexity of COMBINE-INPUTS is the product of O(
) times
the complexity of the input lower-level subgraphs. Taken together, the complete
input complexity of COMBINE-INPUTS, and thus the overall complexity of
PROCESS-SUBGRAPH, is O(
), where
is the
total number of Constrained-Vertices in all of the input lower-level subgraphs.
In simple terms, the exponent is the number of vertices in In-Vertices
plus the total number of Constrained-Vertices in each of the input
subgraphs. To simplify matters later, we will refer to input
complexity as the exponent value
and the output
complexity as the number of vertices in Constrained-Vertices.
The complexity of SS-HG will be dominated by the PROCESS-SUBGRAPH
procedure call with the highest input complexity. Thus, when creating
the progression of subgraphs (details below) input to SS-HG, we will
seek to minimize the highest input complexity, which we will refer to
as Maximum-Input-Complexity. To be precise,
Maximum-Input-Complexity =
for the subgraph with
highest input complexity. As will be shown, some
fairly simple heuristics can simplify this subgraph creation process.
For completion, it should be noted that PROCESS-SUBGRAPH is called X
times in SS-HG, where X is the number of subgraphs in Subgraphs.
A Subgraph with input complexity less than Maximum-Input-Complexity
will run in time negligible compared to those with
Maximum-Input-Complexity. Thus, Number-At-Maximum-Input-Complexity is
an important measure, and the overall complexity of SS-HG will be:
O(Number-At-Maximum-Input-Complexity *
)
Theoretically, one could create
Subgraphs in such a way that there would be an exponential number of
subgraphs. However, our algorithm below guarantees that there will be n
or less subgraphs. Typically, only a portion of these have
Maximum-Input-Complexity, thus the overall complexity is less than
O(n *
). We will demonstrate
below that, for problems in computational semantics,
is ``nearly'' a constant, and thus the
complexity of HG will be linear with respect to the number of inputs, n.
Space complexity is actually more limiting than time complexity for
solution synthesis approaches. Although one can theoretically wait
ten years for an answer, no answer is possible if the computer runs out
of memory. The space complexity of SS-HG is dominated by the need to
store results for subgraphs that will be used as inputs later. This
need is minimized by carefully deleting all information once it is no
longer needed; however, the storage requirements can still be quite
high. The significant measurement in this area is the maximum output
complexity, because it determines the amount of output combinations
stored by the subgraph with the highest output complexity. The
highest output complexity is bounded by the highest input complexity,
and thus is proportional to O(n *
).
Again, it must be noted that this approach attempts to directly
minimize the space complexity. Previous solution synthesis methods
only reduced space (and time) complexity as an accidental by-product
of the constraint satisfaction process. As will be shown below, their
space complexity becomes unmanageable even for relatively small
problems.
Please refer to section 5 for a detailed example of applying the HG algorithm to a semantic analysis problem.