Subgraph decomposition is a field in itself (see, for instance, (Bosák, 1990)), and we make no claims regarding the optimality of our approach. In fact, further research in this area can potentially improve the results reported below by an order of magnitude or more. The novelty of our approach is that we use the complexity of the HG algorithm as the driving force behind the decomposition technique. The overall complexity of HG is bounded by the maximum input complexity, as defined above. We attempt to minimize this feature in the algorithm given below:
To find a list of potential actions to take on a subgraph:
Figure 18 illustrates the process of subgraph creation. In step 1 of the algorithm, ``seed'' subgraphs are formed in various regions of the graph. These seeds are simply the smallest subgraphs in a region that cover at least one vertex. Seeds can be calculated in time O(n).
From there, we try to expand subgraphs until one of them contains the entire input graph. First, possible expansions for each seed are calculated, and for each seed, the one that requires the minimum input complexity is kept. The seed with the lowest input complexity is then allowed to expand. A type of ``iterative deepening'' approach is then used. The last created subgraph is allowed to expand as long as it can do so without increasing the maximum input complexity required so far. Once the current subgraph can no longer expand, new actions are calculated for any of the subgraphs that might have been affected by the expansion of previous subgraphs. These actions are then sorted with the remaining actions and the best one is taken. The process then repeats.
Several measures are taken to reduce complexity. First, once a subgraph is expanded, it can no longer be used in other subgraph expansions. This keeps the number of subgraphs under consideration to a minimum. Second, we only recalculate actions for subgraphs directly affected by previous expansions. If subgraph expansion has been occurring in one portion of the graph, only subgraphs in that portion need to have actions recalculated. Finally, as stated above, we allow the subgraph created last to expand as long as it can do so without increasing the maximum input complexity. In practice, we find that, after some subgraph combinations in the early stages, expansion occurs almost entirely by adding individual vertices onto existing subgraphs. This occurs because the input complexity of combining subgraphs is usually high. Because expansion generally occurs in this manner, it makes sense to allow the last created subgraph to continue expanding, if possible.
One major benefit of the overall approach used by HG is that, by examining the factors which influence the complexity of the main search, we can seek to minimize those factors in a secondary search. This secondary search can be carried out heuristically because the optimal answer, although beneficial, is not required, because it simply partitions the primary search space. Optimal answers to the primary search are guaranteed even if the search space was partitioned such that the actual search was not quite as efficient as it could be.
We also will demonstrate below that for any given problem, this subgraph creation process might be simplified. Problems in computational semantics, for instance, typically present as tangled trees (limited constraints between siblings and cousins superimposed on a basic tree structure). We can take advantage of this to create very simple algorithms that can partition the graph, sometimes better than this more complex version.
This section has been kept brief intentionally. Subgraph decomposition is not the focus of this research, although improvements can dramatically reduce the complexity of HG. In practice, the simple algorithm presented above produces excellent results; in all but the two smaller problems the optimal maximum input complexity was found in time negligible compared to the actual HG processing. Other approaches might be able to deliver better overall performance, but these results were deemed acceptable for now.
After a brief digression to provide the necessary background, these principles of subgraph creation, along with the Hunter-Gatherer algorithm, will be exemplified using problems from semantic analysis.