next up previous
Next: PICARD and Semantic Up: Intelligent Planning Meets Intelligent Previous: The Issue of

An Overview of PICARD

Semantic mismatch is just one of the complicating factors in Natural Language Generation. Add in other aspects of text planning such as coreference and ellipsis generation, sentence and text structuring, along with the basic decisions of lexical choice, and the process can become quite expensive computationally. When using large scale lexicons with complex text, we have found that the number of possible plan combinations can soar into the billions. Add to that interactions between the intelligent planner and the NLG component, as we argue for, and the complexity rises even more. Clearly, in the coming age of increased lexicon coverage, phenomena coverage and component interaction, computational methods must be developed to combat exponential explosion.

The PICARD Natural Language Generation system (Beale & Nirenburg, 1996) was developed with this need in mind. Before discussing the application of PICARD to context dependent multilingual planning, we give a brief introduction to the concepts that enable it to process Natural Language in near linear-time.

Beale (1996) introduced a new control strategy for computational semantic processing called HUNTER GATHERER. The HUNTER GATHERER methodology uses knowledge of constraint dependencies to identify sub-problems which can be processed independently. Solution synthesis methods are then utilized to combine (gather) solutions to these sub-problems, or circles, into larger and larger solutions until the entire sentence is analyzed. As solutions for each circle are created, branch-and-bound and constraint satisfaction techniques are used to prune away (hunt down) non-optimal solutions.

HUNTER GATHERER is a general control strategy that works particularly well for NL problems. Central to our application of this methodology to computational semantics, both in analysis and generation, is the hypothesis that such problems can almost always be viewed as bundles of tightly constrained sub-problems, each of which combine at higher, relatively constraint-free levels to produce a complete solution. Constraint dependency information, retrieved from the semantic co-occurrence information stored in the lexicon, which in turn exploits syntactic information, can be used to partition the complex problem into simpler sub-problems, or ``circles of dependency.''

In addition to solution synthesis techniques, HUNTER GATHERER employs branch-and-bound and constraint satisfaction methods to prune away non-optimal or impossible solutions from the search space. Beale, Nirenburg & Mahesh (1996) report ``near'' linear time processing for semantic analysis, with exhaustive search spaces in the trillions reduced to hundreds. The PICARD system extends the HUNTER GATHERER analysis capabilities to the field of text planning by using constraint circles to identify localized means-ends plan combinations. Constraint satisfaction techniques can then be used to ensure that only consistent plans are used together, while the other mechanisms in HUNTER GATHERER, solution synthesis and branch-and-bound, efficiently find the optimal solution.

The HUNTER GATHERER control architecture has been used extensively in the Mikrokosmos semantic analyzer. The following table shows actual results of semantic analyses of various size problems (from sentences selected randomly from our corpus):

            Sent A   Sent B    Sent C

# plans     79       95        119
exhaustive  7.8 Mil  56.7 Mil  235 Bil
HUNT-GATHER 179      254       327

It is interesting to note that a 20% increase in the number of total plansgif (79 to 95) results in a 626% increase (7.8M to 56M) in the number of exhaustive combinations possible, but only a 42% increase (179 to 254) in the number of combinations considered by HUNTER GATHERER. As one moves on to even more complex problems, a 25% increase (95 to 119) in the number of plans catapults the exhaustive complexity by 414,600% (56M to 235B) and yet only increases the HUNTER GATHERER complexity by 29% (254 to 327). As the problem size increases, the minor effects of non-local interactions diminish with respect to the size of the problem. We expect, therefore, the behavior of this algorithm to move even closer to linear with larger problems (for example, ones involving discourse).

Generation in the Mikrokosmos project is a relatively new development. Currently we are developing methods to reverse multilingual analysis lexicons (see Viegas and Beale, 1996). PICARD has been used to back-translate the semantic analyses of the Mikrokosmos analyzer using these reversed lexicons. Efficiency results similar to those reported above for HUNTER GATHERER were obtained.



next up previous
Next: PICARD and Semantic Up: Intelligent Planning Meets Intelligent Previous: The Issue of



Steve Beale
Tue Oct 1 10:59:09 MDT 1996