We welcome this opportunity to share our experiences in text generation at the IJCAI-95 Workshop on Multilingual Generation. We are excited about many of the issues on the agenda. In particular, our experience in knowledge-based machine translation (KBMT), especially the DIOGENES Text Generation project, gives us a unique perspective from which to discuss the questions of IL representation. We hope to be able to participate in the discussions on this and related topics.
In this paper, however, we describe our latest experiment with the mechanism of text planning. The first two questions raised in the conference announcement are addressed in particular:
We attempt to demonstrate that, given the complexities of a realistic MLG situation, the system's control mechanism must be able to handle interacting constraints in an efficient manner.
Present text planners can be loosely grouped into three categories: hierarchical, systemic, and opportunistic/distributed. Hierarchical planners take an input semantic structure tree and traverse it, creating text plans (or even surface structures) as they go. If incompatible decisions are detected, some systems backtrack. The benefit of such processing is the ease of writing and processing rules. The drawback is, of course, backtracking. Incompatible (or non-optimal) decisions are not detected until after they are made and thus must be retracted. In practice, unrestrained backtracking results in unacceptable performance.
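The cost of retraction can be illustrated with a minimal Python sketch. It is not taken from any of the systems cited here: the decision names, their options, and the FORBIDDEN combinations are toy assumptions chosen only to force a wrong early choice, and they make no linguistic claim. The planner commits to each decision in order and discovers an incompatibility only after the choice has been made.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical decisions, in the order a top-down traversal would meet them.
DECISIONS: List[Tuple[str, List[str]]] = [
    ("voice", ["passive", "active"]),
    ("subject", ["agent", "patient"]),
    ("agent_phrase", ["none", "by-phrase"]),
]

# Toy constraints (assumptions): combinations that may not co-occur.
FORBIDDEN: List[Dict[str, str]] = [
    {"voice": "passive", "subject": "agent"},
    {"voice": "active", "subject": "patient"},
    {"voice": "passive", "agent_phrase": "none"},
    {"voice": "active", "agent_phrase": "by-phrase"},
    {"subject": "patient", "agent_phrase": "by-phrase"},
]


def compatible(chosen: Dict[str, str]) -> bool:
    """True if no forbidden combination is fully present in `chosen`."""
    return not any(
        all(chosen.get(key) == value for key, value in combo.items())
        for combo in FORBIDDEN
    )


def plan(decisions: List[Tuple[str, List[str]]],
         chosen: Dict[str, str],
         stats: Dict[str, int]) -> Optional[Dict[str, str]]:
    """Chronological backtracking: choose first, check afterwards."""
    if not decisions:
        return dict(chosen)
    (name, options), rest = decisions[0], decisions[1:]
    for value in options:
        chosen[name] = value              # commit to the choice
        if compatible(chosen):
            result = plan(rest, chosen, stats)
            if result is not None:
                return result
        del chosen[name]                  # retract the choice
        stats["retractions"] += 1
    return None


stats = {"retractions": 0}
result = plan(DECISIONS, {}, stats)
print(result, stats)
```

In this toy input the first option tried for the first decision admits no complete plan, so everything built on it is retracted before the alternative is tried. With more decisions and more options, the number of such retractions can grow very quickly.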
Systemic generation [Mann and Matthiessen 83] organizes choices into systems, or categories. Areas such as count, determination, mood, tense, etc., each have their own separate treatment. ``Realizers'' combine the individual decisions, while the grammar is the overall control mechanism that guides the whole process.
Systemic grammars are an excellent means of describing languages for use in text generation. They provide an explicit framework for discovering and recording the different choices associated with each grammatical feature, along with the reasons behind making one choice over another. On the negative side, interactions between systems are not explicitly represented. Realizers are used to combine the results of separate systems, but the systems themselves are kept separate from one another. This may be practical at the level of sentence generation, but planning coherent paragraphs requires a higher degree of interaction.
Opportunistic planners [Nirenburg, et al., 1989] try to take into account expected interactions between rules by planning ahead for them. Before making a decision that ultimately depends on information not yet available, the opportunistic planner will first try to obtain that information. The benefit of this approach is that backtracking is avoided to a large extent. The main drawback is that it is extremely difficult to predict where and when interactions will occur. This makes the job of controlling the text generator extremely complex, which may in turn result in poor performance, although we continue to work in this area.
Our approach in this project is to try to combine the best of the
above approaches while trying to minimize their drawbacks. A
hierarchical control strategy is kept for its simplicity and obvious
applicability to the tree-structured semantic input.
Opportunistic retrieval of information is facilitated by the detailed
knowledge of dependencies. Backtracking is largely eliminated by
pre-analyzing known dependencies. Systemization, although not yet
implemented, will allow us to organize rules into groups, each directed at one
aspect of language, while retaining the ability to let the areas
interact in their decisions. In summary, one could call it an
island-driven, opportunistic, optimized, sound (need-based) recursive
planner. The significance of each of these will be briefly discussed
below, and, if time allows at the workshop, concretely demonstrated
with specific examples.
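
As a rough illustration of how pre-analyzing known dependencies can remove
the need for backtracking, consider the Python sketch below. It is an
assumption-laden toy, not the planner described in this paper: the Rule
class, the rule names, and the features they read and write are all
hypothetical. Each rule declares what it needs and what it provides, and
the rules are ordered before any of them runs, so that no rule decides
before its inputs are available.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Callable, Dict, List

Features = Dict[str, str]


@dataclass
class Rule:
    name: str
    needs: List[str]                  # features read by the rule
    provides: str                     # the one feature the rule decides
    decide: Callable[[Features], str]


def order_rules(rules: List[Rule]) -> List[Rule]:
    """Order rules so that each runs after the rules it depends on."""
    provider = {rule.provides: rule.name for rule in rules}
    # Features that no rule provides are external input and add no ordering.
    graph = {
        rule.name: {provider[f] for f in rule.needs if f in provider}
        for rule in rules
    }
    by_name = {rule.name: rule for rule in rules}
    # static_order() raises graphlib.CycleError on circular dependencies.
    return [by_name[n] for n in TopologicalSorter(graph).static_order()]


def run(rules: List[Rule], given: Features) -> Features:
    """Apply every rule once, in dependency order, with no retraction."""
    features = dict(given)
    for rule in order_rules(rules):
        features[rule.provides] = rule.decide(features)
    return features


# Hypothetical rules, deliberately listed in the "wrong" order.
RULES = [
    Rule("choose_subject", ["voice"], "subject",
         lambda f: "patient" if f["voice"] == "passive" else "agent"),
    Rule("choose_voice", ["focus"], "voice",
         lambda f: "passive" if f["focus"] == "patient" else "active"),
    Rule("choose_focus", ["topic"], "focus",
         lambda f: f["topic"]),
]

features = run(RULES, {"topic": "patient"})
print([rule.name for rule in order_rules(RULES)])
print(features)
```

The sketch covers only statically known dependencies. Interactions that
cannot be predicted in advance, the difficulty noted above for
opportunistic planners, would still need some other treatment.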