
Ontology-Based Modularization

  
Figure 1: Input Semantic Representation to the Mikrokosmos Generator

In contrast to modularization by task (discourse structuring, clause structuring, lexical choice and so on), the Mikrokosmos project (http://crl.nmsu.edu/Research/Projects/mikro/index.html) modularizes on the ontological and linguistic data that serve as input to the text generation process. That is, the modules are based on the types of input we expect, not on the types of processing we need to perform.

The generation lexicon in our approach is essentially the same as the analysis lexicon, but with a different indexing scheme: it is indexed on ontological concepts rather than on NL lexical units, as it is in analysis. ([Stede 1996] is an example of another generator with a comparable lexicon structure, although our lexicon is richer; it includes collocational constraints, for example.) The generation lexicon contains information, such as semantics-to-syntax dependency mappings, that drives the generation process, with the help of several dedicated microtheories that deal with issues such as focus and reference (values of which are among the elements of our input representations).
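The re-indexing idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `LexEntry` fields, the concept names and the string standing in for a semantics-to-syntax mapping are hypothetical and do not reproduce the actual Mikrokosmos lexicon format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LexEntry:
    # Hypothetical, heavily simplified entry; not the Mikrokosmos format.
    word: str       # NL lexical unit
    concept: str    # ontological concept the word maps to
    syn_frame: str  # stand-in for a semantics-to-syntax dependency mapping


def index_for_analysis(entries):
    """Index the entries by lexical unit, as an analyzer would look them up."""
    index = defaultdict(list)
    for entry in entries:
        index[entry.word].append(entry)
    return dict(index)


def index_for_generation(entries):
    """Index the same entries by ontological concept, for a generator."""
    index = defaultdict(list)
    for entry in entries:
        index[entry.concept].append(entry)
    return dict(index)


# Illustrative data only.
ENTRIES = [
    LexEntry("buy", "BUY", "subj=agent obj=theme"),
    LexEntry("purchase", "BUY", "subj=agent obj=theme"),
    LexEntry("acquire", "ACQUIRE", "subj=agent obj=theme"),
]

analysis = index_for_analysis(ENTRIES)
generation = index_for_generation(ENTRIES)

# A generator starting from the concept BUY sees every word that can realize it.
candidates = [entry.word for entry in generation["BUY"]]
```

The point of the sketch is that both indexes are built from one set of entries, so knowledge acquired for analysis is available to generation without being re-entered.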

Lexicon entries in both analysis and generation can be thought of as "objects" or "modules", one for each unit in the input. Such a module has the task of realizing the associated unit, communicating with the other objects around it if necessary (similar to [De Smedt 1990]).
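As a rough sketch of this object view, the following shows an event module that realizes its own unit and asks the modules filling its roles for their realizations. The class names, the role-to-slot mapping and the dictionary used as a clause structure are all hypothetical; the control mechanism of the actual generator is not modeled here.

```python
class ObjectModule:
    """Hypothetical module for an object unit; it realizes itself as a phrase."""

    def __init__(self, word):
        self.word = word

    def realize(self, role):
        # Report the chosen head word together with the role being filled.
        return {"role": role, "head": self.word}


class EventModule:
    """Hypothetical module for an event unit; it sets up a clause structure."""

    def __init__(self, verb, role_to_slot):
        self.verb = verb                  # lexical choice for the event
        self.role_to_slot = role_to_slot  # semantic role -> syntactic slot

    def realize(self, fillers):
        clause = {"head": self.verb}
        for role, slot in self.role_to_slot.items():
            # Communicate with the neighboring module that fills this role.
            clause[slot] = fillers[role].realize(role)["head"]
        return clause


# Illustrative input only.
buy = EventModule("buy", {"agent": "subject", "theme": "object"})
clause = buy.realize({
    "agent": ObjectModule("John"),
    "theme": ObjectModule("a car"),
})
```

Here a single module both chooses a lexical item and lays out the clause, which is the sense in which one module can take part in several generation tasks.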

Each module can be involved in carrying out several tasks of the kind listed by Wanner and Hovy. For instance, modules for specific events or properties are used in setting up clause and sentence structures as well as in lexical choice, as will be shown below. Interactions and constraints flow freely, with the control mechanism dynamically tracking the connections. One outcome of this division of labor between declarative data and the control architecture is that the bulk of knowledge processing resides in the lexicon, which is indexed for both analysis and generation. This has greatly simplified knowledge acquisition in general [Nirenburg et al. 1996] and has made it easier to adapt analysis knowledge sources to generation [Viegas and Beale 1996], as well as to convert knowledge sources acquired for one language for use with texts in another.

Below we sketch out how this organization works. We begin by describing the main types of lexicon entries with the goal of demonstrating how each performs various generation tasks. We then take a look at the different types of constraints associated with each kind of entry.








Steve Beale
Tue Feb 10 13:17:54 MST 1998