In the knowledge-based approach to machine translation, meanings of
source language (e.g., Spanish) texts are represented internally in a
language-neutral interlingua (e.g., Nirenburg, 1989).
The interlingual meaning representation (which we call a text meaning
representation, or TMR) is derived from representations of word
meanings in computational lexicons and from representations of world
knowledge in ontologies (and possibly episodic knowledge bases). Once
derived, the interlingual meaning representation is input to a
language generator that produces the translation in the target
language (e.g., English). A key issue in the design and development of
KBMT systems is the set of symbols used to represent interlingual
meaning as well as the structure of the meaning representation. In our
methodology, the set of symbols and possible relationships among
them are grounded in a language-independent knowledge source called
the ontology. The symbols are defined as concepts in the
ontology. The same set of concepts is used to represent word meanings
in lexicons. The internal structure of these concepts is reflected in
both lexical and text meaning representations. TMRs are essentially
instantiations of ontological concepts connected together according to
constraints derived both from the ontology and elsewhere.
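As a minimal sketch of the idea that a TMR consists of concept instantiations connected under ontological constraints, consider the following Python fragment. The concept names (`INGEST`, `HUMAN`, `FOOD`), the slot names, and the `TMR` and `Instance` classes are all hypothetical and chosen only for illustration. The check uses exact concept matching for brevity, where a real system would presumably use subsumption and would also draw on constraints from outside the ontology.

```python
from dataclasses import dataclass, field

# Hypothetical toy ontology: each concept maps its slots to the concept
# required of the slot filler. All names here are illustrative assumptions.
ONTOLOGY = {
    "HUMAN": {},
    "FOOD": {},
    "INGEST": {"AGENT": "HUMAN", "THEME": "FOOD"},
}


@dataclass
class Instance:
    """One instantiation of an ontological concept inside a TMR."""
    concept: str
    index: int
    slots: dict = field(default_factory=dict)

    @property
    def name(self) -> str:
        return f"{self.concept}-{self.index}"


class TMR:
    """A text meaning representation: a set of linked concept instances."""

    def __init__(self, ontology):
        self.ontology = ontology
        self.instances = []
        self._counts = {}

    def instantiate(self, concept):
        # Only symbols defined in the ontology may appear in a TMR.
        if concept not in self.ontology:
            raise KeyError(f"{concept} is not defined in the ontology")
        self._counts[concept] = self._counts.get(concept, 0) + 1
        inst = Instance(concept, self._counts[concept])
        self.instances.append(inst)
        return inst

    def connect(self, head, slot, filler):
        # Simplification: exact concept match instead of subsumption.
        constraints = self.ontology[head.concept]
        if slot not in constraints:
            raise ValueError(f"{head.concept} has no slot {slot}")
        if filler.concept != constraints[slot]:
            raise ValueError(f"{slot} of {head.concept} requires {constraints[slot]}")
        head.slots[slot] = filler


tmr = TMR(ONTOLOGY)
eat = tmr.instantiate("INGEST")
person = tmr.instantiate("HUMAN")
meal = tmr.instantiate("FOOD")
tmr.connect(eat, "AGENT", person)
tmr.connect(eat, "THEME", meal)
```

Attempting `tmr.connect(eat, "AGENT", meal)` raises a `ValueError` in this sketch, because the toy ontology requires the `AGENT` of `INGEST` to be a `HUMAN`.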
A typical dictionary definition of an ontology is ``The branch of metaphysics that studies the nature of existence.'' For us, an ontology is a computational entity, a resource containing knowledge about what ``concepts'' exist in the world and how they relate to one another. A concept is a primitive symbol for meaning representation with well-defined attributes and relationships with other concepts. An ontology is a network of such concepts forming a symbol system where there are no uninterpreted symbols (except for numbers and a small number of known literals).
An ontology for NLP purposes is a body of knowledge about the world (or a domain) that a) is a repository of primitive symbols used in meaning representation; b) organizes these symbols in a tangled subsumption hierarchy; and c) further interconnects these symbols using a rich system of semantic relations defined among the concepts. In order for such an ontology to become a computational resource for solving problems such as ambiguity and reference resolution, it must be actually constructed, not merely defined formally. The ontology must be put into well-defined relationships with other knowledge sources in the system. In an NLP application, the ontology supplies world knowledge to lexical, syntactic, semantic, and pragmatic processes, and other microtheories.
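The three properties above (a repository of symbols, a tangled subsumption hierarchy, and further semantic relations) can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not the Mikrokosmos implementation: the `Ontology` and `Concept` classes, the concept names, and the `LOCATION` relation are all hypothetical. "Tangled" is modeled by allowing a concept to have more than one parent.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    name: str
    parents: list = field(default_factory=list)    # IS-A links; several parents allowed
    relations: dict = field(default_factory=dict)  # relation name -> target concept name


class Ontology:
    def __init__(self):
        self.concepts = {}

    def add(self, name, parents=(), relations=None):
        # Parents must already exist, so the hierarchy stays acyclic.
        for parent in parents:
            if parent not in self.concepts:
                raise KeyError(f"unknown parent concept: {parent}")
        self.concepts[name] = Concept(name, list(parents), dict(relations or {}))

    def ancestors(self, name):
        """All concepts reachable by following IS-A links upward."""
        seen = set()
        stack = list(self.concepts[name].parents)
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.concepts[current].parents)
        return seen

    def subsumes(self, general, specific):
        return general == specific or general in self.ancestors(specific)

    def uninterpreted(self):
        """Relation targets that are not themselves defined concepts."""
        return {
            target
            for concept in self.concepts.values()
            for target in concept.relations.values()
            if target not in self.concepts
        }


# Hypothetical fragment; the concept names are assumptions for illustration.
onto = Ontology()
onto.add("ALL")
onto.add("OBJECT", ["ALL"])
onto.add("EVENT", ["ALL"])
onto.add("PLACE", ["OBJECT"])
onto.add("ORGANIZATION", ["OBJECT"])
onto.add("BUILDING", ["PLACE"])
# Two parents make this part of the hierarchy tangled.
onto.add("BANK", ["ORGANIZATION", "BUILDING"], {"LOCATION": "PLACE"})
```

In this fragment, `onto.subsumes("PLACE", "BANK")` and `onto.subsumes("ORGANIZATION", "BANK")` both hold, while `onto.subsumes("EVENT", "BANK")` does not. The `uninterpreted()` check returns an empty set, reflecting the requirement that every symbol used in a relation is itself a defined concept.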
An ontology is a database with information about
In interlingual machine translation, the principal reasons for using an ontology are (Figure 1):
In addition, ontologies are also used in KBMT
In addition, the same ontology can be of great value in a variety of
other tasks including database merging (Dowell, Stephens, and Bonnell,
1995; Van Baalen and Looby, 1995), integration of software or business
enterprise models (Fillion, Menzel, Mayer, and Blinn, 1995), and so
on. Essentially, an ontology such as the
Mikrokosmos ontology is invaluable
wherever a ``semantic wall'' is to be scaled, be it to translate
between a pair of natural languages, a pair of database schemas, or to
integrate different models of the same domain or similar phenomena in the
world. We provide specific illustrations of the uses of the ontology
in NLP and MT below, but refer the reader to other literature for
examples of broader uses of ontologies (IJCAI Ontology Workshop,
1995).
In the Mikrokosmos project, we have developed an ontology
covering a wide range of categories in the world. Several illustrative
concepts from the
Mikrokosmos ontology will be shown below. The above uses of
the ontology in machine translation will also be illustrated through
examples. See Mahesh and Nirenburg (1995b) and Beale, Nirenburg, and
Mahesh (1995) for more detailed examples.