Returning to the example described earlier, the first step in
processing the input sentence is to recognize word boundaries and
analyze the sentence morphologically and syntactically. This is done
in K using the Panglyzer Spanish analyzer developed in the Pangloss
project (Pangloss, 1994). The output of this analysis is a syntactic
structure for this fairly complex sentence. Panglyzer retrieves
entries from a Spanish lexicon for the words in the sentence and uses
syntactic information therein to build the syntactic structure of the
sentence.
To produce the meaning representation from the syntactic structure,
K uses both semantic knowledge represented in the
Spanish lexicon and world knowledge represented in a
language-independent ontology. The lexicon represents meanings of
words by mapping them to concepts in the ontology.
In addition, it specifies syntax-semantics mappings by binding
syntactic arguments to fillers of semantic roles in the slots of the
ontological concept. A text meaning representation (TMR) is the
result of instantiating concepts from the ontology that correspond to
the chosen senses of words in a text and linking them together
according to the constraints in the concepts as well as the
syntax-semantics mappings represented in the lexicon entries.
Skeletal TMRs thus constructed are further enhanced by various
microtheories, which are specialized experts that each carry a
different type of knowledge of the language, such as microtheories of
space, time, aspect, speaker attitudes, and so on.
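The instantiation step described above can be illustrated with a minimal sketch. It is not the K implementation: the class names `LexEntry` and `Instance`, the function `build_tmr`, and the CORPORATION filler for the agent are all hypothetical, chosen only to show how a lex-map binding could turn syntactic arguments into role fillers of an instantiated concept.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class LexEntry:
    """Hypothetical, simplified lexicon entry (one word sense)."""
    word: str
    concept: str    # ontological concept this sense maps to
    bindings: dict  # syntactic variable -> semantic role


@dataclass
class Instance:
    """An instantiated ontological concept inside a TMR."""
    concept: str
    name: str
    slots: dict = field(default_factory=dict)


_ids = count(1)  # running counter used to name instances


def instantiate(concept):
    """Create a fresh, uniquely named instance of a concept."""
    return Instance(concept, f"{concept}-{next(_ids)}")


def build_tmr(entry, syn_args):
    """Instantiate the entry's concept and fill its roles.

    syn_args maps syntactic variables (e.g. "$var1") to instances that
    were already built for the corresponding syntactic arguments.
    """
    event = instantiate(entry.concept)
    for var, role in entry.bindings.items():
        if var in syn_args:
            event.slots[role] = syn_args[var].name
    return event


# One sense of "adquirir": maps to ACQUIRE, binds $var1/$var2 to agent/theme.
ADQUIRIR = LexEntry("adquirir", "ACQUIRE", {"$var1": "agent", "$var2": "theme"})

buyer = instantiate("CORPORATION")    # assumed filler, for illustration only
bought = instantiate("ORGANIZATION")
event = build_tmr(ADQUIRIR, {"$var1": buyer, "$var2": bought})
print(event)
```

The sketch covers only the skeletal linking step; the constraint checking and the microtheories mentioned in the text are outside its scope.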
Figure shows one of the lexical entries for the Spanish word
``adquirir,'' the root form of ``adquirió'' in sentence (1). This
entry, in its lex-map zone, maps to the concept named ACQUIRE in the
ontology and binds the syntactic arguments of the verb ``adquirir''
($var1 and $var2 in the syn-struc zone of the entry) to the agent and
theme roles of the ACQUIRE event. The ontological concept for the
ACQUIRE event is shown in Figure and has constraints on the
fillers of agent, theme, and other slots represented using other
concepts in the ontology. For example, the agent must be filled by a
HUMAN.
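As a rough illustration of how slot constraints on a concept could be represented and checked, the following sketch encodes the two ACQUIRE constraints stated in this section (the agent must be a HUMAN; the theme must be an OBJECT other than HUMAN). The dictionary layout, the `is_a` and `satisfies` helpers, and the toy IS-A links are assumptions made for the example and do not reproduce the actual frame format or ontology.

```python
# Toy IS-A links (child -> parent); hypothetical, not the real ontology.
IS_A = {
    "HUMAN": "ANIMAL",
    "ANIMAL": "PHYSICAL-OBJECT",
    "PHYSICAL-OBJECT": "OBJECT",
    "ORGANIZATION": "SOCIAL-OBJECT",
    "SOCIAL-OBJECT": "OBJECT",
    "OBJECT": "ALL",
}


def is_a(concept, ancestor):
    """True if `concept` equals `ancestor` or lies below it in IS_A."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = IS_A.get(concept)
    return False


# Assumed frame layout: slot -> required concept ("sem"), optional exclusion.
ACQUIRE = {
    "agent": {"sem": "HUMAN"},
    "theme": {"sem": "OBJECT", "not": "HUMAN"},
}


def satisfies(frame, slot, filler):
    """Strict (pass/fail) check of a filler against a slot constraint."""
    constraint = frame[slot]
    if not is_a(filler, constraint["sem"]):
        return False
    excluded = constraint.get("not")
    return excluded is None or not is_a(filler, excluded)


print(satisfies(ACQUIRE, "theme", "ORGANIZATION"))
print(satisfies(ACQUIRE, "theme", "HUMAN"))
```

This check is deliberately strict; it is a simplification, since the analyzer described in this section scores constraints by proximity in the ontology rather than accepting or rejecting fillers outright.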
There is a second entry for ``adquirir'' in the Spanish lexicon
corresponding to a different sense of the word that maps to the LEARN
concept in the ontology. It is one of the jobs of the K semantic
analyzer to select the right sense of ambiguous words such as
``adquirir.'' In this example, the analyzer picked ACQUIRE,
which is the appropriate sense in sentence (1), using ontological
information as explained below. Other ambiguous words in this
sentence include ``compañía'' and the preposition ``a-través-de.''
``Compañía'' means either a CORPORATION or an INTERACT-SOCIALLY event,
while ``a-través-de'' has both a spatial location meaning and an
instrument meaning. ``En'' and ``Doctor Andreu'' are also ambiguous.
Figure: A lexical entry for the Spanish verb ``adquirir'' with
its semantic mappings to the ACQUIRE event.
Figure: Frame representation for the concept ACQUIRE.
K is able to choose the appropriate meaning of a word by combining
information from its linguistic and world knowledge sources. For
example, in the case of ``adquirir,'' the analyzer instantiates both
the ACQUIRE and LEARN concepts and sets up constraints on
their slot fillers. These constraints come from both the lex-map
zone of the lexicon entry for the word ``adquirir'' and the slots in
the ACQUIRE and LEARN concepts themselves. After
identifying possible fillers per the syntax-semantics variable mappings
specified in the lexicon, the analyzer checks each constraint and
assigns a score to it. A constraint is checked by determining the
proximity of potential fillers to the specified constraint within the
ontological network.
The theme of ACQUIRE must be an OBJECT other than HUMAN, while the
theme of LEARN must be INFORMATION. These constraints set up search
tasks for the closeness of an ORGANIZATION (the potential filler being
the Doctor Andreu organization) to each of PHYSICAL-OBJECT and
INFORMATION. It turns out that the former is ``closer'' to
ORGANIZATION than the latter and hence gets a higher score. Combining
this score with the scores from all the other constraints on the
meanings in the sentence, the K analyzer selects the ACQUIRE meaning
of ``adquirir'' in the TMR for sentence (1). The analyzer searches the
space of all the constraints in an efficient best-first manner using
dependency analysis (Beale, Nirenburg, and
Mahesh, 1995). The ontological search method to determine the
``closeness'' of a pair of concepts is also valuable in figuring out
the meaning of a complex nominal (such as a compound noun) and in
processing metonymies and other nonliteral expressions.
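The proximity-based choice between ACQUIRE and LEARN can be sketched as follows. This is a simplified stand-in, not the search method K uses: it measures ``closeness'' as the shortest path over a toy IS-A hierarchy, and the hierarchy itself, the scoring formula 1/(1 + distance), and the names `distance`, `score`, and `pick_sense` are all assumptions made for illustration. The toy links were chosen so that the example reproduces the outcome reported above.

```python
from collections import deque

# Toy IS-A links (child -> parent); hypothetical, not the real ontology.
IS_A = {
    "ORGANIZATION": "SOCIAL-OBJECT",
    "SOCIAL-OBJECT": "OBJECT",
    "PHYSICAL-OBJECT": "OBJECT",
    "INFORMATION": "ABSTRACT-OBJECT",
    "ABSTRACT-OBJECT": "MENTAL-OBJECT",
    "MENTAL-OBJECT": "OBJECT",
}


def neighbors(concept):
    """Parent and children of a concept (links treated as undirected)."""
    result = []
    parent = IS_A.get(concept)
    if parent is not None:
        result.append(parent)
    result.extend(child for child, p in IS_A.items() if p == concept)
    return result


def distance(a, b):
    """Number of links on the shortest path between a and b, or None."""
    seen = {a}
    queue = deque([(a, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == b:
            return dist
        for nxt in neighbors(node):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def score(filler, constraint):
    """Assumed scoring: closer concepts get scores nearer to 1."""
    dist = distance(filler, constraint)
    return 0.0 if dist is None else 1.0 / (1 + dist)


# Theme constraints searched for each sense of "adquirir" (from the text).
THEME_CONSTRAINT = {"ACQUIRE": "PHYSICAL-OBJECT", "LEARN": "INFORMATION"}


def pick_sense(filler):
    """Return the best-scoring sense and all scores for one theme filler."""
    scores = {s: score(filler, c) for s, c in THEME_CONSTRAINT.items()}
    return max(scores, key=scores.get), scores


best, scores = pick_sense("ORGANIZATION")
print(best, scores)
```

The sketch scores a single constraint in isolation. Combining the scores of all constraints across the sentence, and doing so with a best-first search guided by dependency analysis, is left out.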
Kavi Mahesh