Although we are not committed to the format we use for concept
representation, we are committed to the following characteristics of
our ontology. If a standard were to emerge in knowledge representation
or in ontological representation, such as KIF (Genesereth and Fikes,
1992), these are the characteristics that we would consider in
deciding whether and how to port the K ontology to the standard
format.
The 10 Commitments:
1. Broad coverage: Since our input texts are real-world, unedited
news articles, they contain words with meanings from practically
any domain in the world. In order to represent those word meanings in
the lexicons and to process the texts in the K system, the ontology
must contain concepts that cover a broad range of topics in the world.
As a result, a domain-specific ontology, such as one that may be
sufficient for database merging or process model integration in a
single domain, will not serve our purposes. This does not mean that our
ontology must contain every conceivable concept in the world before we
start using it for machine translation. It must, however,
cover many of the commonly used terms from a wide range of domains,
leaving out the more technical terms from domains that are not the
focus of our texts. For example, we do not need all the terminology
used in neuroscience and brain surgery, but we are very likely to need
commonly used terms such as brain and surgery in our ontology.
2. Rich properties and interconnections: One of the biggest
uses of our ontology in processing natural language texts is in
checking how well selectional constraints are satisfied. In the
majority of cases, constraints are not directly satisfied by natural
language texts. They are often partially satisfied by each of the
possible meanings of an ambiguous piece of text. In order to compare
the various choices against each other and determine the best choice,
there must be a rich set of connections between concepts in the
ontology in the form of a number of properties of concepts. Given any
pair of concepts, we must be able to find the best (as in shortest or
least cost) path between the two in the ontology. We want our ontology
to act not only as a taxonomic classification of concepts in the
world, but also very much as a semantic network.
3. Ease of understanding, searching and browsing: The majority of
our customers (lexicographers and testing and evaluating personnel)
are not experts in knowledge representation. They are expert linguists
or experts in a particular language or domain. It must be easy for
them to search for a concept in the ontology, to browse the
hierarchies, and understand the relationships between different
concepts in the ontology. They should be able to do all this starting
with just a rough sense (or gloss) of the meaning they are trying to
represent in the lexicon or find in the ontology. In order to meet
this goal, we have had to lean towards simplicity rather than aim for
better expressive power or theoretical cleanliness in resolving a
number of issues as noted throughout this report. Choices made
for enhancing ease of understanding include:
- No complex concepts: each concept is represented in exactly one frame.
- No disjunction in inheritance: a simple depth-first conjunctive
inheritance algorithm is used; inheritance is precomputed and
inherited slots displayed to the user.
- No default inheritance: default values are not inherited.
- No ontological instances.
- No multiple views: there are no alternative views at the hierarchy,
concept, slot, or facet levels.
- No sets and quantifiers: there are no set or quantifier notations in
the ontology; such expressiveness can be applied to the concepts in
lexical or TMR representations.
- All slots are equal: there is no distinction between definitional and
factual or necessary and peripheral slots; a concept is the
conjunction of all its slots.
Searching for a concept is facilitated by a simple string matching
technique in our browsing tools. Users can search for a concept by
providing a substring of its name or any substring of its English
definition. An example of searching for all concepts related to FOOD
is shown here. On the left are concepts whose names have FOOD as a
substring; on the right are all those with FOOD mentioned somewhere in
their definition strings. In addition, our tools enable users to
view the hierarchies graphically as well as jump from one concept to
any other concept that it is related to.
4. NLP-oriented: The K ontology is designed and built
for machine translation. The kinds of
knowledge that are needed in concept descriptions for machine
translation may not be the same as what is needed for a reasoning or
inference task. Our task is to extract and represent the "direct"
meaning of the input text using the concepts in the ontology.
We do not need to make elaborate inferences for this purpose.
5. Economy/cost-effectiveness/tractability: As in most
projects, we have limited resources to build the ontology. We have
neither the time nor the people to expend several
person-centuries on building the ontology. Instead, we must build a
broad-coverage, usable ontology in only a few person-months. The
ontology must be usable at every stage in its development, no matter
how partial or inconsistent it is.
6. Language independence: Concepts in the ontology must not be
based on words in any one natural language. Our goal is to derive an
interlingual meaning representation from any of a set of natural
languages. Although we use English names for concepts, this is merely
for convenience and readability. Concept names do not correspond one
to one with English words or words in another language such as
Spanish. By contrast, a good example of a language-dependent ontology
is WordNet.
7. No unconnected terms: Every concept in the ontology must be
related to other concepts in well-formed ways. There should be no
disjoint components in the ontological graph. This is necessary since
virtually any two concepts might have to be compared for "closeness"
in checking selectional constraints during NLP.
8. Taxonomic organization and inheritance: These are indispensable for
representing word and text meanings succinctly. If we had just a list
of concepts not organized in any hierarchical way, lexical semantic
entries would be prohibitively long and tedious to acquire.
9. Intermediate-level grain size: We do not
want either elaborate decomposition of meanings into a small set of
primitives or little decomposition where each word sense is its own
atomic concept. We want word meanings to be decomposed to a
significant extent so that relationships and commonalities in meanings
between words become clear and so that meaning representations are not
language specific. Yet, we do not want to enforce a closed
set of primitive concepts into which all meanings should be
decomposed. The search for such an "ideal" set of primitives becomes
a research enterprise of its own, has been attempted by other
researchers in the past, and does not suit our practical, situated
approach well.
10. Equal status for all properties: There is no
necessary and sufficient, or central, or definitional part of the
description of a concept. All properties have an equal status in
determining which instances belong to a concept. We want to be able
to process texts from any domain using the ontology. It is therefore
impossible to determine what is a necessary property and what is
peripheral to a concept.
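Several of the commitments above can be illustrated with a small Python
sketch: precomputable conjunctive inheritance and substring search
(commitment 3), least-cost paths between concepts (commitment 2), and
the check for disjoint components (commitment 7). Everything here is an
assumption made for illustration: the toy concepts, slot names,
definitions, function names and link costs are hypothetical and are not
taken from the actual ontology or its tools, and Dijkstra's algorithm
stands in for whatever path search the real system uses.

```python
import heapq
from collections import defaultdict

# Hypothetical toy ontology: concept -> {slot: [filler concepts]}.
# Names, slots and definitions are illustrative only.
FRAMES = {
    "ALL": {},
    "OBJECT": {"IS-A": ["ALL"]},
    "EVENT": {"IS-A": ["ALL"], "AGENT": ["ANIMAL"]},
    "ANIMAL": {"IS-A": ["OBJECT"]},
    "HUMAN": {"IS-A": ["ANIMAL"]},
    "FOOD": {"IS-A": ["OBJECT"]},
    "INGEST": {"IS-A": ["EVENT"], "THEME": ["FOOD"]},
}
DEFINITIONS = {
    "FOOD": "any substance eaten for nourishment",
    "INGEST": "to take food or drink into the body",
}


def inherited_slots(concept, frames=FRAMES):
    """Depth-first conjunctive inheritance: local slots are kept and
    every ancestor contributes the slots that are still missing."""
    slots = {k: v for k, v in frames[concept].items() if k != "IS-A"}
    for parent in frames[concept].get("IS-A", []):
        for slot, fillers in inherited_slots(parent, frames).items():
            slots.setdefault(slot, fillers)
    return slots


def search(substring, frames=FRAMES, definitions=DEFINITIONS):
    """Return (concepts matched by name, concepts matched by definition)."""
    needle = substring.lower()
    by_name = sorted(c for c in frames if needle in c.lower())
    by_definition = sorted(
        c for c, text in definitions.items() if needle in text.lower()
    )
    return by_name, by_definition


def build_graph(frames=FRAMES, isa_cost=1.0, slot_cost=2.0):
    """Treat every slot filler as an undirected weighted link.
    The two cost values are arbitrary assumptions."""
    graph = defaultdict(dict)
    for concept, slots in frames.items():
        graph[concept]  # register the node even if it has no links
        for slot, fillers in slots.items():
            cost = isa_cost if slot == "IS-A" else slot_cost
            for filler in fillers:
                if cost < graph[concept].get(filler, float("inf")):
                    graph[concept][filler] = cost
                    graph[filler][concept] = cost
    return graph


def least_cost_path(graph, source, target):
    """Dijkstra's algorithm; returns (cost, path) or None if unreachable."""
    queue = [(0.0, source, [source])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for neighbor, weight in graph[node].items():
            if neighbor not in settled:
                heapq.heappush(
                    queue, (cost + weight, neighbor, path + [neighbor])
                )
    return None


def is_connected(graph):
    """True when the graph has no disjoint components."""
    if not graph:
        return True
    start = next(iter(graph))
    seen, stack = {start}, [start]
    while stack:
        for neighbor in graph[stack.pop()]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return len(seen) == len(graph)
```

With this toy data, `least_cost_path(build_graph(), "HUMAN", "FOOD")`
follows the taxonomic links through ANIMAL and OBJECT, while
`search("food")` separates name matches from definition matches in the
same way as the two panes described under commitment 3. A frame with no
links to any other concept makes `is_connected` return `False`, which is
the condition commitment 7 rules out.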
Kavi Mahesh
Sun Nov 12 13:56:16 MST 1995