
Sharability and the Ten Commitments

Although we are not committed to the format we use for concept representation, we are committed to the following characteristics of our ontology. If a standard were to emerge in knowledge representation or in ontological representation, such as KIF (Genesereth and Fikes, 1992), these are the characteristics we would consider in deciding whether and how to port the K ontology to the standard format.

The 10 Commitments:

1.
Broad coverage: Since our input texts are real-world, unedited news articles, they contain words whose meanings come from practically any domain in the world. In order to represent those word meanings in the lexicons and to process the texts in the K system, the ontology must contain concepts that cover a broad range of topics. As a result, a domain-specific ontology, such as one that may be sufficient for database merging or process model integration in a single domain, will not serve our purposes. This does not mean that our ontology must contain every conceivable concept in the world before we start using it for machine translation. It must, however, cover many of the commonly used terms from a wide range of domains, leaving out the more technical terms from domains that are not the focus of our texts. For example, we do not need all the terminology used in neuroscience and brain surgery, but we are very likely to need commonly used terms such as brain and surgery in our ontology.
2.
Rich properties and interconnections: One of the biggest uses of our ontology in processing natural language texts is in checking how well selectional constraints are satisfied. In the majority of cases, constraints are not directly satisfied by natural language texts. Instead, they are partially satisfied by each of the possible meanings of an ambiguous piece of text. In order to compare the various choices against each other and determine the best one, there must be a rich set of connections between concepts in the ontology, expressed as properties of concepts. Given any pair of concepts, we must be able to find the best (that is, shortest or least-cost) path between the two in the ontology. We want our ontology to act not only as a taxonomic classification of concepts in the world, but also very much as a semantic network.
3.
Ease of understanding, searching and browsing: The majority of our customers (lexicographers and testing and evaluation personnel) are not experts in knowledge representation. They are expert linguists or experts in a particular language or domain. It must be easy for them to search for a concept in the ontology, to browse the hierarchies, and to understand the relationships between different concepts in the ontology. They should be able to do all this starting with just a rough sense (or gloss) of the meaning they are trying to represent in the lexicon or find in the ontology. In order to meet this goal, we have had to lean towards simplicity rather than aim for greater expressive power or theoretical cleanliness in resolving a number of issues, as noted throughout this report. Choices made for enhancing ease of understanding include:

Searching for a concept is facilitated by a simple string matching technique in our browsing tools. Users can search for a concept by providing a substring of its name or any substring of its English definition. An example of searching for all concepts related to FOOD is shown here. On the left are concepts whose names have FOOD as a substring; on the right are all those with FOOD mentioned somewhere in their definition strings. In addition, our tools enable users to view the hierarchies graphically and to jump from one concept to any other concept that it is related to.

4.
NLP-oriented: The K ontology is designed and built for machine translation. The kinds of knowledge that are needed in concept descriptions for machine translation may not be the same as what is needed for a reasoning or inference task. Our task is to extract and represent the "direct" meaning of the input text using the concepts in the ontology. We do not need to make elaborate inferences for this purpose.
5.
Economy/cost-effectiveness/tractability: As in most projects, we have limited resources to build the ontology. We have neither the time nor the number of people to expend several person-centuries on building it. Instead, we must build a broad-coverage, usable ontology in only a few person-months. The ontology must be usable at every stage in its development, no matter how partial or inconsistent it is.
6.
Language independence: Concepts in the ontology must not be based on words in any one natural language. Our goal is to derive an interlingual meaning representation from any of a set of natural languages. Although we use English names for concepts, this is merely for convenience and readability. Concept names do not correspond one-to-one with English words or with words in another language such as Spanish. A good example of a language-dependent ontology is WordNet.
7.
No unconnected terms: Every concept in the ontology must be related to other concepts in well-formed ways. There should be no disjoint components in the ontological graph. This is necessary since virtually any two concepts might have to be compared for "closeness" in checking selectional constraints during NLP.
8.
Taxonomic organization and inheritance: These are indispensable for representing word and text meanings succinctly. If we had just a list of concepts not organized in any hierarchical way, lexical semantic entries would be prohibitively long and tedious to acquire.
9.
Intermediate-level grain size: We want neither elaborate decomposition of meanings into a small set of primitives nor so little decomposition that each word sense is its own atomic concept. We want word meanings to be decomposed to a significant extent so that relationships and commonalities in meaning between words become clear and so that meaning representations are not language-specific. Yet we do not want to enforce a closed set of primitive concepts into which all meanings must be decomposed. The search for such an "ideal" set of primitives becomes a research enterprise of its own, has been attempted by other researchers in the past, and does not suit our practical, situated approach well.
10.
Equal status for all properties: There is no necessary and sufficient, or central, or definitional part of the description of a concept. All properties have an equal status in determining which instances belong to a concept. We want to be able to process texts from any domain using the ontology. It is therefore impossible to determine which properties are necessary and which are peripheral to a concept.
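
Commitments 2 and 7 together treat the ontology as a connected graph in which any two concepts can be compared by the cost of the best path between them. The following is only a minimal sketch of that idea, not the K system's actual mechanism: the concept names, property names, link costs, and the choice to traverse links in both directions are all assumptions made for illustration, and Dijkstra's algorithm stands in for whatever search the real system uses.

```python
import heapq

# Hypothetical toy ontology: concept -> [(property, related concept, cost)].
# Names and costs are illustrative assumptions, not taken from the K ontology.
LINKS = {
    "SURGERY": [("IS-A", "MEDICAL-EVENT", 1.0), ("THEME", "BRAIN", 2.0)],
    "MEDICAL-EVENT": [("IS-A", "EVENT", 1.0)],
    "BRAIN": [("IS-A", "ORGAN", 1.0)],
    "ORGAN": [("PART-OF", "ANIMAL", 1.5)],
    "EVENT": [("IS-A", "ALL", 1.0)],
    "ANIMAL": [("IS-A", "OBJECT", 1.0)],
    "OBJECT": [("IS-A", "ALL", 1.0)],
    "ALL": [],
}


def undirected(links):
    """Build an adjacency map, assuming every link can be walked both ways."""
    graph = {concept: [] for concept in links}
    for src, edges in links.items():
        for _prop, dst, cost in edges:
            graph.setdefault(dst, [])
            graph[src].append((dst, cost))
            graph[dst].append((src, cost))
    return graph


def least_cost_path(graph, start, goal):
    """Dijkstra's algorithm; returns (cost, path) or (inf, []) if unreachable."""
    best = {start: 0.0}
    queue = [(0.0, start, [start])]
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry; a cheaper route was already found
        for nxt, weight in graph[node]:
            new_cost = cost + weight
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(queue, (new_cost, nxt, path + [nxt]))
    return float("inf"), []


def is_connected(graph):
    """Check commitment 7: the graph has no disjoint components."""
    if not graph:
        return True
    start = next(iter(graph))
    seen, stack = {start}, [start]
    while stack:
        for nxt, _cost in graph[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == len(graph)
```

With such a structure, competing readings of an ambiguous phrase could be ranked by calling `least_cost_path` for each candidate concept against the concept a selectional constraint expects, and `is_connected` could be run as a sanity check whenever concepts are added.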
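
The substring search described under commitment 3 can be sketched in a few lines. This is a hedged illustration only: the concept records and definition strings below are invented, and case-insensitive matching is an assumption rather than a documented behaviour of the browsing tools.

```python
# Hypothetical concept records: concept name -> English definition string.
# Both the names and the definitions are made up for this example.
CONCEPTS = {
    "FOOD": "any substance that can be ingested for nourishment",
    "SEAFOOD": "food obtained from the sea",
    "INGEST": "to take food or drink into the body",
    "BRAIN": "the organ of thought in an animal",
}


def search(concepts, query):
    """Return (names containing query, names whose definition contains query).

    Matching is plain substring matching, assumed here to ignore case.
    """
    needle = query.lower()
    by_name = sorted(name for name in concepts if needle in name.lower())
    by_definition = sorted(
        name for name, definition in concepts.items()
        if needle in definition.lower()
    )
    return by_name, by_definition
```

The two returned lists correspond to the two columns described in the text: concepts matched by name and concepts matched by definition.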
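
Commitment 8 argues that inheritance keeps concept descriptions short, because a property stated once on a general concept need not be repeated on its descendants. The sketch below illustrates this under simplifying assumptions: a single-parent hierarchy, invented concept and property names, and local values overriding inherited ones. None of these details are claimed to match the K ontology.

```python
# Hypothetical single-parent taxonomy: concept -> parent (None at the root).
PARENT = {
    "ALL": None,
    "OBJECT": "ALL",
    "ANIMAL": "OBJECT",
    "MAMMAL": "ANIMAL",
    "DOG": "MAMMAL",
    "WHALE": "MAMMAL",
}

# Properties stated locally on each concept (illustrative values only).
LOCAL = {
    "OBJECT": {"HAS-MASS": True},
    "ANIMAL": {"ANIMATE": True},
    "MAMMAL": {"LEGS": 4},
    "DOG": {"SOUND": "BARK"},
    "WHALE": {"LEGS": 0},
}


def effective_properties(concept):
    """Collect properties from the root down, so local values override."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    properties = {}
    for ancestor in reversed(chain):
        properties.update(LOCAL.get(ancestor, {}))
    return properties
```

Here `DOG` states only one property locally but ends up with four, which is the kind of economy the commitment is after.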






Kavi Mahesh
Sun Nov 12 13:56:16 MST 1995