%set_1
cardinality = 3
elements $John $Mary $Sue
(where %set_1 is an instance of the concept *set, i.e. a frame representation of the above would include an INSTANCE-OF *SET slot)
The industry leaders, such as Mitsubishi Corp. and Sumitomo Corp., previously decided....
The subject of this sentence is the set of industry leaders. Two of the members of the set are explicitly named, but there may be arbitrarily many other members which have not been named.
We may treat such indefinite sets with a generalized form of the representation used for definite (enumerated) sets. In addition to the CARDINALITY and ELEMENTS slots, a MEMBER-TYPE slot will provide a template specifying the constraints which any members of the set must satisfy. A suggested TMR for the set in the above example is
(%set_1
(instance-of *set)
(member-type %industry-leader_1)
(cardinality >= 2)
(elements %Mitsubishi %Sumitomo)
)
or
%set_1
member-type %industry-leader_1
...other slots describing leaders
cardinality >= 2
elements %Mitsubishi %Sumitomo
The frame referenced in the MEMBER-TYPE slot is a template looking something like
(%industry-leader_1
(instance-of *company)
(business-type ....)
...
(size >= 0.8) ; among the largest companies
)
which specifies that any member of %SET_1 must be a *COMPANY (or some subclass thereof) whose SIZE measure is at least 0.8, whose BUSINESS-TYPE unifies with the specified value, etc.
The MEMBER-TYPE slot is useful but optional when all members of a set are enumerated; it is mandatory for indefinite sets. The distinction between enumerated and indefinite sets is made solely by comparing the CARDINALITY and ELEMENTS slots. An enumerated set has CARDINALITY exactly equal to the length of the list in the ELEMENTS slot, while an indefinite set has a CARDINALITY that is either greater than or equal to, or strictly greater than, the number of enumerated elements. CARDINALITY = 0 is a special case which is defined to be an indefinite set requiring a MEMBER-TYPE slot; it is used in conjunction with existentials and superlatives, discussed in later sections.
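The classification rule just described can be sketched in a few lines of Python. This is only an illustration: the SetFrame class and its field names are hypothetical stand-ins for a TMR frame, and CARDINALITY is modelled as an operator string plus a number. Frames whose CARDINALITY is smaller than the number of listed elements are inconsistent and are not handled here.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SetFrame:
    """Hypothetical stand-in for a *set frame (illustration only)."""
    card_op: str = "="                 # one of "=", ">=", ">"
    card_value: int = 0
    elements: list = field(default_factory=list)
    member_type: Optional[str] = None  # name of the MEMBER-TYPE template


def classify(frame: SetFrame) -> str:
    """Return "enumerated" or "indefinite" by comparing CARDINALITY and ELEMENTS."""
    # CARDINALITY = 0 is defined to be an indefinite set.
    if frame.card_op == "=" and frame.card_value == 0:
        return "indefinite"
    # Enumerated: CARDINALITY is exactly the number of listed elements.
    if frame.card_op == "=" and frame.card_value == len(frame.elements):
        return "enumerated"
    return "indefinite"


def check(frame: SetFrame) -> None:
    """Raise ValueError if an indefinite set lacks its MEMBER-TYPE template."""
    if classify(frame) == "indefinite" and frame.member_type is None:
        raise ValueError("indefinite set requires a MEMBER-TYPE slot")


people = SetFrame("=", 3, ["$John", "$Mary", "$Sue"])
leaders = SetFrame(">=", 2, ["%Mitsubishi", "%Sumitomo"], "%industry-leader_1")
print(classify(people), classify(leaders))
```

The MEMBER-TYPE check is kept separate from the classification so that an enumerated set without a template remains valid.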
Indefinite sets may optionally contain a COMPLETE slot; if present and non-empty, this slot indicates that the set contains all possible members, i.e. every item in the "universe of discourse" which matches the MEMBER-TYPE template is in fact an element of the set. This is used to represent phrases such as "all college students", and will also be used for ordinals (see below). The canonical value for the COMPLETE slot when it is present is "yes".
;; set of all college students
%set_1
member-type *college-student
complete yes
Another optional slot is EXCLUDING, which lists potential members of the set which have explicitly been excluded, as in "companies other than IBM and Apple".
%set_1
member-type *company
cardinality > 1
excluding $IBM $Apple
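To make the roles of MEMBER-TYPE, COMPLETE and EXCLUDING concrete, here is a minimal Python sketch. It assumes that items and templates are plain dictionaries of slot values and that matching is simple equality; real template matching would need unification over the ontology (subclasses, ranges), which is not attempted. All function names are hypothetical.

```python
def matches(item, template):
    """True if every slot in the template has the same value in the item.

    Plain equality stands in for real unification (no subclasses, no ranges).
    """
    return all(item.get(slot) == value for slot, value in template.items())


def may_belong(item, template, excluding=()):
    """An item can be a member only if it matches and is not explicitly excluded."""
    return item["name"] not in excluding and matches(item, template)


def complete_extension(universe, template):
    """For COMPLETE = yes: every item in the universe that matches the template."""
    return [item["name"] for item in universe if matches(item, template)]


# A toy universe of discourse (hypothetical slot values).
universe = [
    {"name": "$IBM", "instance-of": "*company"},
    {"name": "$Apple", "instance-of": "*company"},
    {"name": "$John", "instance-of": "*college-student"},
]
company = {"instance-of": "*company"}

print(may_belong(universe[0], company, excluding=("$IBM", "$Apple")))
print(complete_extension(universe, {"instance-of": "*college-student"}))
```

The first call reflects "companies other than IBM and Apple"; the second reflects "all college students" within the toy universe.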
A
SIZE(A) < 5.
When one of the sides of the relation expression is a set, the relation
will be defined to hold for each and every member of the set, i.e.
SIZE(SET A B) < 5
indicates that SIZE(A) is less than five and also SIZE(B) is less than
five. Similarly, if both sides of the relation expression are sets, the
relation will be defined to hold for each and every possible pairing of
members of the two sets, i.e.
(SET A B) < (SET C D)
is equivalent to
(A < C) and (A < D) and (B < C) and (B < D)
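The expansion of a relation over sets into all pairings can be sketched in Python as follows. The helper names are hypothetical, and plain numbers stand in for the measured values. A non-set operand is treated as a one-member collection, so the single-item, set-versus-item and set-versus-set cases share one definition.

```python
import operator
from itertools import product


def as_members(x):
    """Treat a non-set operand as a one-member collection."""
    return list(x) if isinstance(x, (set, frozenset, list, tuple)) else [x]


def holds(rel, lhs, rhs):
    """True if `rel` holds for every pairing of members of lhs and rhs."""
    return all(rel(a, b) for a, b in product(as_members(lhs), as_members(rhs)))


print(holds(operator.lt, [1, 2], [3, 4]))  # (1<3) and (1<4) and (2<3) and (2<4)
print(holds(operator.lt, [1, 5], [3, 4]))  # fails because 5 < 3 is false
```

Under this definition a relation against an empty set holds vacuously, since there are no pairings to check. That is the behaviour needed when an item is related to an empty helper set, as with superlatives.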
Finally, we link the indefinite set back to the item to which the
ordinal applies by relating that item to the indefinite set. Thus,
Yamaha is smaller than the members of %SET_2 (which by the definition
of %SET_2 are larger than Yamaha), so the frame for Yamaha would appear
in part as
This method of expressing ordinals in terms of indefinite sets also
works for "first" and other superlatives, as discussed in a later
section.
The inverse, a statement of existence such as "there are fireflies on
the moon", uses a nearly identical TMR. Instead of CARDINALITY = 0,
however, such a set has CARDINALITY >= 1. Similarly, "there are ten
fireflies on the moon" would produce a set with CARDINALITY = 10.
At this time, it has not yet been determined how to represent
indefinite sets with a "fuzzy" range of cardinality. Such sets would
form the TMRs for phrases like "there are a few fireflies on the moon"
or "there are several fireflies on the moon".
In general, "the
The item to which the superlative applies can easily be a set, as in
"the ten largest motorcycle manufacturers". This is a statement that
the set of motorcycle manufacturers which are larger than the desired
set of ten motorcycle manufacturers is empty, i.e.
To represent all of these types of subsets, we augment the SET notation
with one mandatory (for subsets) slot and three new optional slots. All
subsets will have a SUBSET-OF slot, which both marks the subsetting
relationship and names the parent set. In addition, the subset will
have a CARDINALITY slot if it is of a known cardinality, a MULTIPLE slot
if it encompasses a known proportion of the full set, a MEMBER-TYPE slot
if it adds further constraints to membership, and/or an INDETERMINATE slot
if it is an unspecified subset. If the subset is known to be a proper
subset (i.e. is not identical to the main set), then it will also contain
the optional PROPER slot. The name MULTIPLE is used (at Donalee's
suggestion) rather than PROPORTION because a MULTIPLE slot is already in
use elsewhere, and thus using it avoids creating a new ontological entity.
The CARDINALITY and MEMBER-TYPE slots have already been covered, so they
will not be discussed any further here. The INDETERMINATE and PROPER slots
are boolean slots, i.e. it only matters whether they are present and
non-empty or not; the canonical value for these slots when they are present
is YES. The MULTIPLE slot specifies the proportion between 0.0 and 1.0
(inclusive) of the elements of the full set which are also members of the
subset; the slot's value may be either a specific number or a range.
To illustrate, consider the following fake subset frame containing all of
the optional subset slots (which would never occur in actual fact):
Just as indefinite sets can enumerate some of their members in the
ELEMENTS slot, a subset may also enumerate some of its members in
ELEMENTS. Thus, "some college students, such as John and Mary" would be
represented as
Representing a set of events is nearly identical to representing a set
of objects. For an enumerated set, one merely enumerates the events
instead of enumerating a list of objects. For an indefinite set, the
MEMBER-TYPE slot defines a template which is either an event in the
ontology or an instance of one, rather than an object.
Consider the statement "That excuse has only worked twice." What this
sentence is really saying is "Of all the attempts to use that excuse for
Ordinals
Given a facility for describing sets with unknown members, we can
express ordinals in terms of indefinite sets, as I worked out with
Donalee. Simply create an indefinite set with cardinality one less
than the desired ordinal whose MEMBER-TYPE template includes the
desired ordering relation. For example,
Yamaha Motor Co., Japan's second largest motorcycle maker, ....
would result in the indefinite set of motorcycle manufacturers which
are larger than Yamaha, with the set having cardinality 1. That is,
(%set_2
(instance-of *set)
(member-type %manufacturer_1)
(cardinality 1)
(complete yes)
)
(%manufacturer_1
(instance-of *manufacturer)
(product *motorcycle)
(size %quantifier-rel_1)
)
(%quantifier-rel_1
(type > )
(arg-1 %manufacturer_1)
(arg-2 Yamaha)
)
or more compactly,
%set_2
member-type %manufacturer_1
product *motorcycle
size > Yamaha
cardinality = 1
complete yes
Note the COMPLETE slot, which is used to distinguish this set from an
explicitly-mentioned set as in "IBM and Apple sold more computers than
Gateway 2000", whose representation would otherwise be confused with
"Gateway 2000 is the third-largest computer manufacturer, behind IBM
and Apple". As discussed in the section on indefinite sets, this slot
indicates that the set contains all possible elements matching its
MEMBER-TYPE template. In the context of ordinals, this specifies that
there are no objects satisfying the relation between the helper set and
the object or set in question except those (possibly none) contained in
the helper set.
(Yamaha
(instance-of *manufacturer)
(product *motorcycle)
(size %quantifier-rel_2)
...
)
(%quantifier-rel_2
(type < )
(arg-1 Yamaha)
(arg-2 %set_2)
)
or more compactly,
Yamaha
instance-of *manufacturer
product *motorcycle
size < %set_2
Had Yamaha been the tenth-largest motorcycle manufacturer instead of
the second largest, the only change in the TMR would have been to set
the cardinality of %SET_2 to 9 instead of 1. Had it been tenth-largest
and had some of the larger companies explicitly been mentioned, %SET_2
would also include the explicitly-mentioned companies in its ELEMENTS
slot.
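The link between an ordinal and the cardinality of its helper set can be checked against a concrete universe with a short Python sketch. The maker names and size values below are invented purely for illustration, the function names are hypothetical, and ties between items are ignored.

```python
def larger_than(sizes, target):
    """The complete helper set: every item whose size exceeds the target's."""
    return {name for name, size in sizes.items() if size > sizes[target]}


def ordinal_of(sizes, target):
    """An item is the n-th largest when its helper set has cardinality n - 1."""
    return len(larger_than(sizes, target)) + 1


# Toy data: invented relative sizes on a 0.0 to 1.0 scale.
sizes = {"maker_a": 0.9, "maker_b": 0.7, "maker_c": 0.4}

print(larger_than(sizes, "maker_b"))  # helper set with cardinality 1
print(ordinal_of(sizes, "maker_b"))   # hence second largest
```

Because the helper set is built from the whole universe, it is complete by construction, which is what the COMPLETE slot asserts in the frame notation.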
Existentials
Indefinite sets allow the representation of phrases of the form "there
are no X" through a set with cardinality zero, i.e. an empty set. The
TMR for "X" in the phrase is used as the template for the MEMBER-TYPE
slot of the set, the CARDINALITY slot is set to zero, and the ELEMENTS
slot is left empty. For example, "there are no fireflies on the moon"
would be represented as
%set_3
member-type %firefly_1
location $"moon"
cardinality = 0
complete yes
elements
which glosses as "the set of fireflies located on the moon is empty".
Once again, the COMPLETE slot is used to indicate that all possible
members of the set are in fact elements.
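A small Python sketch shows how the existential frames differ only in their CARDINALITY slot. Frames are modelled as dictionaries and CARDINALITY as an (operator, number) pair; both choices, and all function names, are assumptions made for illustration.

```python
def existential(member_type, op, n):
    """Build a complete set frame whose cardinality carries the existential claim."""
    frame = {
        "instance-of": "*set",
        "member-type": member_type,
        "cardinality": (op, n),
        "complete": "yes",
    }
    if (op, n) == ("=", 0):
        frame["elements"] = []  # an empty set lists no elements
    return frame


def there_are_no(member_type):
    """'there are no X': cardinality = 0."""
    return existential(member_type, "=", 0)


def there_are(member_type, count=None):
    """'there are X': cardinality >= 1; 'there are ten X': cardinality = 10."""
    if count is None:
        return existential(member_type, ">=", 1)
    return existential(member_type, "=", count)


print(there_are_no("%firefly_1"))
print(there_are("%firefly_1", 10))
```

Phrases with a fuzzy cardinality such as "a few" are deliberately left out, since their representation is not yet settled.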
Superlatives
Superlatives are statements that some measure is the extreme on a
particular scale, i.e. that there are no items for which the measure
has a more extreme value. Therefore, they can be represented by
relating the desired entity to an empty set on the specified measure.
Had the Yamaha example read
Yamaha Motor Co., Japan's largest motorcycle maker, ....
the TMR would have set the cardinality for %SET_2 to zero instead of one.
%set_N
member-type
where
%set_4
member-type %manufacturer_2
product *motorcycle
size > %set_5
cardinality = 0
complete yes
elements
%set_5
member-type %manufacturer_3
product *motorcycle
size < %set_4
cardinality = 10
elements
Partial superlatives such as "some of my favorite things" may be
represented using indeterminate subsets of the set representing the
embedded superlative, as described in the next section.
Subsets
Frequently, a set is known to be a subset of another; sometimes, that is
*all* that is known about the set. There are four main types of
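subsets:
1. subsets of known cardinality
2. subsets containing a known proportion of the full set
3. subsets with additional membership constraints
4. indeterminate (unspecified) subsets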
Either category 1 or 2 may be combined with category 3, for a subset
with additional constraints that also specifies a number or proportion of
the original set's members. Also, categories 3 and 4 can be combined for
a subset which does not contain all members fitting the more specific
membership constraints. Further, a subset may be known to be a proper
subset, i.e. missing at least one element of the full set.
%set_57
subset-of %set_42
member-type %car_1
color *red
multiple (range 0.25 0.99)
cardinality (range 1 500)
indeterminate yes
proper yes
This frame specifies that %SET_57 is a subset of %SET_42 containing only red
cars, that between 25% and 99% of the elements of %SET_42 are also members
of %SET_57, that %SET_57 contains between one and 500 of the members of
%SET_42, that %SET_57 is an unspecified subset of %SET_42, and that %SET_42
contains at least one element not in %SET_57.
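As a sanity check on how these slots interact, the following Python sketch tests a concrete candidate subset against SUBSET-OF, MULTIPLE, CARDINALITY and PROPER. It assumes ranges are given as (low, high) pairs, with a specific number written as a pair of equal values; the function name is hypothetical. MEMBER-TYPE and INDETERMINATE are not checked, since the former needs template matching and the latter only records that the choice of members is unspecified.

```python
def check_subset(full, sub, multiple=None, cardinality=None, proper=False):
    """Check a concrete subset against the subset slots (ranges are inclusive)."""
    full, sub = set(full), set(sub)
    if not sub <= full:                      # SUBSET-OF
        return False
    if proper and sub == full:               # PROPER: at least one element missing
        return False
    if cardinality is not None:              # CARDINALITY range
        low, high = cardinality
        if not low <= len(sub) <= high:
            return False
    if multiple is not None:                 # MULTIPLE: proportion of the full set
        low, high = multiple
        if not full or not low <= len(sub) / len(full) <= high:
            return False
    return True


cars = range(10)  # ten anonymous elements standing in for %set_42
print(check_subset(cars, {0, 1, 2}, multiple=(0.25, 0.99),
                   cardinality=(1, 500), proper=True))
```

With three of ten elements the proportion is 0.3, inside the 0.25 to 0.99 range, so all constraints of the example frame are satisfied.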
%set_1
member-type *college-student
cardinality >= 2 ; we know at least two because of subset
%set_2
subset-of %set_1
cardinality >= 2
elements $John $Mary
Partial Superlatives
As was mentioned at the end of the section on superlatives, partial
superlatives such as "some of my favorite things" may be represented using
indeterminate subsets.
;; "my favorite things"
%set_7
member-type %thing_1
favor < %set_8
cardinality > 1
;; all things more favored than %set_7 (note: empty)
%set_8
member-type %thing_2
favor > %set_7
cardinality = 0
complete yes
;; "some of %set_7" = "some of my favorite things"
%set_9
subset-of %set_7
indeterminate yes
Similarly, partial superlatives which indicate the number of items, such
as "five of the largest companies" or "half of all women" can be
represented using a subset of known cardinality or proportion, i.e.
;; "the largest companies"
%set_10
member-type %company_1
size < %set_11
cardinality > 1
;; all companies larger than those in %set_10 (note: empty)
%set_11
member-type %company_2
size > %set_10
cardinality = 0
complete yes
;; five items from %set_10 = "five of the largest companies"
%set_12
subset-of %set_10
cardinality = 5
Intermittent Events
Another application of subsets is in representing intermittent events
evoked by phrases such as "sometimes, X" or "in some instances, X".
Here, the desired set is an indeterminate subset of some set of events
rather than objects. It is also possible to have a better-determined
subset, such as "half the time, X" or "X has only succeeded twice".
;; the set of all attempts to use the excuse
%set_1
member-type %attempt_1
theme %use_1
theme %excuse_1
cardinality >= 2
complete yes
; we know there must be at least two elements since the subset has two
;; two arbitrary members of %set_1 which were successful
%set_2
subset-of %set_1
cardinality 2
!!! unsolved: how to represent 'successful attempt' in this context
Representing "OR"
The English word 'or' implies a set, much as the word 'and' does.
However, 'or' can have several First Order Logic meanings, which are
represented differently--it can mean 'inclusive or', 'exclusive or',
and sometimes even 'and'. The final case has already been dealt with
in the form of enumerated sets, so it will not be discussed further in
this section. Determining which sense of 'or' is being used is beyond
the scope of this document on set phenomena.
Inclusive OR:
An inclusive 'or' means that at least one member of some universe of
possible values is included in the set. We can represent this as a
subset relation between the actual members of the 'or' and the
possible values which have been enumerated, as in
; an unknown subset with at least one member
%set_N
cardinality >= 1
subset-of %set_M
indeterminate yes
; the universe of discourse for the 'or', of which the actual
; members are a subset
%set_M
elements possvalue1, possvalue2, ...
A more succinct way of representing the 'or' is to make use of the
MEMBER-TYPE slot to specify the universe of possible values, i.e.
%set_N
cardinality >= 1
member-type possvalue1, possvalue2, ...
but this latter representation may prove problematic if
implementations assume that MEMBER-TYPE is a single value rather than
an implicit union of all values in the slot.
Exclusive OR:
An exclusive 'or' means that exactly one member of some universe of
possible values is included in the set. The representation of an
exclusive 'or' is almost identical to an inclusive 'or', except that
the cardinality is specified as "= 1" rather than ">= 1". A meta-example
is
%set_N
cardinality = 1
member-type possvalue1, possvalue2, ...
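The difference between the two readings can be seen by listing every set of values each one admits. The Python sketch below does this for a small universe; the function name is hypothetical and the values are placeholders.

```python
from itertools import combinations


def readings(possible_values, exclusive=False):
    """List every set of values consistent with an 'or' over possible_values.

    Inclusive 'or': any non-empty subset (cardinality >= 1).
    Exclusive 'or': exactly one value (cardinality = 1).
    """
    values = list(possible_values)
    max_size = 1 if exclusive else len(values)
    return [set(c) for k in range(1, max_size + 1) for c in combinations(values, k)]


print(readings(["X", "Y", "Z"], exclusive=True))
print(len(readings(["X", "Y", "Z"])))
```

For three possible values the exclusive reading admits three sets and the inclusive reading admits seven, i.e. every subset except the empty one.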
Examples
In this section, numerous examples will illustrate the use of sets and
subsets in representing various phrases. In addition, there are tables
showing how to gloss phrases in preparation for determining the TMR of
the phrase. As can be seen from the examples, the difference between
the various types (ordinals, superlatives, etc.) is often fairly
subtle. Much of the information about the phrase being represented is
contained in the relationship between the various sets making up the
TMR fragment for the phrase rather than in the sets themselves.
Superlatives
Text                   Gloss As
--------------------   -----------------------
first X (time)         there are no Y before X
last X (time)          there are no Y after X
largest X              there are no Y larger than X
smallest X             there are no Y smaller than X
favorite X             there are no Y more favored than X
least favorite X       there are no Y less favored than X
most likely X          there are no Y more likely than X
least likely X         there are no Y less likely than X
loudest X              there are no Y louder than X
quietest X             there are no Y less loud than X
first 3 X (time)       there are no Y before the indefinite set of 3 X
top 10 X               there are no Y greater than the indef. set of 10 X
last 10 X (time)       there are no Y after the indefinite set of 10 X
--------------------   -----------------------
some of the top 10 X   indeterminate subset of (there are no Y greater than
                       the indefinite set of 10 X)
many of the top 10 X   subset, multiple 0.33-0.6, of (there are no Y greater
                       than the indefinite set of 10 X)
most of the top 10 X   subset, multiple 0.6-0.99, of (there are no Y greater
                       than the indefinite set of 10 X)
two of the top 10 X    subset, cardinality 2, of (there are no Y greater
                       than the indefinite set of 10 X)
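The partial-superlative rows above all follow one pattern: the quantifier selects which subset slot is filled. A minimal Python sketch of that mapping follows. Frames are modelled as dictionaries and ranges as (low, high) pairs, both assumptions; the number-word table is a hypothetical helper covering only words that occur in this document.

```python
# Quantifier words and the subset slot each contributes, taken from the glosses
# in this document ("half" from the "half of all college students" example).
QUANTIFIER_SLOTS = {
    "some": {"indeterminate": "yes"},
    "many": {"multiple": (0.33, 0.6)},
    "most": {"multiple": (0.6, 0.99)},
    "half": {"multiple": 0.5},
}

# Hypothetical helper: only the number words used in this document.
NUMBER_WORDS = {"two": 2, "five": 5}


def subset_frame(quantifier, parent):
    """Build the subset frame for '<quantifier> of <parent set>'."""
    frame = {"subset-of": parent}
    if quantifier in QUANTIFIER_SLOTS:
        frame.update(QUANTIFIER_SLOTS[quantifier])
    elif quantifier in NUMBER_WORDS:
        frame["cardinality"] = NUMBER_WORDS[quantifier]
    else:
        raise ValueError(f"no gloss known for quantifier {quantifier!r}")
    return frame


print(subset_frame("most", "%set_1"))
print(subset_frame("two", "%set_1"))
```

The parent set (here %set_1) is expected to be the set already defined by the embedded superlative.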
Ordinals
Text                   Gloss As
--------------------   -----------------------
the second-largest X   there is exactly one Y larger than X
the third X (time)     there are exactly two Y before X
the next-to-last X     there is exactly one Y after X
Intermittent Events
Text                   Gloss As
--------------------   -----------------------
sometimes, X           indeterminate subset of events for X
"OR"
Text                     Gloss As
----------------------   -----------------------
X, Y, or Z (inclusive)   at least one of X,Y,Z
X, Y, or Z (exclusive)   exactly one of X,Y,Z
X, Y, or Z (and)         the set of X,Y,Z
------------------
Enumerated set: "John, Mary, and Sue"
%set_1
member-type *person ; optional
cardinality = 3
elements $John $Mary $Sue
------------------
Simple set: "three apples in the bowl"
%set_1
member-type %apple_1
location %bowl_1
cardinality = 3
------------------
Enumerated set: "an apple, a banana, and an orange in the bowl"
%set_2
member-type %fruit_1
location %bowl_1
cardinality = 3
elements %apple_1 %banana_1 %orange_1
**OR**
;; note: no MEMBER-TYPE slot, since it is optional
%set_2
cardinality = 3
elements %apple_1 %banana_1 %orange_1
%apple_1
location %bowl_1
%banana_1
location %bowl_1
%orange_1
location %bowl_1
------------------
Indefinite set: "some apples in the bowl"
%set_3
member-type %apple_2
location %bowl_1
cardinality > 1
Note: the similar phrase "some of the apples in the bowl" is a subset;
depending on context, "some apples in the bowl" may actually mean the
former, e.g. "Some apples in the bowl are unripe."
------------------
Ordinal: "The third-largest company in Japan"
;; the set of all Japanese companies larger than the one in question; there
;; happen to be exactly two such companies
%set_1
member-type %company_1
size > %company_2
location $Japan
cardinality = 2
complete yes
;; the company in question is smaller than all members of %set_1; since
;; there are two members of %set_1, the company is now known to be the
;; third-largest
%company_2
size < %set_1
location $Japan
------------------
Ordinal: "The penultimate step"
;; the set of all steps after the one in question; there happens to be
;; exactly one
%set_1
member-type %step_1
time > %step_2
cardinality = 1
complete yes
;; the step in question comes before all the members of %set_1; since
;; there is exactly one member, %step_2 is the next-to-last
;; (penultimate) step
%step_2
time < %set_1
------------------
Existential: "There are no leprechauns"
;; the set of all leprechauns is empty
%set_1
member-type *leprechaun
cardinality = 0
complete yes
------------------
Existential: "There are honest politicians" (some would claim this is false :-)
;; the set of all honest politicians is non-empty
%set_1
member-type %politician_1
honesty 1.0
cardinality >= 1
complete yes
Note: some statements which might otherwise be considered existentials are
represented simply by stating the appropriate entity in the TMR, i.e.
"John is an honest politician" becomes
%John_1
profession *politician
honesty 1.0
------------------
Existential: "There are at least 50 ways to leave your lover"
;; the set of all ways to leave your lover has at least 50 members
%set_1
member-type %method_1
purpose %leave_1
theme %lover_1
cardinality >= 50
complete yes
------------------
Superlative: "The first man on the moon" (note: singular)
;; set of all (zero) men on moon before %human_2
%set_1
member-type %human_1
location *moon
time < %human_2
cardinality 0
complete yes
;; man on moon such that all those (none) in %set_1 were there earlier
%human_2
location *moon
time > %set_1
------------------
Superlative: "The first men on the moon" (note: plural)
;; set of all (zero) men on moon before those in %set_2
%set_1
member-type %human_1
location *moon
time < %set_2
cardinality 0
complete yes
;; men on moon such that all those (none) in %set_1 were there earlier
%set_2
member-type %human_2
location *moon
time > %set_1
cardinality > 1
------------------
Superlative: "The three oldest people"
;; set of all (zero) people older than those in %set_2
%set_1
member-type %human_1
age > %set_2
cardinality 0
complete yes
;; the three people such that all those (none) in %set_1 are older
%set_2
member-type %human_2
age < %set_1
cardinality 3
------------------
Superlative: "brown is my least favorite color"
;; set of all (zero) colors less favored than brown
%set_1
member-type %color_1
favor-amount < %color_2
cardinality 0
complete yes
;; brown is favored more than any color in %set_1, of which there are none
%color_2
value *brown
favor-amount > %set_1
------------------
Superlative: "some of my favorite things"
;; set of most-favored things
%set_1
member-type %thing_1
favor-amount < %set_2
cardinality > 1
;; helper set which defines %set_1 to be the most-favored things
%set_2
member-type %thing_2
favor-amount > %set_1
cardinality = 0
complete yes
;; finally, the actual set we want, an unspecified subset of the
;; most-favored things
%set_3
subset-of %set_1
indeterminate yes
------------------
Superlative: "two of my favorite things"
;; set of most-favored things
%set_1
member-type %thing_1
favor-amount < %set_2
cardinality > 1
;; helper set which defines %set_1 to be the most-favored things
%set_2
member-type %thing_2
favor-amount > %set_1
cardinality = 0
complete yes
;; finally, the actual set we want, two arbitrary members of %set_1
%set_3
subset-of %set_1
cardinality = 2
------------------
Subset with known number of elements: "Seven auto mechanics took
advanced classes. Two of them failed."
%set_1 ; seven auto mechanics
member-type *auto-mechanic
cardinality 7
%set_2 ; two of them
subset-of %set_1
cardinality 2
------------------
Subset with known proportion: "half of all college students"
;; the set of all college students
%set_1
member-type *college-student
cardinality >= 1
complete yes
;; half of %set_1
%set_2
subset-of %set_1
multiple 0.5
------------------
Subset with additional constraints: "A group of college students.... The
seniors among them...."
;; a set of multiple college students
%set_1
member-type *college-student
cardinality > 1
;; the seniors in %set_1
%set_2
subset-of %set_1
member-type %college-student_1
year 4
------------------
Indeterminate Subset: "A group of college students.... Some of them...."
;; a set of multiple college students
%set_1
member-type *college-student
cardinality > 1
;; an unspecified subset of the college students in %set_1
%set_2
subset-of %set_1
indeterminate yes
------------------
Indeterminate subset: "Some of the apples in the bowl"
;; the apples in the bowl
%set_3
member-type %apple_2
location %bowl_1
cardinality > 1
;; some of %set_3
%set_4
subset-of %set_3
indeterminate yes
------------------
Indeterminate subset with additional membership constraints: "Some of the
red lollipops...."
;; an indefinite set of lollipops
%set_1
member-type *lollipop
cardinality > 1
;; some of the members of %set_1 which are also red
%set_2
subset-of %set_1
member-type %lollipop_1
color *red
indeterminate yes
(In practice, the above would only result from a phrase such as "some of the
red ones" which is later coreferenced with "lollipops"; given the original
example phrase, one would naturally create %set_1 as red lollipops and
%set_2 as an indeterminate subset without additional constraints.)
------------------
Intermittent Event: "Sometimes, the good guys win"
;; an indefinite set of all events where good guys can win or lose
%set_1
member-type ...
cardinality ...
complete yes
...
;; an unspecified subset of %set_1
%set_2
subset-of %set_1
indeterminate yes
------------------
Complex Examples
Interlinked Sets: "84 cities in 47 countries"
;; a set of 84 cities of a particular kind
%set_1
cardinality 84
member-type *city_1
;; a generic instance of a city in a constrained location
*city_1
location *set_2
;; a generic set to define an arbitrary country in a given set of countries,
;; where the exact country can vary each time the set is referenced above
*set_2
subset-of %set_3
cardinality 1
indeterminate yes
;; a set of 47 countries
%set_3
cardinality 47
member-type *country_1
;; a generic instance of a country, which is required to encompass the
;; specified set of cities
*country_1
instance-of *country
cities %set_4
;; a set of cities such that each city is located somewhere in a particular
;; set of countries
%set_4
cardinality >= 1
member-type *city_1
The above glosses approximately as: "the set of 84 cities such that each
city is in an undetermined country from among a set of 47 countries, each
country containing at least one of those cities"
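The gloss can be turned into a check on a concrete assignment of cities to countries. The Python sketch below is one possible reading of the interlinked frames, not a definitive semantics; the function name and the toy place names are hypothetical, and the toy case uses 3 cities in 2 countries rather than 84 in 47.

```python
def interlinked_ok(located_in, countries, n_cities, n_countries):
    """Check the glossed constraints on a concrete city -> country mapping."""
    used = set(located_in.values())
    return (
        len(located_in) == n_cities            # the set of cities has the stated size
        and len(set(countries)) == n_countries  # the set of countries has the stated size
        # every city lies in a listed country, and every listed country
        # contains at least one of the cities
        and used == set(countries)
    )


located_in = {"city_1": "country_a", "city_2": "country_a", "city_3": "country_b"}
print(interlinked_ok(located_in, {"country_a", "country_b"}, 3, 2))
```

A mapping that leaves one of the countries without a city fails the check, which corresponds to the CARDINALITY >= 1 constraint on %set_4.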
Summary
Sets and, in particular, indefinite sets, have a wide range of uses in
the Text Meaning Representation. With indefinite sets, we can represent
collections where not all members are known; this in turn allows us to
represent existential statements, ordinals, and superlatives. By creating
subsets, we can further represent concepts such as "some".
By Ralf Brown, ralf@cs.cmu.edu