This tutorial assists in the translation of a text in a natural language into the text meaning representation used by Mikrokosmos.
This tutorial offers a step-by-step analysis of a sample text in
English and explains how the various slots and fillers available in
the Mikrokosmos text meaning representation are used.
This tutorial will teach you how to:
When the text is in English, you are ready to begin.
The TMR shell will look like this:
[include a picture or representation of the TMR shell]
As you can see, the TMR shell provides all possible
slots so that you will only need to provide
fillers.
Those slots which do not apply to the text fragment you are transcribing may be left blank.
Let's look at an example. Consider the following text:
To separate this text into clauses, first
highlight or pull out the verbs or verb phrases of each sentence:
So we have our clauses:
to set up a joint venture in Brazil possibly this summer
to produce ferrosilicon
reported Nikkan Kogyo
Notice that although this text is only one sentence long, there are at
least four clauses in it. Also notice that when you have a verb,
a subject is implicit, even if it is not mentioned explicitly in the
text (an action implies that someone or something is performing
it).
It can be overwhelming to attempt to transcribe an entire sentence, so
it is important at this stage to transcribe only one clause at a
time. We begin with the first clause.
First Clause:
Consequently, agree is the verb we should use as the head of
the current clause.
In addition, the circumstantial roles must be filled in if they are appropriate to the clause. These are roles which relate events to more
circumstantial pieces of information that describe them, such as:
But what about the clause we're working on? To refresh your memory,
here it is again:
Going down the list of available case roles, let's
begin by seeing if agree has an agent.
If only things were that simple! But there is more to the text than
that. Note that in the original clause, "Kawasaki Steel Corp." is
given special mention; it has "reached agreement" with the
other two companies, which are placed in a clause-end prepositional
phrase. So, in fact, "Kawasaki Steel Corp." is the "agent," while the
other two companies fill the case slot accompanier.
However, we immediately run into a problem: each
slot can take only one filler, so how can we put two companies into
one slot? To handle multiple fillers in a slot,
we use set notation:
The accompanier slot has been filled, but now
%set_1 must be defined (as we have defined the agent slot).
To discover if agree requires a theme case role, ask
yourself: what does the action of agreeing affect? Often a
theme will be an entity of some kind, but in this case,
agree affects the next clause, that headed by create:
There are several possible
aspectual slot fillers which may be used.
In the case of agree, phase would be filled by
end because of the phrase reached agreement in the original text. Iteration would be single since the act of agreement happened only once, duration would be momentary, and the telicity slot
would be filled by "true," since the act of agreeing is complete:
Kawasaki Steel Corp. has reached agreement with Mitsubishi Corp. and
CIA Vale Do Rio Doce, Brazil's state-run mining corporation [...]
The only "verb-like" word in this text segment is state-run; one
option might be to decompose the word into the verb run.
However, there is a better way to represent this text than to
decompose the verb-like phrase into a head of its own.
Because state-run and mining are properties,
they more appropriately belong to a property frame. A property
frame is headed by an object and has special slots suited to
describing common object properties.
Because Mikrokosmos deals primarily with texts from the business
domain, it is quite common to come across parenthetical information
such as this description of a corporation. For that reason, special
slots have been developed for this purpose. They are:
To begin our property frame, we must first instantiate the company we are describing by giving it a name. We instantiate it by invoking the ontological
concept company and by marking it with an
arbitrary number.
To discover if there is an ontological concept available which will describe the concept we want to instantiate, we [describe what is done with Mikrokarat to search ontol. db].
We discover there is an ontological concept called company, so
we head the frame with this. We would also use the
activity/service slot and the ownership slot to represent state-run mining corporation.
The
first clause frame is now complete.
Second Clause:
...to set up a joint venture in Brazil possibly this summer...
However, if we look ahead to the next clause:
...to produce ferrosilicon...
we may want to ask ourselves, is possibly this summer meant to
modify set up a joint venture or to produce
ferrosilicon? One may make arguments for either choice--welcome to the
wonderful world of text meaning representation!
We will take the position that it is the setting up of
the joint venture which will take place possibly this summer instead
of the producing of ferrosilicon. This will require using the
attitude slot, which we will do when we finish the
frame for this clause.
However, there is more to the "joint venture issue" than meets the eye.
Corpus analysis was conducted, and because the term has many meanings in
Japanese as well as in English, it was decided that the best
representation of the term would be a predicate head create with a
clearly non-language-specific theme, tie-up. Since setting up a joint
venture can be glossed as creating a tie-up, we can double back and
change the head to create. (This also requires changing the theme of the
first clause to create, which is easily done.)
The second clause thus far becomes:
In this case, "possibly" reflects the belief of the speaker of the
text that the action of %produce_1 "this summer" is only possible: not
a sure thing, but something that could happen. There is a filler for the
attitude slot called potential which can
express "possibly":
The slots to be filled for attitude follow; we have
filled in the type slot with potential; the
value, as indicated in the definition above, is "1":
Third Clause:
possibly this summer to produce ferrosilicon
The third clause frame is now complete.
Fourth Clause:
This is similar to the speech acts one finds in novels ("I told you
not to come," said Pierre.) and even in everyday conversation
(My dad said he's going to kill me if I wreck the car
again.). Of course, they are very often found in newspaper articles,
which is where our sentence comes from.
By now, you should be able to tell what's coming:
Time Relations
The heads once again are:
A little chronological rearranging gives us the following
order of events:
Let's begin with agree:
Table of Contents:
Translating the Text
If the text is in a language other than English, it must be translated
into English before beginning the translation into the text meaning
representation.
Setting up the TMR Shell
[Need information from Ralf on how to set up a TMR shell]
Working Clause by Clause
It is easiest to begin transcribing your text into ILT if you first
separate the sentences of your text into clauses.
A clause, generally speaking, is a subject and a finite verb (a verb
which is marked for tense and agrees with its subject). So there may be
one, or several, clauses in a sentence.
Kawasaki Steel Corp. has reached agreement with Mitsubishi Corp. and CIA
Vale Do Rio Doce, Brazil's state-run mining corporation, to set up a joint
venture in Brazil possibly this summer to produce ferrosilicon,
reported Nikkan Kogyo.
Kawasaki Steel Corp. has reached agreement with Mitsubishi
Corp. and CIA Vale Do Rio Doce, Brazil's state-run mining
corporation, to set up a joint venture in
Brazil possibly this summer to produce ferrosilicon, reported
Nikkan Kogyo.
Kawasaki Steel Corp. has reached agreement with Mitsubishi
Corp. and CIA Vale Do Rio Doce, Brazil's state-run mining
corporation
Selecting the Head
A verb is a common head in a TMR frame. In the
case of our example, the verb phrase has reached agreement
presents itself as a possible head. However, it would be a mistake to
use the phrase "as is" for two reasons:
Filling the Slots for Case Roles
The next step in representing the first clause is to fill the
case roles, which are the arguments a predicate can take.
These include:
Kawasaki Steel Corp. has reached agreement [our head: agree]
with Mitsubishi Corp. and CIA Vale Do Rio Doce, Brazil's state-run
mining corporation
Agent
To find out if agree has an agent, ask "who or what is
agreeing?" The answer to that in this case is Kawasaki Steel Corp.,
Mitsubishi Corp. and CIA Vale Do Rio Doce. So we have three "agents"
of agree.
;; gloss: Kawasaki Steel Corp. has reached agreement [agreed] with
;; Mitsubishi Corp. and CIA Vale Do Rio Doce
%agree_1
agent %company_1
accompanier %set_1
%company_1
name "Kawasaki Steel Corp."
Set Notation
;; gloss: Kawasaki Steel Corp. has reached agreement [agreed] with
;; Mitsubishi Corp. and CIA Vale Do Rio Doce
%agree_1
agent %company_1
accompanier %set_1
member-type *company
cardinality 2
Since we know the members of the set, we can also use the slot
elements, which is used to name a set's members:
;; gloss: Kawasaki Steel Corp. has reached agreement [agreed] with
;; Mitsubishi Corp. and CIA Vale Do Rio Doce
%agree_1
agent %company_1
accompanier %set_1
member-type *company
cardinality 2
elements (set "Mitsubishi Corp"
"Cia Vale Do Rio Doce")
Note that we have nested another set inside set_1; this is
perfectly acceptable with Mikrokosmos' notation.
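As a rough illustration of the set notation above, the following Python sketch models a set filler and prints it in a frame-like layout. It is not part of Mikrokosmos: the class name SetFiller, its fields, and the two-space indentation are assumptions made only for this example. The cardinality is derived from the listed elements so that the two can never disagree.

```python
from dataclasses import dataclass, field


@dataclass
class SetFiller:
    """Hypothetical stand-in for a TMR set such as %set_1."""
    name: str
    member_type: str
    elements: list = field(default_factory=list)

    @property
    def cardinality(self) -> int:
        # Derived from the elements rather than typed in by hand.
        return len(self.elements)

    def render(self) -> str:
        # Lay the set out with the slot names used in this tutorial.
        quoted = " ".join(f'"{e}"' for e in self.elements)
        return "\n".join([
            f"%{self.name}",
            f"  member-type *{self.member_type}",
            f"  cardinality {self.cardinality}",
            f"  elements (set {quoted})",
        ])


set_1 = SetFiller("set_1", "company",
                  ["Mitsubishi Corp", "Cia Vale Do Rio Doce"])
print(set_1.render())
```

Adding or removing a company from the elements list changes the printed cardinality automatically.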
Theme
A very common pairing of case roles finds agent and
theme occurring together.
Kawasaki Steel Corp. has reached agreement [agreed] [...]
to set up a joint venture [...]
So it is the action of setting up which is affected by
agree. This is a very common use of the theme
slot--filling it with an action that is "nested" within another
action:
;; gloss: Kawasaki Steel Corp. has reached agreement [agreed] with
;; Mitsubishi Corp. and CIA Vale Do Rio Doce
%agree_1
agent %company_1
accompanier %set_1
member-type *company
cardinality 2
elements (set "Mitsubishi Corp"
"Cia Vale Do Rio Doce")
theme %set-up_1
There are no other case roles which need to be invoked for the first
clause, but there are circumstantial roles which are part of every
predicate frame.
Finishing the Predicate Frame
Each head has time and aspectual slots because every action or state
happens in the realm of time, in relation to other actions or states.
In addition, since a text presupposes a speaker who bears an attitude
toward what s/he is saying, an attitude often becomes part of what
needs to be represented in a TMR. In the section of text we are looking
at, attitude doesn't seem to be present as a component of the text. In
the frame for agree, then, the ontological concepts of the time and
aspectual slots are instantiated:
;; gloss: Kawasaki Steel Corp. has reached agreement [agreed] with
;; Mitsubishi Corp. and CIA Vale Do Rio Doce
%agree_1
agent %company_1
accompanier %set_1
member-type *company
cardinality 2
elements (set "Mitsubishi Corp"
"Cia Vale Do Rio Doce")
theme %set-up_1
time %time_1
aspect %aspect_1
The task then is to define each of these fillers separately.
Time
The filler for the time slot, %time_1, marks that
agree happened at a particular time, which will be discussed more
elsewhere. For the time being, it is enough to know that it acts as a
marker which allows the time at which agree happened to be
referred to.
Aspect
The aspect slot can be filled by any or all of the slots phase, iteration, duration and telicity. It represents the aspect of the head of the frame--in this case, agree.
;; gloss: Kawasaki Steel Corp. has reached agreement [agreed] with
;; Mitsubishi Corp. and CIA Vale Do Rio Doce
%aspect_1
phase end
iteration single
duration momentary
telicity true
The nearly complete frame then has become:
;; gloss: Kawasaki Steel Corp. has reached agreement [agreed] with
;; Mitsubishi Corp. and CIA Vale Do Rio Doce
%agree_1
agent %company_1
accompanier %set_1
member-type *company
cardinality 2
elements (set "Mitsubishi Corp"
"Cia Vale Do Rio Doce")
theme %set-up_1
time %time_1
aspect %aspect_1
%time_1
absolute-time {recent past}
%aspect_1
phase end
iteration single
duration momentary
telicity yes
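Because every filler such as %company_1, %time_1, or %aspect_1 must eventually be defined in its own right, it can help to check a draft for fillers that are still undefined. The Python sketch below is only an illustration and is not one of the Mikrokosmos tools. The function name undefined_references is hypothetical, and the sketch assumes that each instance is defined on a line of its own that begins with %.

```python
import re


def undefined_references(tmr_text):
    """Return instance names used as fillers but never heading a frame."""
    defined, referenced = set(), set()
    for line in tmr_text.splitlines():
        line = line.split(";;")[0].strip()  # drop ;; comments
        if not line:
            continue
        tokens = line.split()
        if tokens[0].startswith("%"):
            # A line starting with % is taken to define that instance.
            defined.add(tokens[0])
            tokens = tokens[1:]
        # Anything shaped like %name_N elsewhere counts as a reference.
        referenced.update(
            t for t in tokens if re.fullmatch(r"%[\w-]+_\d+", t)
        )
    return sorted(referenced - defined)


sample = """
;; gloss: Kawasaki Steel Corp. has reached agreement [agreed]
%agree_1
  agent %company_1
  theme %set-up_1
  time %time_1
  aspect %aspect_1
%company_1
  name "Kawasaki Steel Corp."
%time_1
  absolute-time {recent past}
%aspect_1
  phase end
"""
print(undefined_references(sample))
```

In this sample only the head of the second clause is reported, since its frame has not been written yet.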
Property Frames
There is one portion of the text in the first clause which has been
left wholly unrepresented:
Examples of company slot fillers will make their use clearer. A
thorough listing of company slot fillers is also available.
;; Brazil's state-run mining corporation
%company_1
name "Cia Vale Do Rio Doce"
activity *mining
ownership $Brazil
The symbols used in this property frame have special meanings and rules for how they should be used.
Dividing the Clauses Correctly
We initially divided the second clause as:
Filling the Slots for Case Roles
Because we needed to put the head of the second clause in as filler
for the theme slot of the previous clause, we already know that the head of the second clause is set up. It can be
hyphenated:
;; to set up a joint venture in Brazil
;; possibly this summer
%set-up_1
It is only necessary to look again at the definitions and examples of
each to determine which
case roles are needed.
Agent
Although the agent doesn't occur in the second clause, it can
be traced back to the first:
Asking, "who or what is setting up something?", we get the
answer--Kawasaki Steel Corp., Mitsubishi Corp. and CIA Vale Do Rio
Doce. So we instantiate another set:
;; gloss: to set up a joint venture in Brazil
;; possibly this summer
%set-up_1
agent %set_2
member-type *company
cardinality 3
elements (set "Kawasaki Steel Corp."
"Mitsubishi Corp"
"Cia Vale Do Rio Doce")
Theme
The entity whose state is affected by the action of setting up
is clearly joint venture.
;; gloss: to set-up [create] a joint venture [tie-up] in Brazil
;; possibly this summer
%create_1
agent %set_2
member-type *company
cardinality 3
elements (set "Kawasaki Steel Corp."
"Mitsubishi Corp"
"Cia Vale Do Rio Doce")
theme %tie-up_1
Finishing the Predicate Frame: Circumstantial Roles
Are there any other roles which should be represented in this clause
of the TMR? Look at the
circumstantial roles
and their definitions to discover if there are any appropriate to represent this clause.
Location
The circumstantial role location probably caught your eye; a
prepositional phrase such as in Brazil is a giveaway that the
location slot is needed:
;; gloss: to set-up [create] a joint venture [tie-up] in Brazil
;; possibly this summer
%create_1
agent %set_2
member-type *company
cardinality 3
elements (set "Kawasaki Steel Corp."
"Mitsubishi Corp"
"Cia Vale Do Rio Doce")
theme %tie-up_1
location $Brazil
Purpose
To find out if there are additional roles, it is often necessary to look
ahead to the clauses which follow. They are:
It is clear from the text that the purpose of creating the
tie-up is in the third clause: to produce ferrosilicon. In
fact, inserting the words "in order to" makes this relation
clear: to set-up [create] a joint venture [tie-up] in Brazil possibly
this summer [in order to] produce ferrosilicon. So the purpose slot should be added, with the head of the next clause as its filler:
;; gloss: to set-up [create] a joint venture [tie-up] in Brazil
;; possibly this summer
%create_1
agent %set_2
member-type *company
cardinality 3
elements (set "Kawasaki Steel Corp."
"Mitsubishi Corp"
"Cia Vale Do Rio Doce")
theme %tie-up_1
purpose %produce_1
location $Brazil
The
aspect,
time and attitude slots remain:
;; gloss: to set-up [create] a joint venture [tie-up] in Brazil
;; possibly this summer
%create_1
agent %set_2
member-type *company
cardinality 3
elements (set "Kawasaki Steel Corp."
"Mitsubishi Corp"
"Cia Vale Do Rio Doce")
theme %tie-up_1
purpose %produce_1
location $Brazil
aspect %aspect_2
time %time_2
Aspect and Time Slots
Since create is an achievement, its phase can be
described as end; it has a single
iteration, is of momentary
duration, and it is a complete action
(telicity):
%aspect_2
phase end
iteration single
duration momentary
telicity yes
The slot time, again, marks that create happened at
a particular time which can now be referred to, and which will be
discussed more elsewhere:
%time_2
absolute-time {future}
Attitude
This clause does need an attitude slot; the word
"possibly" is the trigger that invokes the slot. The
attitude
slot is where emotion and feeling present in the text are transcribed, whether they belong to
the "speaker" of the text or to a participant in the text.
POTENTIAL: The scale of the potential attitude goes
from "X is not possible" (value 0) to "X is possible" (value
1.0).
%attitude_1
type potential
value 1
scope
attribution
The scope slot is filled with the slots or heads
which are encompassed by the attitude. In this case, the
potential attitude scopes over %produce_1. The
attribution slot is filled by the speaker or other
text participant who possesses the attitude. In this case, it is
the three companies who possess the attitude, so we create another
set, which we can cross reference with %set_2 (which has the three
companies as its members). The finished attitude slot
then is:
%attitude_1
type potential
value 1
scope %produce_1
attribution %set_3 ;; cross-referenced with %set_2
The second clause frame is now complete.
Filling the Slots for Case Roles
Because we needed to put the head of the third clause in as filler
for the purpose slot of the previous clause, we already know what
the head of the third clause is:
;; gloss: to produce ferrosilicon
%produce_1
Although it may seem that the agent of produce would be
the same set as in the previous clause, a closer look is in
order. The previous clause, joined to our third clause, yields:
to set up [create] a joint venture [tie-up]
So we see that it is the newly-created joint venture which will
produce ferrosilicon. We must instantiate a new "tie-up," and
coreference it with %tie-up_1:
;; to produce ferrosilicon
%produce_1
agent %tie-up_2 ;; coreferenced with %tie-up_1
Theme
This time, instead of filling the theme slot with another head,
we will fill it with an object--
ferrosilicon:
;; gloss: to produce ferrosilicon
%produce_1
agent %tie-up_2 ;; coreferenced with %tie-up_1
theme %ferrosilicon_1
Finishing the Predicate Frame
As always, representation of the head must include
aspect and time information. This
time, these circumstantial roles are filled with less typical
information.
Aspect
The slots for aspect this time are filled with information
modified by the words surrounding the head, in particular, this
summer. Because "production" will be going on all summer,
aspect will be as follows:
%aspect_3
phase continuing
iteration multiple
duration prolonged
telicity no
Note the "continuing" phase and the "no" filler of
the telicity slot,
indicating that the action is not complete.
Time
Because of the phrase this summer, the time slot is
defined differently. One should discover the date of the text and
extrapolate the dates within which the act of production should
take place. In this case, say the text was dated January 14, 1986. The
time entered into the time slot should be:
%time_3
absolute-time {010686-010986}
That is, the first day of June, 1986, through the first day of
September, 1986.
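The extrapolation from the date of the text to the range for "this summer" can be sketched in Python as follows. This is an illustration only. The function names are hypothetical, the June 1 to September 1 window is taken from the example above, and the rule for texts dated after the summer has ended (roll over to the next year) is an assumption, not something the tutorial states.

```python
from datetime import date


def summer_window(text_date):
    """Return (June 1, September 1) for the summer "this summer" refers to.

    Assumption: if the text is dated on or after September 1, the
    phrase is taken to mean the following year's summer.
    """
    year = text_date.year
    if text_date >= date(year, 9, 1):
        year += 1
    return date(year, 6, 1), date(year, 9, 1)


def absolute_time(start, end):
    """Format a date range as day-month-year digits, as in %time_3."""
    return "{" + start.strftime("%d%m%y") + "-" + end.strftime("%d%m%y") + "}"


start, end = summer_window(date(1986, 1, 14))
print(absolute_time(start, end))  # {010686-010986}
```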
"Reporting" Verbs
The final clause is a
Kawasaki Steel Corp. has reached agreement with Mitsubishi Corp. and CIA
Vale Do Rio Doce, Brazil's state-run mining corporation, to set up a joint
venture in Brazil possibly this summer to produce ferrosilicon,
reported Nikkan Kogyo.
;; gloss: reported Nikkan Kogyo
%report_1
agent %company_2
In turn, %company_2 is defined as:
%company_2
name "Nikkan Kogyo"
The theme, in this case, is the first predicate of the
sentence; imagine that the fourth clause was at the beginning of the
sentence--Nikkan Kogyo reported that Kawasaki Steel Corp. has
reached agreement with... So we insert into the theme slot the first head of the sentence, agree:
;; gloss: reported Nikkan Kogyo
%report_1
agent %company_2
theme %agree_1
Next, we fill our time and aspect
slots (the attitude slot is not needed):
time %time_4
aspect %aspect_4
and define them:
%time_4
absolute-time {recent past}
%aspect_4
phase end
iteration single
duration momentary
telicity yes
The
fourth clause frame is now complete.
Time relations
are given at the end of the TMR; there is no reference to time relations within the TMR body. A time relation indicates if an event occurred at, after, or during another event.
Determining time relations between all of the events of a TMR can
be a difficult job; the typical TMR may have several complex sentences
and may need as many as 15 or 20 time relations (or more). In the
current case, we have only
five heads which require a time relation slot.
Using the Tools
[Insert text on how to do temp rels with the tools]
Arranging Events
The easiest way to decide how to relate the heads of each frame to
each other is to arrange them around the time of the publication of
the article, which is traditionally at
%time_0.
A little figuring gives us:
Notation for Time Relations
Now we will see why it was important to mark each of the clause frames
with a %time_n.
[Explanation here of how tools make this oh so easy...]
The following example shows
notation for time relations:
%temp-rel_n
type after
arg-1 %time_1
arg-2 %time_2
where type could be at, during, or
after.
;; gloss: "report" takes place after "agree."
%temp-rel_1
type after
arg-1 %time_4
arg-2 %time_1
Moving down the list, we have:
;; gloss: date of publication takes place after "report."
%temp-rel_2
type after
arg-1 %time_0
arg-2 %time_4
;; gloss: "set-up" [create] takes place after date of publication.
%temp-rel_3
type after
arg-1 %time_2
arg-2 %time_0
;; gloss: "produce" takes place after "set-up" [create].
%temp-rel_4
type after
arg-1 %time_3
arg-2 %time_2
As you can see, once you have arranged the events in "regular
English," plugging in the appropriate values and arg fillers is
much easier.
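Once the events are in chronological order, writing out the relations is mechanical. The Python sketch below is a hypothetical helper, not one of the Mikrokosmos tools. It assumes that each neighbouring pair of events is linked by a single relation of type after and that the relations are numbered in sequence.

```python
def temp_relations(ordered_times):
    """Build one "after" relation per neighbouring pair of time markers.

    ordered_times lists the markers from earliest to latest.
    """
    frames = []
    pairs = zip(ordered_times, ordered_times[1:])
    for n, (earlier, later) in enumerate(pairs, start=1):
        frames.append(
            f"%temp-rel_{n}\n"
            f"  type after\n"
            f"  arg-1 {later}\n"
            f"  arg-2 {earlier}"
        )
    return frames


# agree, report, date of publication, create, produce
order = ["%time_1", "%time_4", "%time_0", "%time_2", "%time_3"]
for frame in temp_relations(order):
    print(frame)
```

With the five time markers of the sample sentence this yields four relations, matching the argument pairs worked out by hand above.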
Using the Tools
[Since I expect that this will be a tool-intensive job because of its
mechanical nature, this section will be written later.]
Author: Donalee H. Attardo (dattard@crl.nmsu.edu)