MIKROKOSMOS



TMR Notes


Last-modified: 13 January 1995

(1) Speech Acts

Although there are various types of speech acts (questions, requests, statements, etc.), the statement speech act is the type seen most often in the joint venture corpus. We use "statement_1" as the default speech act covering the whole article. This represents the author's narration of the facts. Additional speech acts, such as quotations, may also be present in an article. Note, however, that we have consciously chosen not to characterize the usual "announce" events in newspaper articles as speech acts.


(2) Proposition

The proposition is the fundamental organizing element in the body of the TMR. The information contained in the proposition is given in a slot-filler format. The slots HEAD, TIME, ASPECT, and POLARITY are required, but those of ATTITUDE, MODALITY, and RELATION are added as necessary. The filler of a slot is an ontological concept name suffixed by an instance number, so that in any given text each occurrence of a concept has a unique number. The proposition-level representation for the clause "Ajinomoto decided to underwrite..." would be:

%proposition_1
	head		%decide_1
	time		%time_1
	aspect   	%aspect_1
	polarity	&positive

%decide_1
	agent		%company_1	
	theme		%underwrite_1

%company_1
	name		$Ajinomoto
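
The instance-numbering convention and the required-slot rule can be sketched in Python. This is only an illustration under stated assumptions: the names InstanceCounter, build_example, and missing_slots, and the use of plain dictionaries for frames, are hypothetical and are not part of any Mikrokosmos tool.

```python
from collections import defaultdict

REQUIRED_SLOTS = ("head", "time", "aspect", "polarity")


class InstanceCounter:
    """Hands out one-up instance numbers per concept (hypothetical helper)."""

    def __init__(self):
        self._next = defaultdict(int)

    def new(self, concept):
        # Each call for the same concept yields the next number:
        # %company_1, %company_2, ...
        self._next[concept] += 1
        return f"%{concept}_{self._next[concept]}"


def build_example():
    """Rebuild the frames for "Ajinomoto decided to underwrite..." as dicts."""
    ids = InstanceCounter()
    prop = ids.new("proposition")
    decide = ids.new("decide")
    company = ids.new("company")
    underwrite = ids.new("underwrite")
    return {
        prop: {"head": decide, "time": ids.new("time"),
               "aspect": ids.new("aspect"), "polarity": "&positive"},
        decide: {"agent": company, "theme": underwrite},
        company: {"name": "$Ajinomoto"},
    }


def missing_slots(proposition):
    """Return the required proposition slots that are absent."""
    return [slot for slot in REQUIRED_SLOTS if slot not in proposition]
```

Calling missing_slots on a proposition frame gives an empty list when HEAD, TIME, ASPECT, and POLARITY are all present.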
		

(3) Head

The head represents the central notion of the proposition, and can be any ontological concept. Its information is conveyed in slot-filler format. When the head is an event, it has case roles (slots) such as AGENT, THEME, COTHEME, ACCOMPANIER, BENEFICIARY, PURPOSE, MANNER, LOCATION, FOCUS, etc., as needed to convey the meaning of the original text. When the head is an object, it has property slots appropriate for that concept. See the TMR Outline for a list of pertinent case roles; additional roles can be defined for particular types of events (e.g. "produce" might have a RATE slot for rate of production; see Rate, below). See also the notes below on fillers of slots on an event: Company, Country, and Sets.

When the head is an object, a state is implied. Thus, the sentence "There is a big, black, wooden house." can be represented as:

%proposition_1	
   	  head		%house_1
   	  time		%time_1
   	  aspect	%aspect_1
   	  polarity	positive

%house_1
    	 size	.8
    	 color	&black
    	 material	%%wood


(4) Stylistic Factors

The reporting style of the joint ventures texts typically is neutral, so stylistic factors such as formality, respect, etc. are set at a default of 0.5 (neither formal nor informal, neither high nor low level of respect, etc.). Values can range from 0 to 1.


Symbols

The following general symbols are used in TMRs (see also Time and Quantifier Relations):

%      instantiated ontological concept (%company)
$      named instance ($"Ajinomoto Dannon", $Japan)
&      symbolic constant (e.g. &red &blue)
%%     concept in the ontology (%%sales)
*X*    special variable (e.g. *author* *unknown*)
;;     comment (text following the double semicolon is not treated as part of the TMR)
~      "approximately", used with numbers (~200 machines)

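The symbol prefixes lend themselves to a simple mechanical check. The following Python sketch classifies a single token by its prefix; the function name classify, the category labels, and the exact character set allowed in names are assumptions made for illustration, not a definition of TMR syntax.

```python
import re

# Order matters: "%%" has to be tried before "%".
_PATTERNS = [
    ("comment", re.compile(r"^;;")),
    ("ontology concept", re.compile(r"^%%[\w-]+$")),
    ("instance", re.compile(r"^%[\w-]+$")),
    ("named instance", re.compile(r'^\$(".+"|[\w-]+)$')),
    ("symbolic constant", re.compile(r"^&[\w-]+$")),
    ("special variable", re.compile(r"^\*[\w-]+\*$")),
    ("approximate number", re.compile(r"^~\d+(\.\d+)?$")),
]


def classify(token):
    """Return a label for one TMR token, or "other" if no prefix applies."""
    for label, pattern in _PATTERNS:
        if pattern.search(token):
            return label
    return "other"
```

A token such as positive, which carries no prefix, falls through to "other".
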

(6) Attitudes

Attitudes are used to reflect the way elements in the text are conveyed by an intelligent agent (typically the speaker/writer of the text). Definitions and examples of the attitudes are given in "Attitude - Definitions and Examples". Attitudes that are not heads are broken out at the end of the TMR, with a pointer included in the TMR body.

Attitudes have the following required slots: TYPE, ATTRIBUTED-TO, SCOPE, TIME, and VALUE. The TYPE slot is filled with either EVALUATIVE or SALIENCE. ATTRIBUTED-TO is filled with the agent or entity who possesses the attitude. SCOPE identifies the segments of the TMR (and corresponding text) covered by the attitude, and TIME is the time at which the attitude holds. VALUE is assigned on a scale of 0 to 1.0, with 1.0 being the positive end of the scale. For example, for the evaluative attitude "It is the best", the value would be 1.0, whereas "It is the worst" would have a value of 0. Between these values would fall "It is all right" (value somewhere between .5 and .8) and "It is not good" (value less than .3). For the present, the intermediate values are set somewhat arbitrarily; ranges, including "less than" and "greater than", allow approximate values to be assigned.

"GM has a high regard for Toyota business practices." would be represented as follows:

%attitude_1
	type		evaluative
	attributed-to 	%company_1	;GM
	scope		%practice_1	;Toyota business practices
	time		%time_1
	value		>0.8
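
The slot requirements for an attitude can be checked mechanically. The Python sketch below is a hedged illustration only: the names parse_value and check_attitude, the dictionary layout, and the set of comparison prefixes accepted on VALUE are assumptions, not part of the TMR definition.

```python
import re

REQUIRED = ("type", "attributed-to", "scope", "time", "value")
ATTITUDE_TYPES = {"evaluative", "salience"}
# Optional comparison prefix followed by a number, e.g. ">0.8" or "1.0".
_VALUE = re.compile(r"^\s*(>=|=<|>|<)?\s*(\d*\.?\d+)\s*$")


def parse_value(text):
    """Split a value such as ">0.8" into (operator, number); operator may be ""."""
    match = _VALUE.match(str(text))
    if not match:
        raise ValueError(f"unreadable value: {text!r}")
    number = float(match.group(2))
    if not 0.0 <= number <= 1.0:
        raise ValueError(f"value outside 0..1: {number}")
    return match.group(1) or "", number


def check_attitude(frame):
    """Return a list of problems found in an attitude frame (empty if none)."""
    problems = [f"missing slot: {slot}" for slot in REQUIRED if slot not in frame]
    if "type" in frame and str(frame["type"]).lower() not in ATTITUDE_TYPES:
        problems.append(f"unknown type: {frame['type']}")
    if "value" in frame:
        try:
            parse_value(frame["value"])
        except ValueError as err:
            problems.append(str(err))
    return problems
```

For the GM example above, check_attitude returns an empty list, and parse_value(">0.8") separates the "greater than" prefix from the number.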


(7) Modalities

Modalities are represented in the same way as attitudes. They have the same slots, with the TYPE slot containing one of the following values: EPISTEMIC, DEONTIC, VOLITIVE, or POTENTIAL. With the exception of the modality on the %statement_1 (which is broken out within the statement section), modalities, like attitudes, are broken out at the end of the TMR, with a pointer in the TMR body (see examples in JJV0002 %statement_1, %modality_1; %modality_5; %modality_6).

"A company announced..." would be represented as:

%modality_1
	type			epistemic
	attributed-to 		%company_1
	scope			%announce_1
	time			%time_1
	value			1.0

Attitudes and modalities can be combined to capture a particular meaning in a text. For example, in "There is also concern that ... licensing and know-how disputes will occur", an epistemic modality reflects the belief that the situation may occur, while an evaluative attitude captures the less than positive feeling about the event taking place.

%proposition_4
	head		%occur_1
	time 		%time_4
	aspect		%aspect_4
	polarity	positive
	modality	%modality_4
 	attitude	%attitude_1

%modality_4
     	type            *epistemic
	attributed-to   *author*
     	scope           %occur_1
     	time            %time_16
     	value           >0.5

%attitude_1
	type            *evaluative
	attributed-to   *author*
	scope           %occur_1
	time            %time_16
	value           <0.4


(8) Temporal Relations

Temporal relations are given at the end of the TMR, following the attitudes section; there is no reference to temporal relations within the TMR body. Temporal relations indicate the relative timing of one event in the text in relation to another.

Temporal relations must have the following slots: TYPE, ARG_1, and ARG_2, where fillers for ARG_1 and ARG_2 are times (e.g. %time_1), and TYPE indicates the relation between the two times, filled by one of the values AT, AFTER, or DURING. For example, "the event whose time is %time_2 occurred after the event whose time is %time_3" would be represented as:


  %temp-rel_1
        type  after
           arg_1  %time_2
           arg_2  %time_3

A temporal relation also may have a VALUE slot, to indicate the relative distance between two times. If %time_2 occurred just after %time_1, the temporal relation would look like this:


  %temp-rel_2
        type  after
           arg_1  %time_2
           arg_2  %time_1
           value  > 0.2
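
One use of AFTER relations is to recover a relative ordering of the times in a text. The following Python sketch does this with a topological sort from the standard library. It is an illustration only: the function name order_times and the dictionary layout are assumptions, the second relation in temp_rels is invented for the example, and relations of type AT or DURING and the VALUE slot are ignored.

```python
from graphlib import TopologicalSorter


def order_times(relations):
    """Return the times earliest-first, using only relations of type "after"."""
    sorter = TopologicalSorter()
    for rel in relations:
        if rel["type"] == "after":
            # arg_1 occurred after arg_2, so arg_2 has to come first.
            sorter.add(rel["arg_1"], rel["arg_2"])
    # static_order raises graphlib.CycleError if the relations contradict
    # each other (e.g. A after B and B after A).
    return list(sorter.static_order())


temp_rels = [
    {"type": "after", "arg_1": "%time_2", "arg_2": "%time_3"},
    # Hypothetical second relation, added only to make a three-step chain.
    {"type": "after", "arg_1": "%time_1", "arg_2": "%time_2"},
]
```

With these two relations, order_times(temp_rels) places %time_3 first and %time_1 last.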


(9) Focus

At present focus is used in three ways:

  1. As a point of special attention. This is shown in the focus under %statement_1, where the focus is the main event in the article (see example in JJV0002: statement_1, %focus_1).

  2. To provide emphasis in the TMR to reflect emphasis in the original text. For example, when the text says, "the two companies", we put a focus on the "two" (see example in JJV0403: %strengthen_1, %focus_3).

  3. To reflect a passive construction, focus is placed on the theme in a head (see example in JJV0403: %designate_1, %focus_2).


(10) Coreferences

All instances of an object or an event are given a one-up number (%company_1, %company_2, %company_3, etc.), and those instances that refer to the same object or event are coreferenced at the end of the TMR (e.g. if %company_1 and %company_3 both refer to Tokyo Bank, they would be coreferenced at the end of the TMR). Coreferencing is facilitated if margin notes are inserted wherever a coreference occurs. (see example in JJV0375: %coreference_3).


(11) Domain Relations

Domain relations represent connections between events, states or objects in the text. These connections can be quite general, scoping over large portions of text, or more specific, and limited in scope (e.g. linking consecutive heads). Definitions and examples of domain relations are given in Appendix IV. Domain relations are listed at the end of the TMR, following the temporal relations.

Domain relations have the following slots: TYPE, ARG_1, and ARG_2, where TYPE is filled with the appropriate domain relation from the "Definitions of Domain Relations" paper (Appendix IV), and ARG_1 and ARG_2 are filled with the TMR elements between which the relation exists. For example, "Companies A and B are going to create a tie-up (%create_1). In addition, companies C and D are going to create a tie-up (%create_2)." would be represented as:

	%domain-rel_1
	     type	addition
	     	arg_1	%create_1
	     	arg_2	%create_2

A domain relation may also have a value slot. For example, a value slot on a comparison domain relation would indicate degree of similarity. "The (video-recorded) image is similar to the photograph." would be represented as:

	%domain-rel_1
	     type	comparison
	     	arg_1	%image_1
	     	arg_2	%photograph_1
	     	value	0.7


(12) Textual Relations

No textual relations were found in the first 10 JJV TMRs, so we have not refined a method of representing these.


(13) Time

The time slot under each head is given a one-up number (e.g. %time_0, %time_1, etc.). If more information is known, such as the actual date, the TIME slot can be further broken out (see JJV0029: statement_1, %time_0). It is also possible to represent a period of time (see JJV0108: %issue_1, %time_3). The possible slots for TIME are as follows: AT, START, END, DURATION, and UNIT. Possible fillers for these slots would be:

YYMMDD      at/on YYMMDD
>=YYMMDD    on or after YYMMDD
=<YYMMDD    on or before YYMMDD
>YYMMDD     after YYMMDD
<YYMMDD     before YYMMDD
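
A filler of this kind can be read as a constraint on a date. The Python sketch below tests whether a given date satisfies a filler. It is a minimal illustration under assumptions: the function name satisfies is hypothetical, the "=<" prefix for "on or before" is assumed by analogy with the quantifier relation operators, and two-digit years are resolved by Python's own %y rule.

```python
import operator
from datetime import date, datetime

# Prefix -> comparison. ">=" is listed before ">" so that it is matched first;
# the empty prefix (a bare YYMMDD) means "at/on" and always matches last.
_PREFIXES = [
    (">=", operator.ge),
    ("=<", operator.le),  # assumed spelling of "on or before"
    (">", operator.gt),
    ("<", operator.lt),
    ("", operator.eq),
]


def satisfies(filler: str, day: date) -> bool:
    """Return True if `day` meets the constraint in a filler such as ">=950113"."""
    text = filler.strip()
    prefix, compare = next((p, c) for p, c in _PREFIXES if text.startswith(p))
    # %y maps two-digit years 69-99 to 1969-1999 and 00-68 to 2000-2068.
    bound = datetime.strptime(text[len(prefix):], "%y%m%d").date()
    return compare(day, bound)
```

For instance, satisfies(">=950113", date(1995, 2, 1)) asks whether 1 February 1995 falls on or after 13 January 1995.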


(14) Aspect

While verb tense can give clues to ASPECT, tense will be handled largely by the TIME slot. Aspect concerns only the nature of the verb. Aspect can have any or all of the following slots: PHASE, ITERATION, DURATION, TELICITY. Leave a slot blank if you are unable to determine its proper filler. These are the slots with their possible fillers:

phase        begin, continue, end
iteration    single, multiple
duration     momentary, prolonged
telicity     true, false

An achievement (reach, sign, announce) would be represented as:

  phase      end
  iteration  single
  duration   momentary
  telicity	 true

An accomplishment (build, design) would be:

  phase      end
  iteration
  duration   prolonged
  telicity   true

Aspect for a process (manufacture, produce) would be represented as:

  phase      continue
  iteration  multiple
  duration   prolonged
  telicity	 false


(15) Polarity

Polarity is 'positive' for a clause stated in the affirmative, and 'negative' for a clause stated in the negative.


Case Roles and Circumstantial Roles

Case roles are the arguments or typical roles that a predicate can take; circumstantial roles relate events to more circumstantial pieces of information that describe them, such as location, time, etc. Both case roles and circumstantial roles will be defined as properties (i.e. as relations) in the ontology, and will appear as properties of events in the TMR. Typically these roles appear as slots under a TMR head. (see Case Roles and Circumstantial Roles for Predicates).

(16) Agent - In Japanese, the agent frequently is not repeated across related sentences; the reader generally keeps track of the agent. In TMRs, the agent (or other filler) is migrated within a sentence, but not across sentences. For example, if there are multiple heads in a sentence, and the agent is the same for more than one head, the agent is instantiated for all the relevant heads, even if it is not specifically stated. If the agent (or other filler) is not restated in a new sentence, it is represented as *unknown*.


(17) Cotheme

It was decided that theme/co-theme was a reasonable way of representing verbs such as "designate", "name", "elect", "sell", etc.

For example, "Company 1 will sell structures to the Japanese market as sports and event facilities" would be represented as follows:


%sell_1
    agent	%company_1
    theme	%structure_1
    co-theme	%facility_1
    recipient	%market_1
    time	%time_8
    aspect	%aspect_8
    polarity	positive

Agent

In Japanese and Spanish, the agent frequently is not repeated across related sentences; the reader generally keeps track of the agent. In TMRs, the agent (or other filler) is migrated within a sentence, and across sentences, where appropriate. For example, if there are multiple heads in a sentence, and the agent is the same for more than one head, the agent is instantiated for all the relevant heads, even if it is not specifically stated. If the same agent (or other filler) is left unstated in a subsequent sentence, it is represented by the next instance of the same agent/filler, and is followed by a comment ";inferred". This instance would then be coreferenced with the previous instance(s) of the same agent. If the agent is ambiguous, the AGENT slot should be filled with "*unknown*".


Other Notes

(18) Rate - A rate is represented by UNIT, INTERVAL, and QUANTITY (see also (23) Amount). For example, a rate of 100,000 tons per year would be shown as follows:

%rate_1
	unit		*ton
	interval	*year
	quantity	100,000


(19) Company - 'Company' is often a filler for case roles of an event in the joint venture domain (e.g. slots such as AGENT, ACCOMPANIER, BENEFICIARY, EXPERIENCER, SOURCE, and DESTINATION).

A company can have both a NAME and an ALIAS slot. If a company name is mentioned, and then a shortened version is given within the same sentence, fill in both the NAME and ALIAS slots; if a shortened name appears in a different sentence, instantiate a new company, fill in the shortened form in the NAME slot, and coreference at the end of the TMR.


(20) Country - 'Country' is a valid filler for the slots listed under (19) Company, above, as well as LOCATION.


(21) Sets - Sets are used to represent a broad range of phenomena such as: definite and indefinite sets, partially enumerated sets, relations between sets and subsets, ordinals, superlatives, and existentials.


(22) Quantifier Relations - Quantifier relations are relations between numerical quantities. The operators for quantifier relations are as follows:

=                  equality
<                  less than
=<                 less than or equal to
>                  greater than
>=                 greater than or equal to
integer-integer    range
mult               multiply
sum                sum

For example:

a. "Mitsubishi made the same number of TVs [%amount_1] this year as last year [%amount_2]"

%quant-rel_1
       type    =
         arg_1    %amount_1
         arg_2    %amount_2

b. "35 percent of 500-600 million yen [%amount_1]"

%quant-rel_2
       type   mult
          arg_1  0.35
          arg_2  %amount_1
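
The arithmetic behind example (b) can be worked through in Python. This is a sketch under assumptions: the helper names to_range, mult, and holds are hypothetical, quantities are taken to be non-negative, and the amount is expressed in millions of yen. Exact fractions are used so that 35 percent of a round number stays exact.

```python
import operator
from fractions import Fraction

_COMPARATORS = {"=": operator.eq, "<": operator.lt, "=<": operator.le,
                ">": operator.gt, ">=": operator.ge}


def to_range(quantity):
    """Normalise a number or an "integer-integer" range string to (low, high)."""
    if isinstance(quantity, str) and "-" in quantity:
        low, high = quantity.split("-", 1)
        return Fraction(low), Fraction(high)
    value = Fraction(str(quantity))
    return value, value


def mult(arg_1, arg_2):
    """Apply the mult operator; a range in either argument yields a range."""
    (a_lo, a_hi), (b_lo, b_hi) = to_range(arg_1), to_range(arg_2)
    products = [a_lo * b_lo, a_lo * b_hi, a_hi * b_lo, a_hi * b_hi]
    return min(products), max(products)


def holds(rel_type, arg_1, arg_2):
    """Evaluate a comparison operator (=, <, =<, >, >=) on two plain numbers."""
    return _COMPARATORS[rel_type](Fraction(str(arg_1)), Fraction(str(arg_2)))
```

Under these assumptions, mult("0.35", "500-600") gives the range 175 to 210, i.e. 35 percent of 500-600 million yen is 175-210 million yen.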


(23) Amount - Amounts are expressed in terms of UNIT and QUANTITY. For example, 14 billion Japanese yen would be represented as:

%money_1
   quantity     14 billion
   unit         JPY          ;Japanese yen


Footnotes

(1)
The Japanese osore means "concern, danger, fear" and has a stronger negative connotation than the English translation, hence the negative value on the evaluative attitude.