Introduction to Text Meaning Representations
MIKROKOSMOS
A text meaning representation (TMR) is a language-neutral description (an interlingua) of the meaning conveyed in a natural language text, derived by syntactic, semantic, and pragmatic analysis of the text. Because the TMR is intended to be language-neutral, it is also deliberately syntax-neutral and avoids terminology such as clause, proposition, and tense, which is more closely associated with the syntactic structure of a particular language. In addition to providing information about the lexical-semantic dependencies in the text, the TMR represents stylistic factors, discourse relations, speaker attitudes, and other pragmatic factors present in the discourse structure. In doing so, the TMR captures not only the meaning of individual elements in the text but also the relations between those elements, covering both propositional and nonpropositional components of textual meaning.
TMR Structure
The TMR is divided into seven sections, which combine to convey the overall meaning of the original text. The Table of Contents provides a summary of the heads (roughly, the predications), speech acts, attitudes, relations, focus, and stylistic factors found in the text. It is followed by a Speech Act section, which gives the type and scope of each speech act, the speaker/writer, the hearer/reader, the time of speaking/writing, and so on.
The results of analyzing the individual sentences in a natural language text are represented in the TMR body. A clause in the natural language is typically represented by an EVENT or PROPERTY concept from the ontology; this concept is referred to as an interlingual head in the TMR and carries a number of modifying roles (such as case and circumstantial roles) that further define it.
The TMR also contains sections in which information on Attitudes, Domain Relations, and Temporal Relations is conveyed. In the final section of the TMR, the Coreference section, separate occurrences of the same object or event are matched up.
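The seven-section layout described above can be pictured as a simple container. The following Python sketch is only an illustration under stated assumptions: the class names, field names, role names (agent, theme), and concept names (ACQUIRE, CORPORATION, FACTORY) are hypothetical and are not taken from the actual TMR language or the Mikrokosmos ontology.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict, List, Set


@dataclass
class Head:
    """An interlingual head: an instantiated EVENT or PROPERTY concept."""
    name: str                    # instance label, e.g. "ACQUIRE-1" (hypothetical)
    concept: str                 # ontology concept the head instantiates
    roles: Dict[str, Any] = field(default_factory=dict)  # case/circumstantial roles


@dataclass
class TMR:
    """Container with one field per section named in the text (assumed layout)."""
    table_of_contents: Dict[str, List[str]] = field(default_factory=dict)
    speech_acts: List[Dict[str, Any]] = field(default_factory=list)
    body: List[Head] = field(default_factory=list)
    attitudes: List[Dict[str, Any]] = field(default_factory=list)
    domain_relations: List[Dict[str, Any]] = field(default_factory=list)
    temporal_relations: List[Dict[str, Any]] = field(default_factory=list)
    coreference: List[Set[str]] = field(default_factory=list)

    def section_names(self) -> List[str]:
        """Return the section names in declaration order."""
        return [f.name for f in fields(self)]


# A toy TMR for a sentence such as "The company acquired the factory."
tmr = TMR()
tmr.body.append(
    Head(name="ACQUIRE-1", concept="ACQUIRE",
         roles={"agent": "CORPORATION-1", "theme": "FACTORY-1"})
)
tmr.table_of_contents["heads"] = [head.name for head in tmr.body]
# Two mentions of the same object would be grouped in the coreference section.
tmr.coreference.append({"CORPORATION-1", "CORPORATION-2"})

print(tmr.section_names())
```

Keeping each section as a separate field mirrors the division described in the text; how the real TMR language serializes these sections is not specified here.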
TMR Notation
The results of analyzing an input text are represented in a formal, frame-based language. The meanings of most open-class lexical units are represented by instantiating, combining, and constraining concepts available in the ontology. However, the intent of a text cannot be fully captured by instantiating ontological concepts alone; information about pragmatic and discourse-related phenomena must be represented, and relations between components of meaning must also be expressed. To facilitate this, the TMR language contains special notation for representing attitudes, relations, speech acts, time, quantities, rates, and sets.
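To make "instantiating, combining, and constraining" concrete, here is a minimal Python sketch of a frame-based representation. Everything in it is an assumption made for illustration: the toy ontology, the concept names, the role names, the numbered instance labels, and the constraint table are hypothetical and do not reproduce the actual TMR notation. The special notation for attitudes, speech acts, time, quantities, rates, and sets is not modeled, since its form is not given in the text.

```python
from itertools import count
from typing import Dict, Optional

# Toy ontology mapping each concept to its parent (illustrative names only).
ONTOLOGY: Dict[str, Optional[str]] = {
    "EVENT": None,
    "OBJECT": None,
    "BUY": "EVENT",
    "HUMAN": "OBJECT",
    "ARTIFACT": "OBJECT",
    "CAR": "ARTIFACT",
}

# Hypothetical constraints: concept -> role -> concept the filler must descend from.
CONSTRAINTS: Dict[str, Dict[str, str]] = {
    "BUY": {"agent": "HUMAN", "theme": "ARTIFACT"},
}

_counters: Dict[str, count] = {}


def is_a(concept: Optional[str], ancestor: str) -> bool:
    """Walk up the parent links to test whether concept descends from ancestor."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = ONTOLOGY[concept]
    return False


def instantiate(concept: str, **roles: dict) -> dict:
    """Create a numbered instance frame, checking role fillers against constraints."""
    if concept not in ONTOLOGY:
        raise KeyError(f"unknown concept: {concept}")
    for role, filler in roles.items():
        required = CONSTRAINTS.get(concept, {}).get(role)
        if required is not None and not is_a(filler["instance-of"], required):
            raise ValueError(f"{role} of {concept} must be a {required}")
    number = next(_counters.setdefault(concept, count(1)))
    frame = {"name": f"{concept}-{number}", "instance-of": concept}
    # Combining: role slots point at the names of other instance frames.
    frame.update({role: filler["name"] for role, filler in roles.items()})
    return frame


buyer = instantiate("HUMAN")
car = instantiate("CAR")
buy = instantiate("BUY", agent=buyer, theme=car)
print(buy)
```

In this sketch, a filler that violates a constraint (for example, a CAR as the agent of BUY) raises a `ValueError` before any frame is created.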