Introduction to Text Meaning Representations

MIKROKOSMOS



Introduction to Text Meaning Representations (TMRs)

A text meaning representation (TMR) is a language-neutral description (an interlingua) of the meaning conveyed in a natural language text. It is derived by syntactic, semantic, and pragmatic analysis of the text. Because the TMR is intended to be language neutral, it is also deliberately syntax neutral: it avoids terms such as clause, proposition, and tense, which are more closely associated with the syntactic structure of a particular language. In addition to the lexical-semantic dependencies in the text, the TMR represents stylistic factors, discourse relations, speaker attitudes, and other pragmatic factors present in the discourse structure. The TMR thus captures not only the meaning of individual elements in the text but also the relations between those elements, covering both propositional and nonpropositional components of textual meaning.

TMR Structure

The TMR is divided into seven sections, which combine to convey the overall meaning of the original text. The Table of Contents summarizes the heads (roughly, the predications), speech acts, attitudes, relations, focus, and stylistic factors found in the text. It is followed by a Speech Act section, which gives the type and scope of each speech act, the speaker/writer, the hearer/reader, the time of speaking/writing, etc.

The results of analyzing the individual sentences in a natural language text are represented in the TMR body. A clause in the source text is typically represented by an EVENT or PROPERTY concept from the ontology; this concept is referred to as an interlingual head in the TMR, and it carries a number of modifying roles (such as case and circumstantial roles) that further define it.

The TMR also contains sections that convey information on Attitudes, Domain Relations, and Temporal Relations. The final section of the TMR, the Coreference section, matches up separate occurrences of the same object or event.
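
The seven-section layout described above can be pictured as a simple container type. The following is only an illustrative sketch in Python, not the actual TMR notation: the class names, field names, the example concept BUY, and its role names are all assumptions made for this example.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict, List, Set


@dataclass
class Head:
    """An interlingual head: an EVENT or PROPERTY concept plus its roles."""
    concept: str                                          # ontology concept name (hypothetical)
    roles: Dict[str, Any] = field(default_factory=dict)   # case and circumstantial roles


@dataclass
class TMR:
    """One field per TMR section, in the order the text describes them."""
    table_of_contents: Dict[str, List[str]] = field(default_factory=dict)
    speech_acts: List[Dict[str, Any]] = field(default_factory=list)
    body: List[Head] = field(default_factory=list)
    attitudes: List[Dict[str, Any]] = field(default_factory=list)
    domain_relations: List[Dict[str, Any]] = field(default_factory=list)
    temporal_relations: List[Dict[str, Any]] = field(default_factory=list)
    coreference: List[Set[str]] = field(default_factory=list)


# Hypothetical content for a sentence such as "The company bought a factory."
tmr = TMR()
tmr.body.append(Head("BUY", {"agent": "COMPANY-1", "theme": "FACTORY-1"}))
tmr.table_of_contents["heads"] = ["BUY"]
tmr.coreference.append({"COMPANY-1"})  # mentions that refer to the same object

print([f.name for f in fields(TMR)])
```

Keeping each section as its own field mirrors the separation in the text: the body holds the heads, while attitudes, relations, and coreference are recorded alongside it rather than inside the head frames.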

TMR Notation

The results of analyzing an input text are represented in a formal, frame-based language. The meanings of most open-class lexical units are represented by instantiating, combining, and constraining concepts available in the ontology. However, the intent of a text cannot be fully captured by instantiating ontological concepts alone; information about pragmatic and discourse-related phenomena must be represented, and relations between components of meaning must also be expressed. To facilitate this, the TMR language contains special notation for representing attitudes, relations, speech acts, time, quantities, rates, and sets.
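
To make "instantiating, combining and constraining concepts" more concrete, here is a minimal Python sketch under stated assumptions. The toy ontology, the concept and role names, and the "CONCEPT-n" instance naming are all hypothetical and are not taken from the actual ontology or TMR language; the special notation for attitudes, speech acts, time, and so on is not modeled.

```python
from typing import Dict, Optional

# Toy ontology (hypothetical): each concept has a parent and role constraints.
ONTOLOGY: Dict[str, dict] = {
    "OBJECT": {"is_a": None, "roles": {}},
    "EVENT": {"is_a": None, "roles": {}},
    "HUMAN": {"is_a": "OBJECT", "roles": {}},
    "ACQUIRE": {"is_a": "EVENT", "roles": {"agent": "HUMAN", "theme": "OBJECT"}},
}

_counts: Dict[str, int] = {}  # how many instances of each concept exist so far


def is_a(concept: str, ancestor: str) -> bool:
    """Return True if `concept` equals `ancestor` or descends from it."""
    current: Optional[str] = concept
    while current is not None:
        if current == ancestor:
            return True
        current = ONTOLOGY[current]["is_a"]
    return False


def instantiate(concept: str, **fillers: dict) -> dict:
    """Create a numbered instance frame, checking each role filler
    against the constraint the ontology places on that role."""
    if concept not in ONTOLOGY:
        raise KeyError(f"unknown concept: {concept}")
    allowed = ONTOLOGY[concept]["roles"]
    for role, filler in fillers.items():
        if role not in allowed:
            raise ValueError(f"{concept} has no role {role!r}")
        if not is_a(filler["concept"], allowed[role]):
            raise ValueError(f"{role!r} of {concept} must be a {allowed[role]}")
    _counts[concept] = _counts.get(concept, 0) + 1
    return {
        "instance": f"{concept}-{_counts[concept]}",
        "concept": concept,
        "roles": {role: f["instance"] for role, f in fillers.items()},
    }


# Combine instances: an ACQUIRE event whose roles point at other frames.
buyer = instantiate("HUMAN")
goods = instantiate("OBJECT")
event = instantiate("ACQUIRE", agent=buyer, theme=goods)
print(event)
```

In this sketch a frame refers to its fillers by instance name rather than by nesting them, so the same instance can be shared by several frames; that is a design choice of the example, not a claim about the TMR language.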