Proposal: Proposed Work (III): Discourse Application





3.1 POV Tracking

The general problem we address is tracking what Uspensky [62] calls the psychological point of view in narrative. Let subjective sentences[4] be ones that present the private states of agents (emotions, perceptions, propositional attitudes). The problem of tracking the psychological point of view (hereafter, POV) is determining, for each sentence, whether or not the sentence is subjective and, if it is, identifying the agent(s) whose POV is taken (the subjective agent). What makes this a non-trivial, context-dependent problem is the frequent occurrence of subjective sentences that mention neither the agent nor the state type (belief, intention, etc.). Consider, for instance: (a.1) John was aware that Mary was incompetent. (a.2) She couldn't even log into the computer. Sentence (a.2) is easily taken to present a private state of John's, but it leaves the agent and the type of private state entirely implicit. Similarly, consider the following discourse fragment from a non-fictional book about WWII (Irving 1981, p.7):

(b.1) The young Americans, on their part, had never seen anything like London. (b.2) The buildings were low and gnarled and barnacled with sooty decorations. (b.3) The policemen wore odd helmets, office workers wore bowlers, and passersby wore blank expressions---no eye contact. (b.4) What shocked them most, perhaps, these kings of the American road, was to find themselves suddenly impotent in the flow of wrong-way traffic.

All of these sentences present private states of the Americans. It is to the Americans that the police helmets are odd, for example. But in (b.2-3), the agents and state types are again implicit.

It is necessary to determine the current POV in narrative in order to distinguish the beliefs of agents from the facts of the narrative, to correctly attribute beliefs and other attitudes to their sources, to recognize agents' intentions, and to understand the discourse relations among sentences. Wiebe has developed an algorithm for tracking POV that decides, for each sentence, whether or not it is subjective and, if it is, identifies the subjective agent [70],[71],[68],[69],[67],[66],[15],[16],[72]. This work departs from much traditional AI discourse work, in that the algorithm does not attempt to perform knowledge-intensive reasoning. The algorithm is based on regularities (i.e., combinations of various features) found by extensive examination of naturally-occurring text, in the ways that writers manipulate point of view. Although the algorithm is based on observed regularities, they were not acquired in a formal empirical study. The current proposed research is a logical next step in this work: using statistical techniques to find the significant correlations in the data.

Subjective sentences are ubiquitous in narrative, and there exist many closely related discourse/pragmatic phenomena. In news articles, is the writer presenting what is to be taken as fact, or is s/he presenting the opinion of an agent mentioned in the article? Or, consider discussing someone else's work, say Smith's, in a research paper or textbook. After an initial reference to Smith's work, you may go on to describe his or her theory without explicitly saying in each sentence that you are doing so (with a locution such as ``In Smith's theory'' or ``According to Smith''). Further, although not necessarily concerned with private states per se, the same sort of discourse structures arise with what Fauconnier \cite{fauconnier} calls ``space builders.'' Just as a subjective sentence can begin a discourse segment implicitly presenting an agent's POV, an adverbial such as ``in 1969'' can begin a discourse segment in which subsequent sentences are understood to refer to events that occurred in 1969, even though the date is not subsequently mentioned. An NLP system must be able to recognize such discourse phenomena in order to recover information implicitly communicated in the discourse.

The problem is related to other discourse problems, in that it involves discourse segmentation[30] and a task akin to reference resolution (identifying the subjective agent). As discussed below, we plan to address a somewhat less fine-grained problem in the interests of feasibility; however, the new problem retains segmentation and addresses interactions between POV tracking and reference resolution.

3.2 The Problem Addressed

The problem addressed here is a broader segmentation of the text. Rather than deciding upon the POV of each sentence, the task is to segment the text into blocks, each possibly containing both objective and subjective sentences, such that all subjective sentences in a block have the same subjective agent. We can view this task as identifying critical changes, where the text turns from focusing on one agent to focusing on another.
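The block criterion can be made concrete with a minimal sketch. It assumes hand-annotated input in which each sentence carries either the name of its subjective agent or `None` if it is objective; the function name and the choice to attach objective sentences to the preceding block are illustrative assumptions, not part of the proposal (the criterion itself does not fix where objective sentences go).

```python
from typing import List, Optional


def segment_into_blocks(agents: List[Optional[str]]) -> List[List[int]]:
    """Group sentence indices into blocks.

    agents[i] is the subjective agent of sentence i, or None if the
    sentence is objective. A new block starts only when a subjective
    sentence has a different agent from the one already seen in the
    current block, so every block has at most one subjective agent.
    """
    blocks: List[List[int]] = []
    current_agent: Optional[str] = None
    for i, agent in enumerate(agents):
        agent_changes = (
            agent is not None
            and current_agent is not None
            and agent != current_agent
        )
        if not blocks or agent_changes:
            blocks.append([])
            current_agent = None
        blocks[-1].append(i)
        if agent is not None:
            current_agent = agent
    return blocks


# Hypothetical annotation: sentences 1-2 take John's POV, 4-5 take Mary's.
example = [None, "John", "John", None, "Mary", "Mary"]
blocks = segment_into_blocks(example)
```

Under this attachment policy the objective sentence at index 3 stays with John's block; a different policy would move the boundary but would satisfy the same criterion.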

Ideally, we would also like to identify the subjective agent of each block. However, doing so in every case would require unrestricted, general reference resolution, since the subjective agent has to be recovered from the text. Rather than attempting to perform reference resolution automatically in a preprocessing phase, we have elected to make the ultimate goals of the system a segmentation of the text into blocks and, within each block, a judgment as to whether or not the subject of each sentence refers to the subjective agent (whoever he or she is). Each noun phrase successfully judged to be a reference to the subjective agent provides information about him or her and, because all of these noun phrases are co-referential, each contributes information about the referents of all the others as well. For example, ``he'' tells us the individual is male and ``John'' identifies his name.

The goals stated above are challenging ones. Thus, we have broken the problem into five steps. The results of each step are available as input to the subsequent steps, and work can proceed on the more challenging steps only once results have been obtained for the simpler steps. The first two phases do not involve POV-segmentation, but instead perform selected syntactic and semantic disambiguation. The latter three are the ones that address POV-segmentation, making successively finer classifications with respect to the identity of the subjective agent. These are probabilistic classifiers (we shall call these ``POV classifiers''). The second of the preprocessing phases is also a probabilistic classifier, targeting a key preprocessing requirement of the existing POV algorithm.

Section 3.3 presents our plans for preprocessing using existing software. Section 3.4 specifies the four classification problems to be addressed, and specifies which techniques out of those described earlier in the proposal will be used to develop the classifiers. We conclude in section 3.5 with examples of potential non-classification variables and discussion.

3.3 Preprocessing

Recall that the values of the non-classification variables included in the model must be known. This is a particular problem facing automated approaches to high-level discourse processing tasks, which often build upon the results of prior syntactic and semantic analysis. Fortunately, many of the features used in the existing POV algorithm are syntactic and semantic distinctions that can feasibly be at least approximated automatically; these distinctions amount to much less than a full parse or a full representation of the literal meaning of the sentence.

A preprocessing component will be developed to automatically determine the values of a subset of the features used in the existing POV algorithm, as well as others we hypothesize are also relevant (e.g., POS, number, and case). All of these features will be available as candidate non-classification variables for inclusion in any of the classifiers. The component will consist of off-the-shelf software: a POS tagger, a chunker, a name recognizer, a morphological analyzer, and a rudimentary lexicon.

3.4 The Classifiers

Developing a probabilistic classifier requires, obviously, that the problem be cast as a classification problem: choosing the objects to be classified and a finite set of mutually exclusive classes for these objects, to serve as the values of the classification variable. Note that the identity of the subjective agent cannot be directly represented as a classification variable. Before a text is processed, the possible subjective agents cannot be known, so they cannot be specified as the possible values of the classification variable. The same issue would arise in casting anaphora resolution as a classification problem.

The problems we address are as follows.

Preprocessing classification problem.

The objects are the main clauses of sentences. The classification variable is the type of state of affairs that the main clause of a sentence is about. The values are four broad classes drawn from the existing POV algorithm [66], [70]: private states (e.g., ``believe'', ``hope''), non-private states (e.g., ``be'', ``own''), private actions (e.g., ``sigh''), and non-private actions (e.g., ``shoot'').

First POV classification problem.

The objects are sentences. The classification variable has two values: the sentence either continues the current block (continues) or begins a new one (new).

Second POV classification problem.

The objects are those sentences that begin a new block (i.e., those identified as new by POV classifier one). The classification variable values are of the form (sentence, syntacticFunction), meaning that the subjective agent is the referent of the noun phrase filling the syntactic function syntacticFunction in sentence sentence, e.g., the subject of the current sentence or the subject of the previous sentence.

Third POV classification problem.

The objects are the subject noun phrases of the main clauses of the sentences. The classification-variable values are whether or not the noun phrase refers to the subjective agent of the block that it is in.
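As a rough illustration of how the first two POV problems fit together, the sketch below turns a sequence of new/continues tags into blocks and enumerates a small, fixed value set of (sentence, syntacticFunction) pairs. The window of sentence offsets and the inclusion of ``object'' are assumptions made for the example; the text names only the subject of the current sentence and the subject of the previous sentence.

```python
from itertools import product
from typing import List, Tuple

NEW, CONTINUES = "new", "continues"


def blocks_from_tags(tags: List[str]) -> List[List[int]]:
    """Convert per-sentence new/continues tags into blocks of sentence indices."""
    blocks: List[List[int]] = []
    for i, tag in enumerate(tags):
        if tag not in (NEW, CONTINUES):
            raise ValueError(f"unknown tag at sentence {i}: {tag!r}")
        if tag == NEW or not blocks:
            # The first sentence opens a block regardless of its tag.
            blocks.append([i])
        else:
            blocks[-1].append(i)
    return blocks


# Candidate values for the second POV classifier. Offsets are relative to
# the sentence that opens the block (0 = current, -1 = previous).
SENTENCE_OFFSETS = (0, -1)                    # assumed window
SYNTACTIC_FUNCTIONS = ("subject", "object")   # "object" is an assumed addition
AGENT_SLOT_VALUES: List[Tuple[int, str]] = list(
    product(SENTENCE_OFFSETS, SYNTACTIC_FUNCTIONS)
)
```

Because the values name positions rather than individuals, the set is finite and known before any text is processed.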

For each classifier, the results of all previous phases are available as potential non-classification variables. The particular ones to include will be selected as in the original method for developing classifiers (section 1.1.2). A decomposable model will be selected to express the interdependencies among variables; the methods for selecting the form of the model and estimating the parameters will be the ones proposed in sections 1.2.1 and 1.2.2.
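For orientation only, the sketch below implements the simplest decomposable model, in which every non-classification variable is conditionally independent of the others given the class (naive Bayes). It is a stand-in, not the model-selection or estimation procedure of sections 1.2.1 and 1.2.2, which would choose a richer model form from the data. The function names, the add-one smoothing, and the feature name used in the usage note are all assumptions.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Example = Tuple[Dict[str, str], str]  # (feature values, class label)


def train_naive_bayes(examples: List[Example]):
    """Collect the counts needed for P(class) and P(feature value | class)."""
    class_counts = Counter(label for _, label in examples)
    value_counts = defaultdict(Counter)   # (feature, label) -> value -> count
    feature_values = defaultdict(set)     # feature -> set of observed values
    for feats, label in examples:
        for name, value in feats.items():
            value_counts[(name, label)][value] += 1
            feature_values[name].add(value)
    return class_counts, value_counts, feature_values


def classify(feats: Dict[str, str], model) -> str:
    """Return the class with the highest smoothed log-probability."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)
        for name, value in feats.items():
            count = value_counts[(name, label)][value]
            # Add-one smoothing (an assumption) avoids log(0) for unseen pairs.
            score += math.log((count + 1) / (n + len(feature_values[name])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

A call such as `classify({"para_initial": "yes"}, train_naive_bayes(tagged_sentences))` would return either `"new"` or `"continues"` for the first POV classifier, where `tagged_sentences` is a hypothetical hand-tagged training set.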

Given the contextual nature of discourse problems, one kind of non-classification variable important for discourse processing concerns the classifications of previously occurring ambiguous objects (for example, in POV classifier one, whether the tag of the previous sentence is new or continues). If feasible, i.e., if the model is not too complex, such interdependent ambiguities will be resolved using the techniques proposed in section 1.2.3. Otherwise, tags will be assigned sequentially; that is, in assigning a class to the current ambiguous object, the classifications of previously occurring objects will be treated as known.
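The sequential fallback can be sketched as a greedy left-to-right pass in which each decision, once made, is fed to later decisions as if it were a known value. The scorer interface, the feature names, and the numbers inside `toy_scorer` are invented placeholders for illustration; they are not estimates from any data.

```python
from typing import Callable, Dict, List, Sequence

Features = Dict[str, object]
Scorer = Callable[[Features], Dict[str, float]]


def tag_sequentially(
    sentence_features: Sequence[Features],
    scorer: Scorer,
    first_tag: str = "new",
) -> List[str]:
    """Assign new/continues tags one sentence at a time."""
    tags: List[str] = []
    for feats in sentence_features:
        if not tags:
            # The first sentence necessarily opens a block.
            tags.append(first_tag)
            continue
        context = dict(feats)
        context["prev_tag"] = tags[-1]  # earlier decision treated as known
        probs = scorer(context)
        tags.append(max(probs, key=probs.get))
    return tags


def toy_scorer(context: Features) -> Dict[str, float]:
    """Placeholder probabilities: favor 'new' at a paragraph start
    unless the previous sentence already opened a block."""
    p_new = 0.9 if context["paragraph_initial"] and context["prev_tag"] == "continues" else 0.2
    return {"new": p_new, "continues": 1.0 - p_new}


feats = [
    {"paragraph_initial": True},
    {"paragraph_initial": False},
    {"paragraph_initial": True},
    {"paragraph_initial": True},
]
tags = tag_sequentially(feats, toy_scorer)
```

The trade-off is that an early error propagates, since later decisions never revisit it; resolving the ambiguities jointly avoids this at the cost of a more complex model.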

3.5 Examples and Discussion

The following are examples of variables that will be candidates for inclusion in the various classifiers (more precisely, these are interpretations of such variables). Most are based on features included in Wiebe's POV algorithm; we expect to find, based on experience with that algorithm, that they are indicative of POV. Because the system is to be fully automatic, the values of these variables may not be determined with 100% accuracy. We anticipate that, because our method is probabilistic, such inaccuracies will only be a source of noise that serves to reduce the probabilities assigned to important correlations without obscuring the correct classifications.

1. Variables corresponding to properties of sentences.

(a) whether or not any of a prespecified list of subjective lexical items appears in the sentence (e.g., ``amazingly'' and ``surely''), and whether or not any such items appear in the main clause.

(b) the type of state of affairs that the main clause is about.

(c) whether or not the sentence is at the beginning of a paragraph, chapter, or section break.

(d) the form of the subject noun phrase (pronoun, indefinite description, definite description, or proper name).

(e) the gender, number, and case of noun phrases, as feasible.

2. Variables concerning classifications of previously occurring ambiguities.

(a) how large the current block is so far (one sentence, or a number above a threshold value);

(b) the average block size so far (above or below a threshold value);

(c) the type of state of affairs that the majority of the sentences in the current block are about;

(d) whether or not the previous subject noun phrase refers to the subjective agent of the block.
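A few of the variables listed above can be computed with very little machinery, as the following sketch suggests. It covers only 1(a), 1(c) (reduced to a paragraph-start flag), and 2(a); the lexical list contains just the two items named in the text, and the threshold value, function names, and value labels are assumptions.

```python
from typing import Dict, List

# Only the two items named in the text; a real list would be longer.
SUBJECTIVE_ITEMS = {"amazingly", "surely"}
BLOCK_SIZE_THRESHOLD = 3  # assumed value


def sentence_features(tokens: List[str], paragraph_initial: bool) -> Dict[str, object]:
    """Variables 1(a) and 1(c) for a tokenized sentence."""
    lowered = {token.lower() for token in tokens}
    return {
        "has_subjective_item": bool(lowered & SUBJECTIVE_ITEMS),
        "paragraph_initial": paragraph_initial,
    }


def block_size_bucket(current_block_len: int) -> str:
    """Variable 2(a): discretize the size of the current block so far."""
    if current_block_len <= 1:
        return "one"
    if current_block_len > BLOCK_SIZE_THRESHOLD:
        return "above_threshold"
    return "at_or_below_threshold"
```

Discretizing the block size keeps the variable categorical, which matches the small, fixed value sets used for the other variables.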

While we expect that the values of each of these variables will be relevant to assigning POV classifications, we do not suppose that the classification problem can be solved by considering each such variable in isolation. Rather, we expect that the problem will require considering sets of variables taken in combination. In fact, empirical investigation of the interactions among variables is a feature of this research that is an important end in itself. Probabilistic models, formulated as in this work, characterize the structure of the data; the form of the model identifies interdependencies among important variables, and the parameter estimates provide information about the relationships among the individual values of these variables.

With the third POV classifier (in which subject noun phrases are classified as referring to the subjective agent or not), we most directly investigate interactions between POV and reference. In general, the subjective agent seems to enjoy a distinguished level of focus. Evidence for this comes from cases such as (b.4) in section 3.1. Even without the phrase ``these kings of the American road,'' it would be easy to interpret the first pronoun ``them'' as referring to the Americans, even though there are numerous competing discourse entities mentioned more recently (the policemen, the passers-by, etc.) [30]. However, the subjective agent is often referred to non-pronominally as well. Our approach to developing models promises to identify correlations such as those between form of reference and POV.


