This is the final report of the first year of the Artwork project. It devotes one chapter to each of the three major phases of the project.
Chapter 1 contains the final report on Phase One of the first year of the Artwork project. The initial phase of the Artwork feasibility study consisted of a comparative analysis of language usage in spoken dialog, the modality of speech-to-speech translation, and in written text, the modality assumed in the development of most current MT systems, including the CRL's. The objective was threefold: to establish, for each modality, a range of structural, functional and interpersonal characteristics that can be taken as processing requirements; to determine which of these characteristics are associated with both modalities and which with only one; and, finally, to identify the relative importance of the different characteristics for successfully processing utterances produced under each modality.
The tasks involved in carrying out the comparative analysis included: (1) the collection and transcription of the spoken data for the analysis (approximately an hour of taped interviews) and the preparation of the written data (EFE newswire articles), (2) the quantitative analysis of the transcriptions and texts with respect to a number of relevant linguistic characteristics, and (3) the comparison of the resulting analyses for differences between dialogs and text. In addition, approximately midway through the project it was decided that (4) the impact of protocol on dialog would also be analyzed. Chapter 1 contains the results of tasks 1, 2, and 3. Chapter 3, which reports on Phase Three of the project, contains the results of task 4.
Chapter 2 contains the final report on Phase Two of the project. The goal of this phase was to identify a number of translation problems that appear to be peculiar to dialog translation, as opposed to text translation, and to survey the capabilities of current MT systems for dealing with these problems. The chapter is therefore divided into two parts: a survey of dialog translation problems and a survey of current MT system techniques for dealing with them.
Chapter 3 contains the final report on Phase Three of the project. According to a revised plan of research, which was agreed to during the fifth bi-monthly reporting period, Phase Three of the project consists of (1) a study comparing dialogs collected within a push-to-talk protocol with dialogs collected within a cross-talk protocol and (2) the identification of approaches to resolving ellipsis and anaphora in scheduling dialogs. The chapter first presents the results of the protocol study and then addresses ellipsis and anaphora resolution.