All sixteen Spanish dialogs were marked up to indicate the categories and boundaries of discourse markers, taking roughly two hours each once operational definitions had been established. The sixteen English translations of these dialogs were also marked up. The 20 Spanish texts and their translations are in the process of being analyzed.
The operational definition of a discourse marker that we ended up using was ``an expression used to indicate to the addressee some conversational act which the speaker wishes some participant perform or some attitude of the speaker toward the direction of discourse''. For instance, in uttering ``mira'' in
porque yo, mira, a la un de la una a las seis yo trabajo ...
because I, look, at one from one to six I work...
the speaker wishes the addressee to attend closely to the information that the speaker is about to introduce.
As a point of departure for establishing subcategories of discourse markers, we have adopted a system of illocutionary acts proposed by Willis Edmondson (Spoken Discourse: A Model for Analysis, 1981). By this scheme, there are two broad subcategories of markers: discourse internal markers, such as ``mira'' above, which are used to control the flow of the conversational interaction, and discourse external markers, which are used to express one's beliefs, positions, intentions and so on regarding the activities or states of affairs being discussed. The former category includes markers which indicate such things as greeting, uptakes, continuing, concurrence, preannouncing, exclaiming, feedback-seeking, deferring, interrupting, leave-taking, and so on. The latter includes markers that indicate the more usual illocutionary acts found in the literature, such as suggesting, requesting, complaining, licensing, thanking, apologizing, telling, claiming, opining, excusing, justifying, condoning, etc. Definitions, following Edmondson's, such as:
``Interrupting:'' an expression used to inform the addressee that the speaker wishes to say something to the addressee at that point in time.
have been worked out for most of the categories mentioned, as well as for new categories, especially of the discourse external variety (see the report on Discourse Markers for such definitions and examples). Others may yet be developed as needed.
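The two-way split described above can be pictured as a small annotation record. The following is only a minimal sketch: the class name `MarkerAnnotation`, its fields, and the character-offset representation of boundaries are hypothetical and are not the markup format actually used in the project. The category lists are the ones named in the text; assigning ``mira'' to ``preannouncing'' is an assumption made purely for illustration, since the text states only that it is a discourse internal marker.

```python
from dataclasses import dataclass

# Category names as listed in the text; both lists are open-ended ("and so on").
INTERNAL_CATEGORIES = frozenset({
    "greeting", "uptakes", "continuing", "concurrence", "preannouncing",
    "exclaiming", "feedback-seeking", "deferring", "interrupting", "leave-taking",
})
EXTERNAL_CATEGORIES = frozenset({
    "suggesting", "requesting", "complaining", "licensing", "thanking",
    "apologizing", "telling", "claiming", "opining", "excusing",
    "justifying", "condoning",
})


@dataclass(frozen=True)
class MarkerAnnotation:
    """Hypothetical record for one marked-up discourse marker."""
    text: str      # surface form of the marker
    start: int     # character offset where the marker begins
    end: int       # character offset just past the marker
    category: str  # one of the category names above

    @property
    def scope(self) -> str:
        """Return 'internal' or 'external' according to the category."""
        if self.category in INTERNAL_CATEGORIES:
            return "internal"
        if self.category in EXTERNAL_CATEGORIES:
            return "external"
        raise ValueError(f"unknown category: {self.category}")


utterance = "porque yo, mira, a la un de la una a las seis yo trabajo ..."
start = utterance.index("mira")
# Assumption: "preannouncing" is chosen here only to illustrate an internal marker.
mira = MarkerAnnotation("mira", start, start + len("mira"), "preannouncing")
print(mira.scope, utterance[mira.start:mira.end])
```

Keeping the boundaries as offsets into the utterance means the same record could, in principle, be attached to the Spanish original and to its English translation independently.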
As for the quantitative analysis, in the 6 dialogs used for the analysis, we found 177 discourse markers. Of these, 96 were used with discourse external functions: 6 suggests, 2 proposes, 4 willings, 1 license, 4 resolves, 76 tells, 2 justifies, and 1 sympathize. The remaining 81 were used with discourse internal functions: 5 greets, 10 leavetakes, 30 interrupts, 3 exclaims, 12 accepts, 1 go-on, and 20 okays. While the formal analysis of the 22 news articles and their translations is incomplete, it is clear at this point that there are virtually no examples of such markers in them. It appears, therefore, that the results of the quantitative analysis will show that discourse markers are much more frequent in dialog than in expository text and that the use of such markers as discourse internal illocutions is unique to dialog. In addition, since such expressions normally have very different translations in English depending on whether or not they are used as discourse markers, they will have to be handled effectively and efficiently by dialog translation systems.
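The counts reported above can be checked and summarized with a few lines of code. This is a minimal sketch, not part of the original analysis: the dictionary names and the `summarize` helper are hypothetical, and only the per-category figures are taken from the text.

```python
# Per-category counts exactly as reported in the text.
EXTERNAL_COUNTS = {
    "suggest": 6, "propose": 2, "willing": 4, "license": 1,
    "resolve": 4, "tell": 76, "justify": 2, "sympathize": 1,
}
INTERNAL_COUNTS = {
    "greet": 5, "leavetake": 10, "interrupt": 30, "exclaim": 3,
    "accept": 12, "go-on": 1, "okay": 20,
}


def summarize(counts):
    """Return the total and each category's share of that total."""
    total = sum(counts.values())
    shares = {label: n / total for label, n in counts.items()}
    return total, shares


external_total, external_shares = summarize(EXTERNAL_COUNTS)
internal_total, internal_shares = summarize(INTERNAL_COUNTS)
grand_total = external_total + internal_total

print(f"external: {external_total}, internal: {internal_total}, all: {grand_total}")
for label, share in sorted(internal_shares.items(), key=lambda kv: -kv[1]):
    print(f"  {label:<10} {share:.1%} of internal markers")
```

The subtotals agree with the figures given in the text (96 external, 81 internal, 177 overall). Among the internal markers, interrupts and okays together account for 50 of the 81.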