COLING-2000 Workshop:
Using Toolsets and Architectures To Build NLP Systems

Centre Universitaire, Luxembourg, Saturday 5 August 2000

Call for Participation

Background

The purpose of the workshop is to present the state of the art in NLP toolsets and workbenches that can be used to develop multilingual and/or multi-application NLP components and systems. Many toolsets have been developed to support the implementation of single NLP components (taggers, parsers, generators, dictionaries) or complete Natural Language Processing applications (Information Extraction systems, Machine Translation systems). These tools aim to make building NLP systems easier and cheaper. Since the tools themselves are often complex pieces of software, they require significant effort to develop and maintain in the first place.

Is this effort worth the trouble? NLP toolsets have often been developed originally to implement a single component or application. In that case, why not build the NLP system directly in a general programming language such as Lisp or Prolog? There are at least two answers. First, for reasons of pure efficiency (speed and space), it is often preferable to build a parameterized algorithm operating on a uniform data structure (e.g., a phrase-structure parser). Second, it is harder, and often impossible, to develop, debug and maintain a large NLP system written directly in a general programming language.

Many users have found that a given toolset is often unusable outside its original environment: the toolset can be too restricted in its purpose (e.g. an MT toolset that cannot be used for building a grammar checker), too complex to use, or even too difficult to install. There have been efforts, in particular in the US under the Tipster program, to promote common architectures for a given set of applications instead (primarily IR and IE in Tipster; see also the Galaxy architecture of the DARPA Communicator project). Several software environments have been built around this flexible concept, which is closer to current trends in mainstream software engineering.

The workshop aims to provide a picture of the current problems faced by developers and users of toolsets, and of future directions for the development and use of NLP toolsets. It includes reports of actual experiences in the use of toolsets as well as presentations of toolsets.

Audience

Researchers and practitioners in Language Engineering, users and developers of tools and toolsets. Please note that workshop participants are required to register at http://www.coling.org/reg.html.

Program

This one-day workshop includes ten 30-minute presentation periods, each consisting of a 20-minute presentation followed by 10 minutes reserved for discussion. We encourage the authors to focus on the salient points of their presentation and to identify possibly controversial positions. There will be ample time set aside for informal and panel discussions and audience participation.
9:30 - 9:45 Opening
9:45 - 10:15 Hamish Cunningham, Diana Maynard, Kalina Bontcheva, Valentin Tablan, Yorick Wilks: Experience Using GATE for NLP R&D
10:15 - 10:45 Fredrik Olsson, Björn Gambäck: Composing a General-Purpose Toolbox for Swedish
10:45 - 11:15 Kalina Bontcheva, Hennie Brugman, Hamish Cunningham, Albert Russel, Peter Wittenburg: An Experiment in Unifying Audio-Visual and Textual Infrastructures for Language Processing Research and Development
11:15 - 11:30 Coffee break
11:30 - 12:00 Jan Amtrup, Rémi Zajac: A Modular Toolkit for Machine Translation Based on Layered Charts
12:00 - 12:30 Jan Daciuk: Finite State Tools for Natural Language Processing
12:30 - 14:00 Lunch
14:00 - 14:30 Nancy Ide: The XML Framework and Its Implications for the Development of Natural Language Processing Tools
14:30 - 15:00 Jill Burstein, Daniel Marcu: Benefits of Modularity in an Automated Essay Scoring System
15:00 - 15:30 Vincent Pautret: A Rational Agent for the Modeling of a Semantic Model
15:30 - 15:45 Coffee break
15:45 - 16:15 Matthias Denecke: An Integrated Development Environment for Spoken Dialogue Systems
16:15 - 16:45 Anke Kölzer: Diamod - a Tool for Modeling Dialogue Applications
