The growth rate of theoretical studies of language structure and use stubbornly
remains higher than the improvement rate of large-scale applications. It has
been repeatedly shown that realistic large-scale NLP applications carry a
prohibitive price tag: the large-scale, routine acquisition of knowledge about
language and about the world, collected in computational grammars, lexicons and
domain models. Strategically, there are several ways of dealing with this
problem:
- biting the bullet and going through a massive knowledge acquisition
effort, either general-purpose (e.g., the CYC project, Lenat et al., 1990) or
domain-specific (e.g., the KBMT-89 project, Goodman and Nirenburg, 1992);
- seeking ways of bringing down the price of knowledge acquisition by
studying how to extract, automatically or semi-automatically, relevant
information from machine-readable dictionaries (morphological, syntactic and
some semantic information, e.g., in the work of Wilks et al., 1990) or text
corpora (for instance, collocations, cf. Smadja, 1991); and
- seeking ways of avoiding the need for massive knowledge acquisition
by rejecting the entire established NLP paradigm in favor of knowledge-free,
linguistics- and AI-independent approaches.
This last option has been energetically promulgated in the important NLP
application of machine translation (MT). The two basic "non-traditional"
approaches to MT are:
- statistical MT, which seeks to carry out translation based
on complex cooccurrence and distribution probability calculations over very
large aligned bilingual text corpora; and
- a more modest approach, called example-based MT, which is the topic of
this paper.
Steve Beale
Tue Oct 1 12:14:38 MDT 1996