GraphLing

Summary of Research




GraphLing -- Probabilistic graphical models for NLP: tools and applications

Rebecca Bruce and Janyce Wiebe

In this work, we are developing probabilistic classifiers for two challenging and diverse NLP tasks using a common set of techniques. One classifier will be capable of disambiguating a large vocabulary of words with respect to a full set of sense distinctions from a published source, such as Longman's on-line dictionary. The second will perform a discourse processing task that involves segmentation, reference resolution, and belief: segmenting a text into blocks that express the beliefs and opinions of a single agent, and identifying noun phrases that refer to that agent. Both systems will be fully automatic.

Exploiting recent developments in applied statistics, we are using a richer class of statistical models than has previously been used in most NLP applications, along with a set of tools for (1) fitting such models to the data, (2) estimating the parameters of the chosen model from untagged data, and (3) resolving interdependent ambiguities. Together, these techniques will make it computationally feasible to automatically develop and apply probabilistic models that express a complex set of relationships among a diverse set of variables. This work will advance statistical NLP toward expressing and using the types of knowledge typically thought to be necessary for high-level NLP tasks.
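
As an illustration of item (2) only: one standard way to estimate parameters from untagged data is the EM algorithm. The sketch below applies it to a naive Bayes model, which is much simpler than the richer model class described here. This summary does not say which estimation procedure or model form the project uses, so both choices are assumptions, as are the function names em_naive_bayes and posterior and the toy integer encoding of features.

```python
import math
import random


def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def posterior(x, prior, cond):
    """P(class | x) under a naive Bayes model.

    cond[c][j][v] is P(feature j = v | class c).
    """
    joint = [
        prior[c] * math.prod(cond[c][j][v] for j, v in enumerate(x))
        for c in range(len(prior))
    ]
    return normalize(joint)


def em_naive_bayes(data, n_classes, n_values, iters=50, seed=0):
    """Estimate naive Bayes parameters from untagged data with EM.

    data: list of tuples of ints in range(n_values); the class (for
    example, a word sense) is never observed. Hypothetical helper,
    not the project's own code.
    """
    rng = random.Random(seed)
    n_feats = len(data[0])

    # Random starting point; EM only climbs to a local optimum from here.
    prior = normalize([rng.random() + 1.0 for _ in range(n_classes)])
    cond = [
        [normalize([rng.random() + 1.0 for _ in range(n_values)])
         for _ in range(n_feats)]
        for _ in range(n_classes)
    ]

    for _ in range(iters):
        # E-step: soft class assignment for every untagged instance.
        post = [posterior(x, prior, cond) for x in data]

        # M-step: re-estimate parameters from expected counts.
        prior = normalize([sum(p[c] for p in post) for c in range(n_classes)])
        for c in range(n_classes):
            for j in range(n_feats):
                counts = [1.0] * n_values  # add-one smoothing
                for x, p in zip(data, post):
                    counts[x[j]] += p[c]
                cond[c][j] = normalize(counts)

    return prior, cond
```

Called as em_naive_bayes(data, n_classes=2, n_values=2), the function returns class priors and per-class feature distributions; posterior(x, prior, cond) then gives a soft class assignment for an instance x. The learned classes are unnamed clusters, so mapping them onto dictionary senses would be a separate step that this sketch does not cover.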

This research is funded by the Office of Naval Research under grant number N00014-95-1-0776.