Tabula Rasa


Turning Text into Data

Tabula Rasa is an attempt to reduce two of the major bottlenecks of information extraction: defining text extraction tasks, and developing tools to aid in producing structured data, or templates. Tabula Rasa is a "meta-tool" that analysts can use to build tools that help with template filling tasks. One Tabula Rasa component, tredit, enables designers of information extraction tasks, like those used in ARPA's Message Understanding Conferences (MUC), to create and edit template definitions. Another Tabula Rasa component, the runtime tool-builder, uses these definitions to automatically generate Graphical User Interface (GUI) tools that analysts can use to create filled templates.
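The split between a template definition and a filled template can be illustrated with a minimal sketch. This is not Tabula Rasa's actual definition format or code; the names SlotDef, TemplateDef, blank_key and fill_slot, and the INCIDENT example with its slots, are all hypothetical and chosen only to show the idea of driving a filling tool from a definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlotDef:
    """One slot of a template (hypothetical structure, for illustration only)."""
    name: str
    choices: tuple = ()  # empty tuple means the slot accepts free text


@dataclass(frozen=True)
class TemplateDef:
    """A template definition: a name plus an ordered tuple of slots."""
    name: str
    slots: tuple


def blank_key(template):
    """Build an empty key (unfilled template) with one entry per slot."""
    return {slot.name: None for slot in template.slots}


def fill_slot(template, key, slot_name, value):
    """Fill one slot, checking the value against the definition."""
    slots = {slot.name: slot for slot in template.slots}
    if slot_name not in slots:
        raise KeyError(f"{template.name} has no slot {slot_name!r}")
    choices = slots[slot_name].choices
    if choices and value not in choices:
        raise ValueError(f"{value!r} is not an allowed fill for {slot_name}")
    key[slot_name] = value


# Invented example definition; a task designer would write this part.
INCIDENT = TemplateDef(
    "INCIDENT",
    (SlotDef("LOCATION"), SlotDef("TYPE", ("BOMBING", "KIDNAPPING"))),
)

# The analyst's side: start from a blank key and fill it in.
key = blank_key(INCIDENT)
fill_slot(INCIDENT, key, "LOCATION", "Bogota")
fill_slot(INCIDENT, key, "TYPE", "BOMBING")
```

In this sketch the definition alone determines which slots exist and which fills are valid, so the same filling code works for any template. A generated GUI tool would presumably play the role of fill_slot, presenting each slot as an input widget.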

Current automatic information extraction systems are usually inaccurate and have long development times. Even when the accuracy of the technology is adequate, there is still a need for completed keys (filled templates), both to train automatic systems and to allow system performance to be tested objectively. To produce these keys, a human analyst must first carry out the template filling task. Tabula Rasa facilitates the production of keys for new domains by helping anal

To get more information about Tabula Rasa, you can read the online Users Manual. You can also download a Solaris or SunOS version of Tabula Rasa.
 

New Mexico State University. All rights reserved.