Today's Internet technology allows direct online access to locations all around the globe. As a result,
enormous quantities of multilingual text are becoming available online every day.
This growing body of information makes possible the development of new kinds of sources for users looking
for very specific information. For example, users searching for information about the activities of a famous
person or a world leader can retrieve hundreds, if not thousands, of documents very rapidly. These documents
may be written in languages the user does not know, which poses additional difficulties. Hence, to distill
the desired information into a single text, a user must translate the documents and then select what is
relevant to his or her needs. Reading and translating so many documents takes too much time, which cancels
the benefits of fast information retrieval technology and can make the entire process of obtaining the
relevant information impractical. The aim of the work described here is to demonstrate a fully
automatic approach that generates personal profiles from multilingual documents retrieved from the Internet.
This has involved integrating several multilingual tools (automatic language recognition, generic
multilingual summarization, machine translation, and date recognition) to produce a system that generates
personal profiles. These profiles are lists of brief entries in English, presented as HTML pages with links
to the summaries and documents in the original languages. We have tested the system on 18 people.
An example of the current output of the system can be seen in Figure 1.
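
The way the components described above might fit together can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the system's actual implementation: the names `Document`, `ProfileEntry`, `find_date`, `build_profile` and `render_html` are hypothetical, the language recognizer, summarizer and translator are passed in as placeholder functions, the ordering of the steps is assumed, and the date recognizer is reduced to a single ISO-style pattern.

```python
import html
import re
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Document:
    url: str
    text: str


@dataclass
class ProfileEntry:
    date: Optional[str]   # date found in the document, if any
    summary_en: str       # brief entry in English
    language: str         # language code of the original document
    url: str              # link back to the original document


def find_date(text: str) -> Optional[str]:
    """Toy date recognizer (assumption): matches only YYYY-MM-DD."""
    match = re.search(r"\b\d{4}-\d{2}-\d{2}\b", text)
    return match.group(0) if match else None


def build_profile(
    docs: List[Document],
    detect_language: Callable[[str], str],
    summarize: Callable[[str, str], str],
    translate: Callable[[str, str], str],
) -> List[ProfileEntry]:
    """Run each document through the four tools and collect profile entries."""
    entries = []
    for doc in docs:
        lang = detect_language(doc.text)
        summary = summarize(doc.text, lang)  # summary in the original language
        # Only non-English summaries are sent to the translator.
        summary_en = summary if lang == "en" else translate(summary, lang)
        entries.append(ProfileEntry(find_date(doc.text), summary_en, lang, doc.url))
    # Dated entries first, in chronological order; undated entries last.
    entries.sort(key=lambda e: (e.date is None, e.date or ""))
    return entries


def render_html(person: str, entries: List[ProfileEntry]) -> str:
    """Render the profile as a simple HTML list with links to the sources."""
    items = []
    for e in entries:
        items.append(
            '<li>{date}: {text} (<a href="{url}">source, {lang}</a>)</li>'.format(
                date=html.escape(e.date or "undated"),
                text=html.escape(e.summary_en),
                url=html.escape(e.url, quote=True),
                lang=html.escape(e.language),
            )
        )
    return "<h1>{}</h1>\n<ul>\n{}\n</ul>".format(html.escape(person), "\n".join(items))
```

In this sketch the summary is produced in the original language and only the summary is translated, which keeps the amount of text sent to the translator small; whether the actual system orders the steps this way is an assumption.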