Extending an Information Extraction tool set to Central and Eastern European languages

Ignat, Camelia; Pouliquen, Bruno; Ribeiro, Antonio; Steinberger, Ralf

arXiv.org Artificial Intelligence 

In a highly multilingual and multicultural environment such as the European Commission, which will soon have over twenty official languages, there is an urgent need for text analysis tools that use minimal linguistic knowledge so that they can be adapted to many languages without much human effort. We present two such Information Extraction tools that have already been adapted to various Western and Eastern European languages: one for the recognition of date expressions in text, and one for the detection of geographical place names and the visualisation of the results on geographical maps. An evaluation of their performance produced very satisfactory results.