According to a study by Europe’s leading Language Technology (LT) experts, 21 of 30 studied European languages (70%) are in danger of digital extinction because digital support for these languages is weak at best or non-existent. The verdict was based on an assessment of four areas: automatic translation, speech interaction, text analysis and the availability of language resources. Language resources are the essential ingredient for developing the other three, higher-level types of Language Technology software. Such valuable resources are, however, only sparsely available, even for the majority of the EU’s 23 official languages. The EU produces and owns vast amounts of multilingual resources that can be exploited to develop language-related applications. The EU is thus in a position to give the field of Language Technology a big boost. And it does fulfil its duty. What do the EU Institutions make available? And how can even simple text collections be used to develop Language Technology software? Is the Dutch language also threatened? These are some of the questions we try to answer in this article, but let’s start at the beginning.
STEINBERGER Ralf
2013-11-21
NOTaS Foundation
JRC76317
http://www.notas.nl/nl/dixit.html
http://notas.nl/images/stories/dixit/dixit_2012_bigdata.pdf
https://publications.jrc.ec.europa.eu/repository/handle/JRC76317
This is a public document. You can share this publication.