Strategic Research and Innovation Agenda
for the Multilingual Digital Single Market

Language as a Data Type and Key Challenge for Big Data

Enabling the Multilingual Digital Single Market
through technologies for translating, analysing, processing
and curating natural language content

The Strategic Research and Innovation Agenda for the Multilingual Digital Single Market presents ideas, approaches and solutions in order to make the Digital Single Market, a flagship initiative of the European Union, multilingual. The current version of the document, "Language as a Data Type and Key Challenge for Big Data – Enabling the Multilingual Digital Single Market through technologies for translating, analysing, processing and curating natural language content", was unveiled at META-FORUM 2016 on 4/5 July 2016.

Executive Summary

The integration of the unified and connected Digital Single Market must address our languages: The Digital Single Market is a multilingual challenge! Our treasured multilingualism, one of the cultural cornerstones of Europe and of what it means to be and to feel European, is also one of the main obstacles to a truly connected, language-crossing Digital Single Market. The European Language Technology community, including research, development, innovation and other relevant stakeholders, is committed to providing the technologies to achieve this goal.

We recommend setting up the highly focused three-year Multilingual Value Programme (MLV) to enable the Multilingual Digital Single Market. The programme will be guided by a comprehensive roadmap. It requires a modest investment, which can be realised through the Horizon 2020 ICT LEIT funding programme (2018-2020), in close collaboration with the Big Data Value Association (BDVA), i.e., the Big Data Value cPPP.

The MLV Programme consists of three application areas that relate to the three main pillars of the Multilingual Digital Single Market. (1) The area Multilingual E-Commerce provides multilingual and cross-lingual technologies around search, customer-relationship management, helpdesks, processes, workflows, product catalogues and descriptions, etc. (2) The area Multilingual Content and Media assembles multilingual and cross-lingual technologies for content analytics, curation and generation, including authoring support, multimodal and social media. (3) The area Translation, Language, Knowledge, Data provides multilingual and cross-lingual applications that connect Big Data technologies with Language and Knowledge Technologies, including machine translation (written, spoken, automatic/human), text mining, business intelligence, sentiment analysis, domain-specific approaches and semantification. These applications are driven by several Multilingual Services, which are, in turn, fostered and further improved through Research. We also plan to intensify work on basic technologies so that we can cover all relevant languages. In addition, horizontal topics need to be addressed, e.g., standardisation, interoperability and policy aspects.

The MLV Programme will not only unlock the multilingual Digital Single Market through a set of platforms, services and solutions that support all businesses and citizens; it will also enable the European language technology community and several different industries to compete with other markets, bringing multiple benefits for the European economy and future growth, as well as for society and citizens. To achieve this ambitious plan, all stakeholders need to work together in a closely coordinated way. To demonstrate that the whole Multilingual Europe community firmly stands together, this document is presented by the Cracking the Language Barrier federation, which consists of 10 organisations and more than 20 projects working together on the technological foundations of a Multilingual Europe.

Awareness, political determination and will are required to make sure that the Digital Single Market takes the language component into account. VP Andrus Ansip’s recent blog post, “How multilingual is Europe’s Digital Single Market?” is a sign that the awareness is there – now it is simply a matter of making sure that the MLV Programme can be put into practice.

By realising the Multilingual Digital Single Market, the MLV Programme would solve the issue of language-blocking and language-induced market fragmentation. It would also reduce the threat of digital language extinction. We recommend that Europe make an active effort to compete in the global landscape for research and development in language technology, since we cannot expect third parties from other continents to solve our translation and knowledge management problems in a way that suits our specific communicative, societal and cultural needs.

Language Technology made for Europe in Europe is the key. It will contribute to future European cross-border and cross-language communication, economic growth and social stability.