Many of the languages not supported by our current technologies share common traits: they are morphologically complex, with free and diverse word order. Often there are not enough training resources or processing tools, or both. Together, these factors result in drastic drops in translation quality. The combined challenges of linguistic phenomena and resource scenarios have created a large and under-explored grey area in the language technology map of European languages. Combining support from key stakeholders, QT21 addresses this grey area by developing
- substantially improved statistical and machine-learning-based translation models for challenging languages and resource scenarios,
- improved evaluation and continuous learning from mistakes, guided by a systematic analysis of quality barriers, informed by human translators,
- all with a strong focus on scalability, to ensure that learning and decoding with these models are efficient and that reliance on data (annotated or not) is minimised.
To continuously measure progress and to provide a platform for sharing and collaboration (within QT21 and beyond), the project revolves around a series of Shared Tasks, co-organised with the annual workshops on machine translation (WMT) for maximum impact.