Article type: Abstract
Background: Health in my Language (HimL) is a three-year, EU-funded project. It aims to address the need for reliable and affordable translation of public health content via fully automatic machine translation (MT) systems, initially focusing on translation from English into Czech, German, Polish, and Romanian. The project draws on recent advances in MT, including domain adaptation, translation into morphologically rich languages, terminology management, and semantically enhanced MT. Improvements are incorporated into the MT systems in annual cycles, each accompanied by careful evaluation and user acceptance testing. Health information produced by Cochrane and NHS24 serves as the test case; it is translated in each cycle and published on their websites.
Objectives: To evaluate the quality and to test the usability of the obtained machine translations; and to measure the effect on post-editing and web access.
Methods: Different automatic evaluation metrics are applied to assess quality. The planned human evaluation tasks are:
- annotation of semantic components to assess accuracy;
- ranking of MTs generated using different MT systems against each other;
- online survey to assess user acceptance;
- post-editing of MTs to measure speed compared to post-editing of baseline MTs and fully manual translation;
- text gap-filling to assess comprehension.
Web usage statistics will be collected to assess the effect of the published MTs on website access.
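To illustrate what an automatic evaluation metric measures, the sketch below computes a clipped n-gram precision, the core ingredient of BLEU-style metrics. This is a minimal illustration only, not the metrics or toolkits actually used in HimL; the function names and the example sentences are hypothetical.

```python
# Minimal sketch of a BLEU-style clipped n-gram precision.
# Illustrative only -- not the HimL evaluation pipeline.
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def ngram_precision(hypothesis, reference, n=2):
    """Clipped n-gram precision of an MT hypothesis against one reference.

    Each hypothesis n-gram counts as correct at most as many times as it
    occurs in the reference (the "clipping" used by BLEU).
    """
    hyp_counts = Counter(ngrams(hypothesis.split(), n))
    ref_counts = Counter(ngrams(reference.split(), n))
    overlap = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
    total = sum(hyp_counts.values())
    return overlap / total if total else 0.0

# Hypothetical example: 3 of 5 hypothesis bigrams also occur in the reference.
print(ngram_precision("the cat sat on the mat",
                      "the cat is on the mat", n=2))  # -> 0.6
```

A full BLEU score combines such precisions over several n-gram orders with a brevity penalty; in practice a standard toolkit would be used rather than a hand-rolled implementation.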
Results and conclusions: The second version of the MT system was deployed in September 2016; human semantic annotation and ranking have been conducted, and user testing is in progress. Ranking and annotation results varied between MT systems and between text types (i.e. Cochrane and NHS24 texts). The evaluation provided further guidance for the final iteration of system development. The third and final system will be deployed in September 2017. The 2016 evaluation results will be presented at the Summit, together with an outlook on the 2017 evaluation plan.